Python ElementTree module: How to ignore the namespace of XML files to locate matching elements when using the methods "find" and "findall"

Problem description


I want to use the "findall" method of the ElementTree module to locate some elements in a source XML file.

However, the source XML file (test.xml) declares a namespace. Here is a truncated part of the file as a sample:

<?xml version="1.0" encoding="iso-8859-1"?>
<XML_HEADER xmlns="http://www.test.com">
    <TYPE>Updates</TYPE>
    <DATE>9/26/2012 10:30:34 AM</DATE>
    <COPYRIGHT_NOTICE>All Rights Reserved.</COPYRIGHT_NOTICE>
    <LICENSE>newlicense.htm</LICENSE>
    <DEAL_LEVEL>
        <PAID_OFF>N</PAID_OFF>
    </DEAL_LEVEL>
</XML_HEADER>

The sample Python code is below:

from xml.etree import ElementTree as ET
tree = ET.parse(r"test.xml")
el1 = tree.findall("DEAL_LEVEL/PAID_OFF")  # returns an empty list: no match
el2 = tree.findall("{http://www.test.com}DEAL_LEVEL/{http://www.test.com}PAID_OFF")  # returns a list with one element, tagged '{http://www.test.com}PAID_OFF'
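
The behaviour described here can be reproduced without a file on disk. The following is a minimal sketch that inlines a shortened copy of the sample as a string (the names SAMPLE, unqualified and qualified are chosen only for illustration). It shows that the parser stores each tag together with its namespace URI, which is why the unqualified path matches nothing.

```python
from xml.etree import ElementTree as ET

# Shortened copy of the sample document, inlined so the snippet is self-contained.
SAMPLE = """<XML_HEADER xmlns="http://www.test.com">
    <TYPE>Updates</TYPE>
    <DEAL_LEVEL>
        <PAID_OFF>N</PAID_OFF>
    </DEAL_LEVEL>
</XML_HEADER>"""

root = ET.fromstring(SAMPLE)

# The parser stores every tag in the form "{namespace-uri}localname".
print(root.tag)

# An unqualified path therefore matches nothing: findall returns an empty list.
unqualified = root.findall("DEAL_LEVEL/PAID_OFF")
print(unqualified)

# Qualifying every step of the path with the namespace URI finds the element.
ns = "{http://www.test.com}"
qualified = root.findall(f"{ns}DEAL_LEVEL/{ns}PAID_OFF")
print([el.text for el in qualified])
```

Note that findall never returns None; when nothing matches, the result is an empty list.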

Although this works, it is very inconvenient to prepend the namespace "{http://www.test.com}" to every tag.

How can I ignore the namespace when using methods such as "find" and "findall"?

Recommended answer

Instead of modifying the XML document itself, it is best to parse it and then modify the tags in the parsed result. This way you can handle multiple namespaces and namespace aliases:

from io import StringIO  # on Python 2, import StringIO from the StringIO module instead
import xml.etree.ElementTree as ET

# xml is assumed to be a str holding the XML document
# use this instead of ET.fromstring(xml)
it = ET.iterparse(StringIO(xml))
for _, el in it:
    prefix, has_namespace, postfix = el.tag.partition('}')
    if has_namespace:
        el.tag = postfix  # strip all namespaces
root = it.root
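
To see the effect end to end, the loop can be wrapped in a small helper and applied to the sample from the question. This is only a sketch: the function name parse_without_namespaces and the inlined SAMPLE string are assumptions made for illustration and are not part of the original answer. It rewrites element tags only, so namespaced attribute names would be left untouched.

```python
from io import StringIO
import xml.etree.ElementTree as ET

# Shortened copy of the sample document from the question.
SAMPLE = """<XML_HEADER xmlns="http://www.test.com">
    <DEAL_LEVEL>
        <PAID_OFF>N</PAID_OFF>
    </DEAL_LEVEL>
</XML_HEADER>"""


def parse_without_namespaces(xml_text):
    """Parse xml_text and strip the "{uri}" prefix from every element tag."""
    it = ET.iterparse(StringIO(xml_text))  # yields ("end", element) pairs by default
    for _, el in it:
        _, has_namespace, local_name = el.tag.partition("}")
        if has_namespace:
            el.tag = local_name
    return it.root  # set once the input has been read completely


root = parse_without_namespaces(SAMPLE)

# The unqualified path from the question now matches.
matches = root.findall("DEAL_LEVEL/PAID_OFF")
print(root.tag, [el.text for el in matches])
```

Because the tags are rewritten in the parsed tree, the change is lossy: writing this tree back out would produce a document without the original namespace.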

This is based on the discussion here: http://bugs.python.org/issue18304

Update: using rpartition instead of partition ensures that the tag name ends up in postfix even when there is no namespace. The loop can therefore be condensed to:

for _, el in it:
    _, _, el.tag = el.tag.rpartition('}') # strip ns
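
The difference between the two string methods is easiest to see on plain tag strings. The short sketch below uses two example tags (the variable names are illustrative) and relies only on the standard behaviour of str.partition and str.rpartition.

```python
namespaced = "{http://www.test.com}PAID_OFF"
plain = "PAID_OFF"

# partition splits at the first "}". When there is no "}", the whole
# string lands in the FIRST slot and the last slot is empty.
print(namespaced.partition("}"))
print(plain.partition("}"))

# rpartition splits at the last "}". When there is no "}", the whole
# string lands in the LAST slot, so the last item is always the local name.
print(namespaced.rpartition("}"))
print(plain.rpartition("}"))

# Taking the last item therefore works with or without a namespace.
local_names = [tag.rpartition("}")[2] for tag in (namespaced, plain)]
print(local_names)
```

This is why the condensed loop needs no "if has_namespace" check: assigning the last item of rpartition leaves tags without a namespace unchanged.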
