                  Finding elements by attribute with lxml

                  Problem description

                  I need to parse an XML file to extract some data. I only need the elements that have certain attributes. Here is an example document:

                  <root>
                      <articles>
                          <article type="news">
                               <content>some text</content>
                          </article>
                          <article type="info">
                               <content>some text</content>
                          </article>
                          <article type="news">
                               <content>some text</content>
                          </article>
                      </articles>
                  </root>
                  

                  Here I would like to get only the articles whose type is "news". What is the most efficient and elegant way to do this with lxml?

                  I tried the find method, but the result is not very nice:

                  from lxml import etree
                  f = etree.parse("myfile")
                  root = f.getroot()
                  articles = root.getchildren()[0]
                  article_list = articles.findall('article')
                  for article in article_list:
                      if "type" in article.keys():
                          if article.attrib['type'] == 'news':
                              content = article.find('content')
                              content = content.text
                  

                  Recommended answer

                  You can use XPath, e.g. root.xpath("//article[@type='news']")

                  This XPath expression returns a list of all <article/> elements whose "type" attribute has the value "news". You can then iterate over the list to do what you want, or pass it wherever you need it.
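
As a minimal sketch of iterating over that list: the snippet below assumes a small inline document shaped like the one in the question. The text values ("first news", "some info", "second news") are made up here so the matches are easy to tell apart.

```python
from lxml import etree

# Illustrative document; the text values are assumptions, not from the question.
root = etree.fromstring(
    "<root><articles>"
    "<article type='news'><content>first news</content></article>"
    "<article type='info'><content>some info</content></article>"
    "<article type='news'><content>second news</content></article>"
    "</articles></root>"
)

# The predicate [@type='news'] keeps only <article> elements with that attribute value.
news_articles = root.xpath("//article[@type='news']")

# Each result is an ordinary element, so the usual element methods apply.
news_texts = [article.findtext("content") for article in news_articles]
for article in news_articles:
    print(article.get("type"), article.findtext("content"))
```

An <article> without a type attribute simply does not match the predicate, so the separate "type" in article.keys() check from the question is not needed.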

                  To get just the text content, you can extend the XPath like so:

                  from lxml import etree

                  root = etree.fromstring("""
                  <root>
                      <articles>
                          <article type="news">
                               <content>some text</content>
                          </article>
                          <article type="info">
                               <content>some text</content>
                          </article>
                          <article type="news">
                               <content>some text</content>
                          </article>
                      </articles>
                  </root>
                  """)
                  
                  print(root.xpath("//article[@type='news']/content/text()"))
                  

                  This outputs ['some text', 'some text']. If you want the content elements rather than their text, use "//article[@type='news']/content" instead, and so on.
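
The two variants can be compared side by side in a short sketch. It also shows one optional extension that the answer does not mention: lxml lets you pass XPath variables as keyword arguments, which avoids building the expression by string formatting. The variable name kind and the sample text values are assumptions chosen for illustration.

```python
from lxml import etree

# Illustrative document; the text values are assumptions, not from the question.
root = etree.fromstring(
    "<root><articles>"
    "<article type='news'><content>first news</content></article>"
    "<article type='info'><content>some info</content></article>"
    "<article type='news'><content>second news</content></article>"
    "</articles></root>"
)

# Variant 1: select the <content> elements themselves.
content_elements = root.xpath("//article[@type='news']/content")
element_texts = [element.text for element in content_elements]

# Variant 2: select the text nodes directly.
text_nodes = root.xpath("//article[@type='news']/content/text()")

# Optional: supply the attribute value as an XPath variable ($kind).
info_texts = root.xpath("//article[@type=$kind]/content/text()", kind="info")

print(element_texts)
print(text_nodes)
print(info_texts)
```

Selecting the elements is the more flexible choice when you also need attributes or child nodes. Selecting text() is shorter when the strings are all you want.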
