
        XML to CSV in Python


                  Question

                  I'm having a lot of trouble converting an XML file to a CSV in Python. I've looked at many forums and tried both lxml and xmlutils.xml2csv, but I can't get it to work. The file is GPS data from a Garmin GPS device.

                  Here's what my XML file looks like, shortened of course:

                  <?xml version="1.0" encoding="utf-8"?>
                  <gpx xmlns:tc2="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:tp1="http://www.garmin.com/xmlschemas/TrackPointExtension/v1" xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="TC2 to GPX11 XSLT stylesheet" xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd http://www.garmin.com/xmlschemas/TrackPointExtension/v1 http://www.garmin.com/xmlschemas/TrackPointExtensionv1.xsd">
                    <trk>
                        <name>2013-12-03T21:08:56Z</name>
                        <trkseg>
                            <trkpt lat="45.4852855" lon="-122.6347885">
                                <ele>0.0000000</ele>
                                <time>2013-12-03T21:08:56Z</time>
                            </trkpt>
                            <trkpt lat="45.4852961" lon="-122.6347926">
                                <ele>0.0000000</ele>
                                <time>2013-12-03T21:09:00Z</time>
                            </trkpt>
                            <trkpt lat="45.4852982" lon="-122.6347897">
                                <ele>0.2000000</ele>
                                <time>2013-12-03T21:09:01Z</time>
                            </trkpt>
                        </trkseg>
                    </trk>
                  </gpx>
                  

                  There are several trk tags in my massive XML file, but I can manage to separate them out; they represent different "segments" or trips on the GPS device. All I want is a CSV file that looks something like this:

                  LAT         LON         TIME         ELE
                  45.4...     -122.6...   2013-12...   0.00...
                  ...         ...         ...          ...
                  

                  Here's the code I have so far:

                  ## Call libraries
                  import csv
                  from xmlutils.xml2csv import xml2csv
                  
                  inputs = "myfile.xml"
                  output = "myfile.csv"
                  
                  converter = xml2csv(inputs, output)
                  converter.convert(tag="WHATEVER_GOES_HERE_RENDERS_EMPTY_CSV")
                  

                  Here is an alternative attempt. It only outputs a CSV file with the headers lat and lon and no data.

                  import csv
                  import lxml.etree
                  
                  x = '''
                  <?xml version="1.0" encoding="utf-8"?>
                  <gpx xmlns:tc2="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:tp1="http://www.garmin.com/xmlschemas/TrackPointExtension/v1" xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="TC2 to GPX11 XSLT stylesheet" xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd http://www.garmin.com/xmlschemas/TrackPointExtension/v1 http://www.garmin.com/xmlschemas/TrackPointExtensionv1.xsd">
                  <trk>
                    <name>2013-12-03T21:08:56Z</name>
                    <trkseg>
                      <trkpt lat="45.4852855" lon="-122.6347885">
                        <ele>0.0000000</ele>
                        <time>2013-12-03T21:08:56Z</time>
                      </trkpt>
                      <trkpt lat="45.4852961" lon="-122.6347926">
                        <ele>0.0000000</ele>
                        <time>2013-12-03T21:09:00Z</time>
                      </trkpt>
                      <trkpt lat="45.4852982" lon="-122.6347897">
                        <ele>0.2000000</ele>
                        <time>2013-12-03T21:09:01Z</time>
                      </trkpt>
                    </trkseg>
                  </trk>
                  </gpx>
                  '''
                  
                  with open('output.csv', 'w') as f:
                      writer = csv.writer(f)
                      writer.writerow(('lat', 'lon'))
                      root = lxml.etree.fromstring(x)
                      for trkpt in root.iter('trkpt'):
                          row = trkpt.get('lat'), trkpt.get('lon')
                          writer.writerow(row)
                  

                  How do I do this? Please realize I'm a novice, so a more comprehensive explanation would be super awesome!

                  Recommended answer

                  This is a namespaced XML document, so you need to address its nodes using their respective namespaces.

                  The namespaces used in the document are defined at the top:

                  xmlns:tc2="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                  xmlns:tp1="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
                  xmlns="http://www.topografix.com/GPX/1/1"
                  

                  So the first namespace is mapped to the short form tc2, and would be used in an element like <tc2:foobar/>. The last one, which doesn't have a short form after the xmlns, is called the default namespace. It applies to all elements in the document that don't explicitly use a namespace, so it applies to your <trkpt /> elements as well.
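To see the default namespace at work, it can help to print the tags the parser actually reports. The following is a minimal sketch that uses the standard library's xml.etree.ElementTree instead of lxml (both expose the same iter() call and tag format); the SAMPLE string is a shortened, hypothetical stand-in for the file in the question.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample that declares the same default namespace
# as the GPX file in the question.
SAMPLE = (
    '<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">'
    '<trk><trkseg>'
    '<trkpt lat="45.4852855" lon="-122.6347885"><ele>0.0000000</ele></trkpt>'
    '</trkseg></trk>'
    '</gpx>'
)

root = ET.fromstring(SAMPLE)

# Every element inherits the default namespace, so each tag is reported
# as '{namespace-uri}localname' rather than as the bare name.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

None of the printed tags is the bare string trkpt, which is why a lookup by bare name comes back empty.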

                  Therefore you would need to write root.iter('{http://www.topografix.com/GPX/1/1}trkpt') to select these elements.
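A small comparison shows why the second snippet in the question produced only the header row. This sketch again uses the standard library's xml.etree.ElementTree as a stand-in for lxml, and the two-point SAMPLE string is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

GPX_NS = 'http://www.topografix.com/GPX/1/1'

# Hypothetical two-point sample in the GPX default namespace.
SAMPLE = (
    '<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">'
    '<trk><trkseg>'
    '<trkpt lat="45.4852855" lon="-122.6347885"/>'
    '<trkpt lat="45.4852961" lon="-122.6347926"/>'
    '</trkseg></trk>'
    '</gpx>'
)

root = ET.fromstring(SAMPLE)

# The bare name matches nothing, because no element has the tag 'trkpt'.
bare = list(root.iter('trkpt'))

# The namespace-qualified name matches both track points.
qualified = list(root.iter('{%s}trkpt' % GPX_NS))

print(len(bare), len(qualified))
```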

                  To also get time and elevation, you can use trkpt.find() to access these elements below the trkpt node, and then element.text to retrieve their text content (as opposed to attributes like lat and lon, which you read with get()). Because the time and ele elements are also in the default namespace, you'll have to use the {namespace}element syntax again to select those nodes.
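As an alternative to spelling out {namespace}element each time, find(), findtext() and iterfind() accept a prefix-to-URI mapping. The sketch below assumes the standard library's xml.etree.ElementTree (lxml accepts the same namespaces argument); the gpx prefix is an arbitrary label chosen here, and the second track point deliberately lacks ele and time to show the default value.

```python
import xml.etree.ElementTree as ET

# The 'gpx' prefix is an arbitrary label for this mapping; it does not
# have to appear in the document itself.
NAMESPACES = {'gpx': 'http://www.topografix.com/GPX/1/1'}

# Hypothetical sample; the second point has no child elements.
SAMPLE = (
    '<gpx xmlns="http://www.topografix.com/GPX/1/1">'
    '<trk><trkseg>'
    '<trkpt lat="45.4852855" lon="-122.6347885">'
    '<ele>0.0000000</ele><time>2013-12-03T21:08:56Z</time>'
    '</trkpt>'
    '<trkpt lat="45.4852961" lon="-122.6347926"/>'
    '</trkseg></trk></gpx>'
)

root = ET.fromstring(SAMPLE)
rows = []
for trkpt in root.iterfind('.//gpx:trkpt', NAMESPACES):
    # findtext() returns the default instead of failing when a child is missing.
    ele = trkpt.findtext('gpx:ele', default='', namespaces=NAMESPACES)
    time = trkpt.findtext('gpx:time', default='', namespaces=NAMESPACES)
    rows.append((trkpt.get('lat'), trkpt.get('lon'), ele, time))

for row in rows:
    print(row)
```

Using findtext() with a default avoids the AttributeError that find(...).text raises when a track point has no ele or time child.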

                  So you could use something like this:

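                  import csv
                  import lxml.etree

                  NS = 'http://www.topografix.com/GPX/1/1'
                  header = ('lat', 'lon', 'ele', 'time')

                  # x is the XML string defined in the question's second snippet.
                  # newline='' lets the csv module control line endings (Python 3).
                  with open('output.csv', 'w', newline='') as f:
                      writer = csv.writer(f)
                      writer.writerow(header)
                      # On Python 3, lxml rejects a str that carries an encoding
                      # declaration, and the declaration must come first in the
                      # input, so strip the leading newline and pass bytes.
                      root = lxml.etree.fromstring(x.strip().encode('utf-8'))
                      for trkpt in root.iter('{%s}trkpt' % NS):
                          # lat and lon are attributes; ele and time are child elements.
                          lat = trkpt.get('lat')
                          lon = trkpt.get('lon')
                          ele = trkpt.find('{%s}ele' % NS).text
                          time = trkpt.find('{%s}time' % NS).text

                          row = lat, lon, ele, time
                          writer.writerow(row)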
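Since the question mentions several trk elements in one file, the same idea can be extended to label each row with its track. This is only a sketch under some assumptions: it uses the standard library's xml.etree.ElementTree rather than lxml, the function name gpx_to_csv and the extra track column are hypothetical, and the sample track names trip-1 and trip-2 are invented for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

NS = {'gpx': 'http://www.topografix.com/GPX/1/1'}


def gpx_to_csv(xml_bytes, out_file):
    """Write one CSV row per track point and return the number of rows."""
    root = ET.fromstring(xml_bytes)
    writer = csv.writer(out_file)
    writer.writerow(('track', 'lat', 'lon', 'ele', 'time'))
    count = 0
    for trk in root.iterfind('gpx:trk', NS):
        name = trk.findtext('gpx:name', default='', namespaces=NS)
        for trkpt in trk.iterfind('gpx:trkseg/gpx:trkpt', NS):
            writer.writerow((
                name,
                trkpt.get('lat'),
                trkpt.get('lon'),
                trkpt.findtext('gpx:ele', default='', namespaces=NS),
                trkpt.findtext('gpx:time', default='', namespaces=NS),
            ))
            count += 1
    return count


# Hypothetical two-track sample; real files would be read with open(path, 'rb').
SAMPLE = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">'
    b'<trk><name>trip-1</name><trkseg>'
    b'<trkpt lat="45.4852855" lon="-122.6347885">'
    b'<ele>0.0000000</ele><time>2013-12-03T21:08:56Z</time></trkpt>'
    b'<trkpt lat="45.4852961" lon="-122.6347926">'
    b'<ele>0.0000000</ele><time>2013-12-03T21:09:00Z</time></trkpt>'
    b'</trkseg></trk>'
    b'<trk><name>trip-2</name><trkseg>'
    b'<trkpt lat="45.4852982" lon="-122.6347897">'
    b'<ele>0.2000000</ele><time>2013-12-03T21:09:01Z</time></trkpt>'
    b'</trkseg></trk>'
    b'</gpx>'
)

buffer = io.StringIO()
written = gpx_to_csv(SAMPLE, buffer)
lines = buffer.getvalue().splitlines()
print(written)
print(lines[0])
```

Passing a file opened with open('output.csv', 'w', newline='') instead of the StringIO buffer would write the rows to disk.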
                  

                  For more information on XML namespaces, see the Namespaces section in the lxml tutorial and the Wikipedia article on XML Namespaces. Also see GPS eXchange Format for some details on the .gpx format.
