        How to deal with multi-level column names downloaded with yfinance
                  Problem description

                  I have a list of tickers (tickerStrings) that I want to download all at once. When I try to use pandas' read_csv, it doesn't read the csv file into the same layout I get when I download the data from yfinance.

                  I usually access my data by ticker like this: data['AAPL'] or data['AAPL'].Close, but when I read the data from the csv file it does not let me do that.

                  if path.exists(data_file):
                      data = pd.read_csv(data_file, low_memory=False)
                      data = pd.DataFrame(data)
                      print(data.head())
                  else:
                      data = yf.download(tickerStrings, group_by="Ticker", period=prd, interval=intv)
                      data.to_csv(data_file)
                  

                  This is the printed output:

                                    Unnamed: 0                 OLN               OLN.1               OLN.2               OLN.3  ...                 W.1                 W.2                 W.3                 W.4     W.5
                  0                        NaN                Open                High                 Low               Close  ...                High                 Low               Close           Adj Close  Volume
                  1                   Datetime                 NaN                 NaN                 NaN                 NaN  ...                 NaN                 NaN                 NaN                 NaN     NaN
                  2  2020-06-25 09:30:00-04:00    11.1899995803833  11.220000267028809  11.010000228881836  11.079999923706055  ...   201.2899932861328   197.3000030517578  197.36000061035156  197.36000061035156  112156
                  3  2020-06-25 09:45:00-04:00  11.130000114440918  11.260000228881836  11.100000381469727   11.15999984741211  ...  200.48570251464844  196.47999572753906  199.74000549316406  199.74000549316406   83943
                  4  2020-06-25 10:00:00-04:00  11.170000076293945  11.220000267028809  11.119999885559082  11.170000076293945  ...  200.49000549316406  198.19000244140625   200.4149932861328   200.4149932861328   88771
                  

                  The error I get when trying to access the data:

                  Traceback (most recent call last):
                  File "getdata.py", line 49, in processData
                      avg = data[x].Close.mean()
                  AttributeError: 'Series' object has no attribute 'Close'
                  

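                  Recommended answer

                  Download all tickers into a single dataframe with single-level column headers

                  Option 1

                  • When data is downloaded for a single ticker, the returned dataframe has single-level column names, but no ticker column.
                  • The following downloads the data for each ticker, adds a ticker column, and creates a single dataframe from all the desired tickers.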
                  • import yfinance as yf
                    import pandas as pd
                    
                    tickerStrings = ['AAPL', 'MSFT']
                    df_list = list()
                    for ticker in tickerStrings:
                        data = yf.download(ticker, group_by="Ticker", period='2d')
                        data['ticker'] = ticker  # add this column because the dataframe doesn't contain a column with the ticker
                        df_list.append(data)
                    
                    # combine all dataframes into a single dataframe
                    df = pd.concat(df_list)
                    
                    # save to csv
                    df.to_csv('ticker.csv')
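
As an offline check of what Option 1 produces, the sketch below builds two small single-level frames by hand, tags each with a ticker column, and concatenates them. The helper `fake_download` and its prices are hypothetical placeholders standing in for `yf.download` on one ticker; they are not real quotes. The last lines show one possible way to get the per-ticker access the question asks for, assuming a `ticker` column as in the code above.

```python
import pandas as pd


def fake_download(closes):
    """Hypothetical stand-in for yf.download(ticker): made-up prices, not real quotes."""
    dates = pd.date_range("2020-06-25", periods=len(closes), freq="D", name="Date")
    return pd.DataFrame({"Open": [c - 1.0 for c in closes], "Close": closes}, index=dates)


frames = []
for ticker, closes in {"AAPL": [10.0, 12.0], "MSFT": [20.0, 26.0]}.items():
    data = fake_download(closes)
    data["ticker"] = ticker  # tag every row with its ticker
    frames.append(data)

# one long dataframe with single-level columns
df = pd.concat(frames)

# per-ticker access: filter on the ticker column, or aggregate with groupby
aapl_mean = df.loc[df["ticker"] == "AAPL", "Close"].mean()
mean_by_ticker = df.groupby("ticker")["Close"].mean()
print(mean_by_ticker)
```

With this layout, `data['AAPL'].Close` from the question becomes a filter on the ticker column rather than a lookup on the top column level.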
                    

                    Option 2

                    • Download all the tickers at once, then stack the ticker level of the columns into the rows
                      • group_by='Ticker' puts the ticker at level=0 of the column names
                          • tickerStrings = ['AAPL', 'MSFT']
                            df = yf.download(tickerStrings, group_by='Ticker', period='2d')
                            df = df.stack(level=0).rename_axis(['Date', 'Ticker']).reset_index(level=1)
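
To see what the stack call does without hitting the network, here is a minimal sketch on a hand-built frame. The column layout imitates what `group_by='Ticker'` is described as producing (ticker at level 0, price field at level 1); the prices are made-up placeholders, and the names `wide` and `long_df` are only for illustration.

```python
import pandas as pd

# Made-up placeholder prices; ticker at column level 0, price field at level 1.
dates = pd.date_range("2020-06-25", periods=2, freq="D", name="Date")
columns = pd.MultiIndex.from_product([["AAPL", "MSFT"], ["Open", "Close"]])
wide = pd.DataFrame(
    [[9.0, 10.0, 19.0, 20.0],
     [11.0, 12.0, 25.0, 26.0]],
    index=dates,
    columns=columns,
)

# Move the ticker level from the columns into the row index,
# name the two index levels, then turn 'Ticker' back into a regular column.
long_df = wide.stack(level=0).rename_axis(["Date", "Ticker"]).reset_index(level=1)

aapl_close = long_df.loc[long_df["Ticker"] == "AAPL", "Close"]
print(long_df)
```

Depending on the pandas version, `stack` may emit a FutureWarning about its newer implementation and may order the rows and columns differently, so it is safer to select columns by name than by position.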
                            


                            Read a yfinance csv already stored with multi-level column names

                            • If you wish to keep, and read in, a file with a multi-level column index, use the following code, which will return the dataframe to its original form.
                              • df = pd.read_csv('test.csv', header=[0, 1])
                                df.drop([0], axis=0, inplace=True)  # drop this row because it only has one column with Date in it
                                df[('Unnamed: 0_level_0', 'Unnamed: 0_level_1')] = pd.to_datetime(df[('Unnamed: 0_level_0', 'Unnamed: 0_level_1')], format='%Y-%m-%d')  # convert the first column to a datetime
                                df.set_index(('Unnamed: 0_level_0', 'Unnamed: 0_level_1'), inplace=True)  # set the first column as the index
                                df.index.name = None  # clear the index name
                                

                                • The issue is that tickerStrings is a list of tickers, which results in a final dataframe with multi-level column names
                                  •                 AAPL                                                    MSFT                                
                                                    Open      High       Low     Close Adj Close     Volume Open High Low Close Adj Close Volume
                                    Date                                                                                                        
                                    1980-12-12  0.513393  0.515625  0.513393  0.513393  0.405683  117258400  NaN  NaN NaN   NaN       NaN    NaN
                                    1980-12-15  0.488839  0.488839  0.486607  0.486607  0.384517   43971200  NaN  NaN NaN   NaN       NaN    NaN
                                    1980-12-16  0.453125  0.453125  0.450893  0.450893  0.356296   26432000  NaN  NaN NaN   NaN       NaN    NaN
                                    1980-12-17  0.462054  0.464286  0.462054  0.462054  0.365115   21610400  NaN  NaN NaN   NaN       NaN    NaN
                                    1980-12-18  0.475446  0.477679  0.475446  0.475446  0.375698   18362400  NaN  NaN NaN   NaN       NaN    NaN
                                    

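                                    • When that dataframe is saved to a csv, the file looks like the following example, and reading it back with the default read_csv settings produces a dataframe like the one you're having trouble with.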
                                    • ,AAPL,AAPL,AAPL,AAPL,AAPL,AAPL,MSFT,MSFT,MSFT,MSFT,MSFT,MSFT
                                      ,Open,High,Low,Close,Adj Close,Volume,Open,High,Low,Close,Adj Close,Volume
                                      Date,,,,,,,,,,,,
                                      1980-12-12,0.5133928656578064,0.515625,0.5133928656578064,0.5133928656578064,0.40568336844444275,117258400,,,,,,
                                      1980-12-15,0.4888392984867096,0.4888392984867096,0.4866071343421936,0.4866071343421936,0.3845173120498657,43971200,,,,,,
                                      1980-12-16,0.453125,0.453125,0.4508928656578064,0.4508928656578064,0.3562958240509033,26432000,,,,,,
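
The round trip can be reproduced in memory with a hand-built frame. The sketch below uses made-up placeholder prices and `io.StringIO` instead of a file on disk; it first shows what a default `read_csv` does to the two header rows, then reads the same text with `header=[0, 1]` and `index_col=0`, which is a shorter alternative to dropping and converting the first column by hand.

```python
import io

import pandas as pd

# Made-up placeholder prices in a two-level column layout (ticker, price field).
dates = pd.date_range("2020-06-25", periods=2, freq="D", name="Date")
columns = pd.MultiIndex.from_product([["AAPL", "MSFT"], ["Open", "Close"]])
wide = pd.DataFrame(
    [[9.0, 10.0, 19.0, 20.0],
     [11.0, 12.0, 25.0, 26.0]],
    index=dates,
    columns=columns,
)

csv_text = wide.to_csv()  # two header rows followed by a 'Date' row

# Default read: only the first header row is used and duplicate names get a suffix,
# so naive['AAPL'] is a single Series of strings, not a group of price columns.
naive = pd.read_csv(io.StringIO(csv_text))
print(list(naive.columns))

# Declaring both header rows and the index column restores the two-level layout.
restored = pd.read_csv(io.StringIO(csv_text), header=[0, 1], index_col=0, parse_dates=True)
print(restored["AAPL"]["Close"].mean())
```

After the second read, `restored['AAPL'].Close` works again because `'AAPL'` selects a sub-frame rather than a single column.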
                                      


                                      Flatten multi-level columns into a single level and add a ticker column

                                      • If the ticker symbol is level=0 (top) of the column names
                                        • This is the case when group_by='Ticker' is used
                                            • df.stack(level=0).rename_axis(['Date', 'Ticker']).reset_index(level=1)
                                              

                                              • If the ticker symbol is level=1 (bottom) of the column names
                                              • df.stack(level=1).rename_axis(['Date', 'Ticker']).reset_index(level=1)
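
For the second case, where the ticker is the bottom column level, a minimal sketch with made-up placeholder prices follows. The check on `get_level_values` is just one illustrative way to find which level holds the tickers; it assumes you know at least one ticker symbol that is present in the frame.

```python
import pandas as pd

# Made-up placeholder prices; price field at column level 0, ticker at level 1.
dates = pd.date_range("2020-06-25", periods=2, freq="D", name="Date")
columns = pd.MultiIndex.from_product([["Open", "Close"], ["AAPL", "MSFT"]])
wide = pd.DataFrame(
    [[9.0, 19.0, 10.0, 20.0],
     [11.0, 25.0, 12.0, 26.0]],
    index=dates,
    columns=columns,
)

# Find the column level that contains a known ticker symbol.
ticker_level = 1 if "AAPL" in wide.columns.get_level_values(1) else 0

# Stack that level into the rows and expose it as a 'Ticker' column.
long_df = wide.stack(level=ticker_level).rename_axis(["Date", "Ticker"]).reset_index(level=1)

msft_close = long_df.loc[long_df["Ticker"] == "MSFT", "Close"]
print(long_df)
```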
                                                


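                                                Download each ticker and save it to a separate file

                                                • I would also recommend downloading and saving each ticker individually, which would look something like the following: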
                                                • import yfinance as yf
                                                  import pandas as pd
                                                  
                                                  tickerStrings = ['AAPL', 'MSFT']
                                                  for ticker in tickerStrings:
                                                      data = yf.download(ticker, group_by="Ticker", period=prd, interval=intv)
                                                      data['ticker'] = ticker  # add this column because the dataframe doesn't contain a column with the ticker
                                                      data.to_csv(f'ticker_{ticker}.csv')  # ticker_AAPL.csv for example
                                                  

                                                  • data will look like
                                                  •                 Open      High       Low     Close  Adj Close      Volume ticker
                                                    Date                                                                            
                                                    1986-03-13  0.088542  0.101562  0.088542  0.097222   0.062205  1031788800   MSFT
                                                    1986-03-14  0.097222  0.102431  0.097222  0.100694   0.064427   308160000   MSFT
                                                    1986-03-17  0.100694  0.103299  0.100694  0.102431   0.065537   133171200   MSFT
                                                    1986-03-18  0.102431  0.103299  0.098958  0.099826   0.063871    67766400   MSFT
                                                    1986-03-19  0.099826  0.100694  0.097222  0.098090   0.062760    47894400   MSFT
                                                    

                                                    • The resulting csv will look like
                                                    • Date,Open,High,Low,Close,Adj Close,Volume,ticker
                                                      1986-03-13,0.0885416641831398,0.1015625,0.0885416641831398,0.0972222238779068,0.0622050017118454,1031788800,MSFT
                                                      1986-03-14,0.0972222238779068,0.1024305522441864,0.0972222238779068,0.1006944477558136,0.06442664563655853,308160000,MSFT
                                                      1986-03-17,0.1006944477558136,0.1032986119389534,0.1006944477558136,0.1024305522441864,0.0655374601483345,133171200,MSFT
                                                      1986-03-18,0.1024305522441864,0.1032986119389534,0.0989583358168602,0.0998263880610466,0.06387123465538025,67766400,MSFT
                                                      1986-03-19,0.0998263880610466,0.1006944477558136,0.0972222238779068,0.0980902761220932,0.06276042759418488,47894400,MSFT
                                                      

                                                      Read in the multiple files saved in the previous section and create a single dataframe

                                                      import pandas as pd
                                                      from pathlib import Path
                                                      
                                                      # set the path to the files
                                                      p = Path('c:/path_to_files')
                                                      
                                                      # find the files; this is a generator, not a list
                                                      files = p.glob('ticker_*.csv')
                                                      
                                                      # read the files into a dataframe
                                                      df = pd.concat([pd.read_csv(file) for file in files])
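
The same pattern can be tried end to end in a temporary directory. The sketch below writes two small per-ticker files with made-up placeholder prices, reads them back with the Date column parsed as the index (an optional addition to the plain `read_csv` call above), and computes the per-ticker average close that the question was after. The file naming follows the `ticker_{ticker}.csv` pattern from the previous section; everything else is illustrative.

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)

    # Write two small per-ticker files with made-up placeholder prices.
    for ticker, closes in {"AAPL": [10.0, 12.0], "MSFT": [20.0, 26.0]}.items():
        dates = pd.date_range("2020-06-25", periods=len(closes), freq="D", name="Date")
        frame = pd.DataFrame({"Close": closes, "ticker": ticker}, index=dates)
        frame.to_csv(folder / f"ticker_{ticker}.csv")

    # sorted() makes the file order deterministic; glob order is not guaranteed
    files = sorted(folder.glob("ticker_*.csv"))

    # read every file with Date as a parsed datetime index, then combine
    df = pd.concat([pd.read_csv(f, index_col="Date", parse_dates=True) for f in files])

# average close per ticker on the combined single-level frame
mean_close = df.groupby("ticker")["Close"].mean()
print(mean_close)
```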
                                                      
