
        Group by + New Column + Grab value from former row based on conditionals


                  Problem description

                  I have this dataset:

                  import pandas as pd

                  df = pd.DataFrame({'user':[1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,4],
                                    'date':['1995-09-01','1995-09-02','1995-10-03','1995-10-04','1995-10-05','1995-11-07','1995-11-08','1995-11-09','1995-11-10','1995-11-15','1995-12-18','1995-12-19','1995-12-20','1995-12-23','1995-12-26','1995-12-27'],
                                    'dc':['1995-09-02','1995-09-02','1995-10-02','1995-10-05','1995-10-05','1995-11-05','1995-11-05','1995-11-10','1995-11-10','1995-11-10','1995-12-10','1995-12-23','1995-12-23','1995-12-23','1995-12-23','1995-12-23'],
                                    'tp':['s','c','f','s','c','c','f','s','c','s','f','s','s','c','s','f'],
                                    'vt':['0','1','0','0','1','0','0','0','1','0','0','0','0','1','0','0'],
                                    'c1':['1','5','0','2','3','9','3','2','0','5','5','6','4','0','6','0'],
                                    'c2':['3','4','0','2','5','3','8','4','0','6','2','7','0','0','8','0'],
                                    'c3':['5','5','2','5','6','4','2','4','4','6','3','4','3','8','2','7']})
                  df
                  

                  This gives:

                  user    date        dc     tp   vt  c1   c2  c3
                   1  1995-09-01  1995-09-02  s   0    1   3   5
                   1  1995-09-02  1995-09-02  c   1    5   4   5
                   1  1995-10-03  1995-10-02  f   0    0   0   2
                   2  1995-10-04  1995-10-05  s   0    2   2   5
                   2  1995-10-05  1995-10-05  c   1    3   5   6
                   2  1995-11-07  1995-11-05  c   0    9   3   4
                   2  1995-11-08  1995-11-05  f   0    3   8   2
                   3  1995-11-09  1995-11-10  s   0    2   4   4
                   3  1995-11-10  1995-11-10  c   1    0   0   4
                   3  1995-11-15  1995-11-10  s   0    5   6   6
                   3  1995-12-18  1995-12-10  f   0    5   2   3
                   4  1995-12-19  1995-12-23  s   0    6   7   4
                   4  1995-12-20  1995-12-23  s   0    4   0   3
                   4  1995-12-23  1995-12-23  c   1    0   0   8
                   4  1995-12-26  1995-12-23  s   0    6   8   2
                   4  1995-12-27  1995-12-23  f   0    0   0   7
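
One detail in the frame above is easy to miss: vt, c1, c2 and c3 are built from quoted values, so they hold strings rather than integers, and date and dc are strings as well. The following is a minimal sketch, assuming a trimmed-down copy of the data (the names `sample`, `converted` and `num_cols` are only for illustration), of how the comparison behaves and how the columns could be converted if numeric or date operations are needed:

```python
import pandas as pd

# Hypothetical trimmed-down copy of the question's frame (user 3 only).
sample = pd.DataFrame({
    'user': [3, 3, 3, 3],
    'date': ['1995-11-09', '1995-11-10', '1995-11-15', '1995-12-18'],
    'dc': ['1995-11-10', '1995-11-10', '1995-11-10', '1995-12-10'],
    'tp': ['s', 'c', 's', 'f'],
    'vt': ['0', '1', '0', '0'],
    'c1': ['2', '0', '5', '5'],
    'c2': ['4', '0', '6', '2'],
})

# Comparing a string column with an integer matches nothing;
# comparing with the string '1' finds the intended row.
print(sample['vt'].eq(1).any())
print(sample['vt'].eq('1').any())

# Optional conversion so numeric comparisons and date arithmetic work.
converted = sample.copy()
num_cols = ['vt', 'c1', 'c2']
converted[num_cols] = converted[num_cols].astype(int)
converted[['date', 'dc']] = converted[['date', 'dc']].apply(pd.to_datetime)
print(converted.dtypes)
```

The conversion is optional: comparing against '1' and '0' as strings works on the original frame as it stands.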
                  
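                  I want to create a new column df['dc2'] such that, grouped by user, df['dc2'] = df['dc']. However, if an entry of that user satisfies the condition tp == 'c' & vt == 1 & c1 == 0 & c2 == 0, then dc2 should instead take the date of the previous entry (from the user's original data).

                  # i.e. for user 3, if we look at the entry with tp == 'c' & vt == 1, we can see that it has c1 == 0 and c2 == 0, so the value of df['dc2'] (for user 3) will be '1995-11-09' rather than '1995-11-10'.

                  # i.e. for user 4, if we look at the entry with tp == 'c' & vt == 1, we can see that it has c1 == 0 and c2 == 0, so in this case df['dc2'] (for user 4) should be '1995-12-20' rather than '1995-12-23'.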
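
The rule described above can be checked row by row. The following is a sketch only, assuming the string-typed columns from the question and using just the rows for users 3 and 4. It flags the matching rows and looks up the previous row's date within each user through `groupby().shift()`. The names `cond` and `prev_date` are illustrative.

```python
import pandas as pd

# Rows for users 3 and 4 from the question (values kept as strings).
df = pd.DataFrame({
    'user': [3, 3, 3, 3, 4, 4, 4, 4, 4],
    'date': ['1995-11-09', '1995-11-10', '1995-11-15', '1995-12-18',
             '1995-12-19', '1995-12-20', '1995-12-23', '1995-12-26',
             '1995-12-27'],
    'tp': ['s', 'c', 's', 'f', 's', 's', 'c', 's', 'f'],
    'vt': ['0', '1', '0', '0', '0', '0', '1', '0', '0'],
    'c1': ['2', '0', '5', '5', '6', '4', '0', '6', '0'],
    'c2': ['4', '0', '6', '2', '7', '0', '0', '8', '0'],
})

# True where tp == 'c', vt == '1', c1 == '0' and c2 == '0'.
cond = (df['tp'].eq('c') & df['vt'].eq('1')
        & df['c1'].eq('0') & df['c2'].eq('0'))

# Date of the previous row within the same user (NaN on a user's first row).
prev_date = df.groupby('user')['date'].shift(1)

# Show each matching row next to the date that dc2 should take.
print(df.loc[cond, ['user', 'date']].assign(prev_date=prev_date[cond]))
```

For these rows the matches are the entries dated 1995-11-10 (user 3) and 1995-12-23 (user 4), and the previous dates are 1995-11-09 and 1995-12-20, which agrees with the two examples given.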

                  Here is the desired result:

                  user    date       dc           dc2     tp   vt c1  c2  c3
                  1   1995-09-01  1995-09-02  1995-09-02   s   0   1   3   5
                  1   1995-09-02  1995-09-02  1995-09-02   c   1   5   4   5
                  1   1995-10-03  1995-10-02  1995-10-02   f   0   0   0   2
                  2   1995-10-04  1995-10-05  1995-10-05   s   0   2   2   5
                  2   1995-10-05  1995-10-05  1995-10-05   c   1   3   5   6
                  2   1995-11-07  1995-11-05  1995-11-05   c   0   9   3   4
                  2   1995-11-08  1995-11-05  1995-11-05   f   0   3   8   2
                  3   1995-11-09  1995-11-10  1995-11-09   s   0   2   4   4
                  3   1995-11-10  1995-11-10  1995-11-09   c   1   0   0   4
                  3   1995-11-15  1995-11-10  1995-11-09   s   0   5   6   6
                  3   1995-12-18  1995-12-10  1995-12-09   f   0   5   2   3
                  4   1995-12-19  1995-12-23  1995-12-20   s   0   6   7   4
                  4   1995-12-20  1995-12-23  1995-12-20   s   0   4   0   3
                  4   1995-12-23  1995-12-23  1995-12-20   c   1   0   0   8
                  4   1995-12-26  1995-12-23  1995-12-20   s   0   6   8   2
                  4   1995-12-27  1995-12-23  1995-12-20   f   0   0   0   7
                  

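                  Recommended answer

                  We create a boolean mask representing the condition tp == 'c' & vt == '1' & c1 == '0' & c2 == '0' (the columns hold strings, hence the quoted values), then group by the column user and apply a custom transformation function f that selects the previous row's value based on that condition: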

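                  # Rows where tp == 'c', vt == '1', c1 == '0' and c2 == '0' (all stored as strings).
                  m = (df['tp'].eq('c') & df['vt'].eq('1')
                       & df['c1'].eq('0') & df['c2'].eq('0'))

                  # Keep only the date of the row just before a match, then spread it over the group.
                  f = lambda s: s.mask(~m.shift(-1, fill_value=False)).ffill().bfill()
                  # transform keeps the original index; users without a match fall back to dc.
                  df['dc2'] = df.groupby('user')['date'].transform(f).fillna(df['dc'])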
                  

                      user        date          dc tp vt c1 c2 c3         dc2
                  0      1  1995-09-01  1995-09-02  s  0  1  3  5  1995-09-02
                  1      1  1995-09-02  1995-09-02  c  1  5  4  5  1995-09-02
                  2      1  1995-10-03  1995-10-02  f  0  0  0  2  1995-10-02
                  3      2  1995-10-04  1995-10-05  s  0  2  2  5  1995-10-05
                  4      2  1995-10-05  1995-10-05  c  1  3  5  6  1995-10-05
                  5      2  1995-11-07  1995-11-05  c  0  9  3  4  1995-11-05
                  6      2  1995-11-08  1995-11-05  f  0  3  8  2  1995-11-05
                  7      3  1995-11-09  1995-11-10  s  0  2  4  4  1995-11-09
                  8      3  1995-11-10  1995-11-10  c  1  0  0  4  1995-11-09
                  9      3  1995-11-15  1995-11-10  s  0  5  6  6  1995-11-09
                  10     3  1995-12-18  1995-12-10  f  0  5  2  3  1995-11-09
                  11     4  1995-12-19  1995-12-23  s  0  6  7  4  1995-12-20
                  12     4  1995-12-20  1995-12-23  s  0  4  0  3  1995-12-20
                  13     4  1995-12-23  1995-12-23  c  1  0  0  8  1995-12-20
                  14     4  1995-12-26  1995-12-23  s  0  6  8  2  1995-12-20
                  15     4  1995-12-27  1995-12-23  f  0  0  0  7  1995-12-20
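
The one-liner above packs several steps together. The following is a sketch, not the answer's own code, that writes the same idea out step by step on the rows for users 1 and 3. It makes one assumption that goes beyond the answer: the look-ahead shift is done per user, so that a match on a user's first row cannot flag the last row of the preceding user (the global `m.shift(-1)` would do so in that edge case). The names `cond`, `is_before_match` and `picked` are illustrative.

```python
import pandas as pd

# Rows for users 1 and 3 from the question (values kept as strings).
df = pd.DataFrame({
    'user': [1, 1, 1, 3, 3, 3, 3],
    'date': ['1995-09-01', '1995-09-02', '1995-10-03',
             '1995-11-09', '1995-11-10', '1995-11-15', '1995-12-18'],
    'dc': ['1995-09-02', '1995-09-02', '1995-10-02',
           '1995-11-10', '1995-11-10', '1995-11-10', '1995-12-10'],
    'tp': ['s', 'c', 'f', 's', 'c', 's', 'f'],
    'vt': ['0', '1', '0', '0', '1', '0', '0'],
    'c1': ['1', '5', '0', '2', '0', '5', '5'],
    'c2': ['3', '4', '0', '4', '0', '6', '2'],
})

# Step 1: rows that satisfy the condition.
cond = (df['tp'].eq('c') & df['vt'].eq('1')
        & df['c1'].eq('0') & df['c2'].eq('0'))

# Step 2: flag the row just before each match, shifting within each user.
is_before_match = (cond.groupby(df['user'])
                       .shift(-1, fill_value=False)
                       .astype(bool))

# Step 3: keep the date only on flagged rows (everything else becomes NaN).
picked = df['date'].where(is_before_match)

# Step 4: spread the picked date over all rows of the same user.
picked = picked.groupby(df['user']).transform(lambda s: s.ffill().bfill())

# Step 5: users without any match fall back to their own dc values.
df['dc2'] = picked.fillna(df['dc'])

print(df[['user', 'date', 'dc', 'dc2']])
```

On this sample, user 1 has no matching row and keeps its dc values, while every row of user 3 gets 1995-11-09.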
                  


