This article explains how to compute a cumulative count within groups in pandas that resets when a condition is met.
Problem description
I have a dataframe with two columns, ID and Activity. The activity is either 0 or 1. I want a new column containing a number that increases with every row since the last row where the activity was 1. The count should only run within one group (ID). Whenever the activity is 1, the counting column should be reset to 0 and the count starts again.
So, I have a dataframe containing the following:
What I want is this:
Can someone help me?
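To make the requirement concrete, here is a minimal plain-Python sketch of the counter described above. The function name `count_since_last_active` and the sample lists are hypothetical. The sketch assumes that rows with the same ID are contiguous and that the count starts at 1 on the first row of a group when that row has activity 0.

```python
def count_since_last_active(ids, activity):
    """Reference loop: per-ID counter that resets to 0 when activity is 1."""
    counts = []
    previous_id = None
    counter = 0
    for current_id, active in zip(ids, activity):
        if current_id != previous_id:
            counter = 0  # a new ID starts, so nothing carries over
            previous_id = current_id
        # Reset on an active row, otherwise advance the counter.
        counter = 0 if active == 1 else counter + 1
        counts.append(counter)
    return counts


ids = list("AAABB")
activity = [0, 1, 0, 0, 1]
print(count_since_last_active(ids, activity))  # [1, 0, 1, 1, 0]
```

A loop like this is slow on large frames, but it is handy as a check against a vectorized pandas answer.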
Solution
We use a new helper column 'G' here.
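# G is True on every row where the activity is 1, i.e. where the count resets
df['G'] = df['Activeity'].eq(1)
# count the non-reset rows within each (ID, reset block) pair
df.groupby(['ID', df['G'].cumsum()])['G'].transform(lambda x: (~x).cumsum())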
Out[713]:
0 1
1 2
2 0
3 1
4 2
5 1
6 2
7 0
8 1
9 0
10 1
11 1
12 0
13 0
14 1
15 2
Name: G, dtype: int32
Data input
import pandas as pd
df = pd.DataFrame({'ID': list('AAAAABBBBBBCCCCC'), 'Activeity': [0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0]})
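As a rough end-to-end sketch, the data and the two steps of the solution can be combined into one script that stores the result in a new column. The column name `Count` and the variables `is_reset` and `block` are hypothetical names chosen for illustration. The boolean mask is cast to `int` before the grouped cumulative sum so that the result dtype does not depend on the pandas version.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': list('AAAAABBBBBBCCCCC'),
    'Activeity': [0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0],
})

# Mark the rows where the counter must reset.
is_reset = df['Activeity'].eq(1)

# Every reset row opens a new block; the label grows by one each time.
block = is_reset.cumsum().rename('block')

# Count the non-reset rows inside each (ID, block) pair.
df['Count'] = (~is_reset).astype(int).groupby([df['ID'], block]).cumsum()

print(df['Count'].tolist())
```

The printed list should match the values shown in the output above.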
Explanation:
First we build the helper column 'G', which flags the rows where the count must reset:
df['G'] = df['Activeity'].eq(1)
df
Out[134]:
    Activeity ID      G
0           0  A  False
1           0  A  False
2           1  A   True
3           0  A  False
4           0  A  False
5           0  B  False
6           0  B  False
7           1  B   True
8           0  B  False
9           1  B   True
10          0  B  False
11          0  C  False
12          1  C   True
13          1  C   True
14          0  C  False
15          0  C  False
Then we take the cumulative sum of G. It increases by one at every row where the activity is 1, so it labels each cycle at whose start the count must be set back to 0.
df.G.cumsum()
Out[135]:
0 0
1 0
2 1
3 1
4 1
5 1
6 1
7 2
8 2
9 3
10 3
11 3
12 4
13 5
14 5
15 5
Name: G, dtype: int32
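Note that this cumulative sum runs over the whole frame, so the last rows of one ID and the first rows of the next ID can share a label (rows 3 to 6 above all have label 1). The sketch below, with hypothetical variable names, illustrates why the final grouping uses ID together with the label: grouping by the label alone would let the count run on from A into B.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': list('AAAAABBBBBBCCCCC'),
    'Activeity': [0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0],
})
g = df['Activeity'].eq(1)
block = g.cumsum().rename('block')
counted = (~g).astype(int)  # 1 on rows that advance the count, 0 on resets

# Grouping by the block label alone: the count leaks across the A/B boundary.
by_block_only = counted.groupby(block).cumsum()
# Adding ID to the key restarts the count at the first row of each ID.
by_id_and_block = counted.groupby([df['ID'], block]).cumsum()

print(by_block_only.iloc[2:7].tolist())    # rows 2-6: [0, 1, 2, 3, 4]
print(by_id_and_block.iloc[2:7].tolist())  # rows 2-6: [0, 1, 2, 1, 2]
```

The second result is the one the question asks for: the count restarts at row 5, where ID changes to B.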