Finding the most frequent values per group with pandas groupby and value_counts
Question
I am working with two columns of a table.
+-------------+--------------------------------------------------------------+
| Area Name | Code Description |
+-------------+--------------------------------------------------------------+
| N Hollywood | VIOLATION OF RESTRAINING ORDER |
| N Hollywood | CRIMINAL THREATS - NO WEAPON DISPLAYED |
| N Hollywood | CRIMINAL THREATS - NO WEAPON DISPLAYED |
| N Hollywood | ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT |
| Southeast | ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT |
| West Valley | CRIMINAL THREATS - NO WEAPON DISPLAYED |
| West Valley | CRIMINAL THREATS - NO WEAPON DISPLAYED |
| 77th Street | RAPE, FORCIBLE |
| Foothill | CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)0060 |
| N Hollywood | VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS) 0114 |
+-------------+--------------------------------------------------------------+
I am using groupby and value_counts to count the code descriptions within each area name.
df.groupby(['Area Name'])['Code Description'].value_counts()
Is there a way to see only the top 'n' values for each area name? If I append .nlargest(3) to the code above, it only returns results for one area name.
+---------------------------------------------------------------------------------+
| Wilshire SHOPLIFTING-GRAND THEFT ($950.01 & OVER) 7 |
+---------------------------------------------------------------------------------+
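For context, here is a minimal sketch of what is happening, assuming the ten sample rows from the table above stand in for the full data (the variable names `counts` and `global_top` are illustrative). Calling .nlargest(3) on the combined result ranks every (area, description) pair together, so it keeps the three largest counts overall rather than three per area:

```python
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "Area Name": [
        "N Hollywood", "N Hollywood", "N Hollywood", "N Hollywood",
        "Southeast", "West Valley", "West Valley", "77th Street",
        "Foothill", "N Hollywood",
    ],
    "Code Description": [
        "VIOLATION OF RESTRAINING ORDER",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT",
        "ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "RAPE, FORCIBLE",
        "CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)0060",
        "VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS) 0114",
    ],
})

# One count per (Area Name, Code Description) pair.
counts = df.groupby("Area Name")["Code Description"].value_counts()

# nlargest runs on the whole Series, not once per area, so only
# three rows survive in total.
global_top = counts.nlargest(3)
print(global_top)
```

If one area dominates the largest counts in the full data, all surviving rows can come from that single area.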
Recommended answer
Use head on the value_counts result within each group:
df.groupby('Area Name')['Code Description'].apply(lambda x: x.value_counts().head(3))
Output:
Area Name
77th Street RAPE, FORCIBLE 1
Foothill CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)0060 1
N Hollywood CRIMINAL THREATS - NO WEAPON DISPLAYED 2
VIOLATION OF RESTRAINING ORDER 1
ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT 1
Southeast ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT 1
West Valley CRIMINAL THREATS - NO WEAPON DISPLAYED 2
Name: Code Description, dtype: int64
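The same approach can be wrapped as a small runnable sketch, again assuming the ten sample rows from the question as input. The helper name `top_n_per_area` is hypothetical and not part of pandas. A second variant, which takes the first n rows of each area from the grouped value_counts result, is offered only as an alternative and is not from the original answer; it assumes that result is sorted by count within each area, which is the default behaviour.

```python
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "Area Name": [
        "N Hollywood", "N Hollywood", "N Hollywood", "N Hollywood",
        "Southeast", "West Valley", "West Valley", "77th Street",
        "Foothill", "N Hollywood",
    ],
    "Code Description": [
        "VIOLATION OF RESTRAINING ORDER",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT",
        "ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "RAPE, FORCIBLE",
        "CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)0060",
        "VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS) 0114",
    ],
})


def top_n_per_area(frame, n=3):
    """Hypothetical helper: the answer's apply/head pattern with a variable n."""
    return frame.groupby("Area Name")["Code Description"].apply(
        lambda s: s.value_counts().head(n)
    )


# Variant 1: the approach from the answer.
via_apply = top_n_per_area(df, 3)

# Variant 2 (alternative, not from the answer): count once, then keep the
# first n rows of each area, relying on the default sort by count per area.
counts = df.groupby("Area Name")["Code Description"].value_counts()
via_head = counts.groupby(level=0).head(3)

print(via_apply)
print(via_head)
```

When several descriptions in one area share the same count, which of the tied rows make the cut depends on the order in which value_counts returns them, so the two variants may keep different tied rows.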