
Python: Sum values in DataFrame if other values match between DataFrames



I have two DataFrames of different lengths, like these:

DataFrame A:

FirstName    LastName
Adam         Smith
John         Johnson

DataFrame B:

First        Last        Value
Adam         Smith       1.2
Adam         Smith       1.5
Adam         Smith       3.0
John         Johnson     2.5

What I want to do is create a new column in DataFrame A that sums all the values with matching last names, so the output in A would be:

FirstName    LastName    Sums
Adam         Smith       5.7
John         Johnson     2.5

If I were in Excel, I'd use

=SUMIF(dfB!B:B, B2, dfB!C:C)

In Python I've tried multiple approaches using np.where, df.sum(), dropping indexes, etc., but I'm lost. The code below raises "ValueError: Can only compare identically-labeled Series objects", and I don't think it's written correctly anyway.

df_a['Sums'] = df_a[df_a['LastName'] == df_b['Last']].sum()['Value']


Huge thanks in advance for any help.


Use boolean indexing with Series.isin to filter, then aggregate with sum:

df = (df_b[df_b['Last'].isin(df_a['LastName'])]
      .groupby(['First','Last'], as_index=False)['Value']
      .sum())
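
For reference, here is a minimal self-contained sketch of the isin filter followed by a grouped sum, using the sample data from the question. The variable names `matched` and `sums` are only illustrative and do not come from the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df_a = pd.DataFrame({"FirstName": ["Adam", "John"],
                     "LastName": ["Smith", "Johnson"]})
df_b = pd.DataFrame({"First": ["Adam", "Adam", "Adam", "John"],
                     "Last": ["Smith", "Smith", "Smith", "Johnson"],
                     "Value": [1.2, 1.5, 3.0, 2.5]})

# Keep only the rows of df_b whose last name appears in df_a.
matched = df_b[df_b["Last"].isin(df_a["LastName"])]

# Sum Value per (First, Last) pair; as_index=False keeps the keys as columns.
sums = matched.groupby(["First", "Last"], as_index=False)["Value"].sum()
print(sums)
```

Note that isin only checks the last name here, so a person in df_b with a matching last name but a different first name would still be included in the result.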


If you want to match on both first and last name:

df = (df_b.merge(df_a, left_on=['First','Last'], right_on=['FirstName','LastName'])
      .groupby(['First','Last'], as_index=False)['Value']
      .sum())
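
To get the totals as a new Sums column in DataFrame A, as the question asks, one possible sketch is to aggregate df_b first and then left-merge the totals onto df_a. This is an illustrative variation on the merge approach rather than the answer's exact code, and the names `totals` and `result` are assumptions.

```python
import pandas as pd

# Sample data copied from the question.
df_a = pd.DataFrame({"FirstName": ["Adam", "John"],
                     "LastName": ["Smith", "Johnson"]})
df_b = pd.DataFrame({"First": ["Adam", "Adam", "Adam", "John"],
                     "Last": ["Smith", "Smith", "Smith", "Johnson"],
                     "Value": [1.2, 1.5, 3.0, 2.5]})

# Total Value per (First, Last) pair in df_b.
totals = df_b.groupby(["First", "Last"], as_index=False)["Value"].sum()

# A left merge keeps every row of df_a; people with no rows in df_b get NaN.
result = (df_a.merge(totals, how="left",
                     left_on=["FirstName", "LastName"],
                     right_on=["First", "Last"])
              .drop(columns=["First", "Last"])
              .rename(columns={"Value": "Sums"}))
print(result)
```

Aggregating before merging keeps one row per person in the result, so df_a's length and row order are preserved.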



