Reputation: 797
I have a pandas groupby series with 3 columns and I would like to make a change on third column according to values from another pandas dataframe
Data1 Data2(unique names)
name col1 col2 name col
a 10 -0.2 x 0.002
b 80 0.3 a 0.004
a 72 1.1 b 0.007
a 54 0.8 ...
b 90 -3.2
create pandas dataframes
df1 = pd.DataFrame.from_dict(Data1)
df2 = pd.DataFrame.from_dict(Data2)
and groupby the first dataframe
df1Groupby = df1.groupby(df1.keys()[0])
get the rows in df2 with common name as in df1
common = {}
for i in df2[df2.keys()[0]]:
if i in df1[df1.keys()[0]].unique():
common[i] = df2[df2.keys()[0]==i][df2.keys()[1]].values
dfcommon = pd.DataFrame.from_dict(common)
now I want to change col2 in Data1 for each names as
col2 = col2 + col1 * col
defining a function
def my_func(group, amt):
group[group.keys()[2]] = group[group.keys()[2]] + group[group.keys()[1]] * amt
return group
and then in the main call the function
for i in dfcommon.index:
df1Groupby.get_group(i).apply(my_func, dfcommon.loc[i].values[0],axis=1)
but then I get the following error
TypeError: apply() got multiple values for argument 'axis'
not sure if my approach is the right way and would appreciate any help.
Upvotes: 4
Views: 1585
Reputation: 1221
This is an simple as using pandas.Series.map. You'll be creating a mapping between 'name' and 'col'. You don't need to 'merge' (join) the two data frames in this case as you're only interested to retrieve one value from the second data frame.
# create a mapper
mapper = Data2.set_index('name')['col']
mapped_value = Data1['name'].map(mapper)
# crete new column
data1['new_column'] = data1['col1'] + data1['col2'] * mapped_value
Hope this is what you're trying to achieve. If not please provide more detail and desired output!
Upvotes: 5