Reputation: 529
I have a dataframe and dictionary like the following:
df =
name  characteristic  value
bob   job             doctor
bob   age             25
jim   job             doctor
jim   age             25
jim   height          6'
mydict = { 'bob': 10, 'jim': 4 }
The dictionary describes a multiplier value for all rows that have a particular name.
I want to count how many times each characteristic/value pair occurs in this dataframe, and then multiply that count by the multiplier my dictionary gives for the row's name.
The dataframe I am trying to obtain would look something like this:
df =
name  characteristic  value   count  multiplier  total
bob   job             doctor  2      10          20
bob   age             25      2      10          20
jim   job             doctor  2      4           8
jim   age             25      2      4           8
jim   height          6'      1      4           4
I am able to produce the count column, but I am stuck on getting the dictionary values into the dataframe. How can I create the multiplier column in the final dataframe shown above from my original df and dictionary?
Upvotes: 1
Views: 440
Reputation: 402573
I've broken down the steps for you:
Use groupby + transform to get counts of values:
df['count'] = df.groupby('value').value.transform('count')
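The question asks about characteristic/value pairs, so grouping on both columns may be closer to the intent. On the sample data it gives the same counts as grouping on value alone. A minimal sketch, assuming the sample frame is rebuilt by hand with all values stored as strings (the name pair_counts is only for illustration):

```python
import pandas as pd

# Sample data from the question, rebuilt by hand (values assumed to be strings).
df = pd.DataFrame({
    "name": ["bob", "bob", "jim", "jim", "jim"],
    "characteristic": ["job", "age", "job", "age", "height"],
    "value": ["doctor", "25", "doctor", "25", "6'"],
})

# Count rows per (characteristic, value) pair and broadcast the
# count back onto every row of the original frame.
pair_counts = df.groupby(["characteristic", "value"])["value"].transform("count")
df["count"] = pair_counts
```

The two groupings would diverge if the same value appeared under different characteristics, for example an age of 25 and a shoe size of 25.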
Use pd.Series.map to map names to multipliers:
df['multiplier'] = df['name'].map(mydict)
On older versions, you may consider df['multiplier'] = df['name'].replace(mydict) instead.
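map and replace are not quite interchangeable when a name is missing from the dictionary. A small sketch of the difference, using a hypothetical extra name "amy" that is not in the question's data:

```python
import pandas as pd

mydict = {"bob": 10, "jim": 4}

# "amy" is a hypothetical name with no entry in mydict.
names = pd.Series(["bob", "jim", "amy"])

# map: names without a dictionary entry become NaN.
mapped = names.map(mydict)

# replace: names without a dictionary entry are left unchanged.
replaced = names.replace(mydict)
```

With map, an unmatched name shows up as a missing multiplier; with replace, the name itself stays in the column, which would then break the multiplication in the next step.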
Finally, compute the total; this is straightforward.
df['total'] = df['count'] * df['multiplier']
df
   name characteristic   value  count  multiplier  total
0   bob            job  doctor      2          10     20
1   bob            age      25      2          10     20
2   jim            job  doctor      2           4      8
3   jim            age      25      2           4      8
4   jim         height      6'      1           4      4
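For reference, the three steps can be run end to end as one script. This is a sketch that assumes the question's data is constructed by hand with string values; it uses the same groupby('value') call as above.

```python
import pandas as pd

# Sample inputs from the question (values assumed to be strings).
df = pd.DataFrame({
    "name": ["bob", "bob", "jim", "jim", "jim"],
    "characteristic": ["job", "age", "job", "age", "height"],
    "value": ["doctor", "25", "doctor", "25", "6'"],
})
mydict = {"bob": 10, "jim": 4}

# Step 1: per-row count of how often each value occurs.
df["count"] = df.groupby("value")["value"].transform("count")

# Step 2: look up each row's multiplier by name.
df["multiplier"] = df["name"].map(mydict)

# Step 3: total is the count scaled by the multiplier.
df["total"] = df["count"] * df["multiplier"]

print(df)
```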
Upvotes: 4