Reputation: 12204
I have some data with id and val columns in DataFrame df, and some scaling factors in DataFrame scaling, which has one scaling factor for each value in id:
import pandas as pd

df = pd.DataFrame(data=dict(id=['a', 'a', 'a', 'b', 'b', 'c'], val=[1, 2, 3, 10, 11, 100]))
scaling = pd.DataFrame(data=dict(id=['a', 'b', 'c'], scaling=[1, 0.1, 0.01]))
These look like this:
In[23]: df
Out[24]:
  id  val
0  a    1
1  a    2
2  a    3
3  b   10
4  b   11
5  c  100
In[25]: scaling
Out[25]:
  id  scaling
0  a     1.00
1  b     0.10
2  c     0.01
I now want to multiply the data in df by the scaling factors. I can do it this way, but it feels awkward and is possibly inefficient:
df['val'] = df['val'] * df.merge(scaling, left_on='id', right_on='id')['scaling']
Is there a better way of applying the factors?
Upvotes: 2
Views: 707
Reputation: 889
You can map by creating a dictionary from the columns in your scaling DataFrame.
>>> df = pd.DataFrame(data=dict(id=['a', 'a', 'a', 'b', 'b', 'c'], val=[1, 2, 3, 10, 11, 100]))
>>> scaling = pd.DataFrame(data=dict(id=['a', 'b', 'c'], scaling=[1, 0.1, 0.01]))
>>> scaling_dict = dict(zip(scaling['id'], scaling['scaling']))
>>> df['multiplier'] = df['id'].map(scaling_dict)
>>> df['val'] = df['val']*df['multiplier']
>>> df
  id  val  multiplier
0  a  1.0        1.00
1  a  2.0        1.00
2  a  3.0        1.00
3  b  1.0        0.10
4  b  1.1        0.10
5  c  1.0        0.01
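One thing to watch with a dictionary lookup: any id that has no entry in the dictionary maps to NaN, so the scaled value becomes NaN too. Here is a minimal sketch of that behaviour. The id 'z' is made up for illustration, and filling the gap with 1.0 (leave the row unscaled) is only an assumption about what you might want.

```python
import pandas as pd

# 'z' is a hypothetical id that has no scaling factor
df = pd.DataFrame(dict(id=['a', 'b', 'z'], val=[1, 10, 5]))
scaling_dict = {'a': 1, 'b': 0.1}

# ids missing from the dictionary come back as NaN
multiplier = df['id'].map(scaling_dict)

# Assumption: rows without a factor should be left unscaled
df['val'] = df['val'] * multiplier.fillna(1.0)
```

If a missing factor should be treated as an error instead, checking multiplier.isna().any() before multiplying is one way to catch it.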
Upvotes: 0
Reputation: 150815
You can use map with set_index on scaling:
df['val'] * df['id'].map(scaling.set_index('id').scaling)
Output:
0    1.0
1    2.0
2    3.0
3    1.0
4    1.1
5    1.0
dtype: float64
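The expression above only returns the scaled Series. A minimal sketch of writing it back to df, using the data from the question, could look like this. The name factors is just illustrative, and passing verify_integrity=True is an optional extra that assumes you want an error if an id appears more than once in scaling.

```python
import pandas as pd

df = pd.DataFrame(dict(id=['a', 'a', 'a', 'b', 'b', 'c'], val=[1, 2, 3, 10, 11, 100]))
scaling = pd.DataFrame(dict(id=['a', 'b', 'c'], scaling=[1, 0.1, 0.01]))

# Series of factors indexed by id; verify_integrity=True raises on duplicate ids
factors = scaling.set_index('id', verify_integrity=True)['scaling']

# Look up the factor for each row by its id and overwrite val with the scaled values
df['val'] = df['val'] * df['id'].map(factors)
```

Because the lookup goes through the id values rather than the row index, this should not depend on how df itself is indexed.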
Upvotes: 2