Reputation: 238
I want to divide one pandas DataFrame by another based on matching column values. For example, let's say I have:
>>> df = pd.DataFrame({'NAME': [ 'CA', 'CA', 'CA', 'AZ', 'AZ', 'AZ', 'TX', 'TX', 'TX'], 'NUM':[1, 2, 3, 1, 2, 3, 1, 2, 3], 'VALUE': [10, 20, 30, 40, 50, 60, 70, 80, 90]})
>>> df
NAME NUM VALUE
0 CA 1 10
1 CA 2 20
2 CA 3 30
3 AZ 1 40
4 AZ 2 50
5 AZ 3 60
6 TX 1 70
7 TX 2 80
8 TX 3 90
>>> states = pd.DataFrame({'NAME': ['CA', "AZ", "TX"], 'DIVISOR': [10, 5, 1]})
>>> states
NAME DIVISOR
0 CA 10
1 AZ 5
2 TX 1
For each NAME and NUM, I want to divide the VALUE column in df by the DIVISOR column of the respective state.
This would give the following result:
>>> result = pd.DataFrame({'NAME': [ 'CA', 'CA', 'CA', 'AZ', 'AZ', 'AZ', 'TX', 'TX', 'TX'], 'NUM':[1, 2, 3, 1, 2, 3, 1, 2, 3], 'VALUE': [1, 2, 3, 8, 10, 12, 70, 80, 90]})
>>> result
NAME NUM VALUE
0 CA 1 1
1 CA 2 2
2 CA 3 3
3 AZ 1 8
4 AZ 2 10
5 AZ 3 12
6 TX 1 70
7 TX 2 80
8 TX 3 90
Upvotes: 0
Views: 1948
Reputation: 884
You can use merge as well
result = df.merge(states,on=['NAME'])
result['NEW VALUE'] = result.VALUE/result.DIVISOR
print(result)
NAME NUM VALUE DIVISOR NEW VALUE
0 CA 1 10 10 1.0
1 CA 2 20 10 2.0
2 CA 3 30 10 3.0
3 AZ 1 40 5 8.0
4 AZ 2 50 5 10.0
5 AZ 3 60 5 12.0
6 TX 1 70 1 70.0
7 TX 2 80 1 80.0
8 TX 3 90 1 90.0
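If you want exactly the frame from the question (VALUE overwritten, no extra columns), the same merge idea can be extended roughly as below. This is a sketch, not the only way to do it: `merged` and `exact` are names chosen here for illustration, and the `how='left'` and `validate='many_to_one'` arguments are additions that the answer above does not use.

```python
import pandas as pd

df = pd.DataFrame({
    'NAME': ['CA', 'CA', 'CA', 'AZ', 'AZ', 'AZ', 'TX', 'TX', 'TX'],
    'NUM': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    'VALUE': [10, 20, 30, 40, 50, 60, 70, 80, 90],
})
states = pd.DataFrame({'NAME': ['CA', 'AZ', 'TX'], 'DIVISOR': [10, 5, 1]})

# how='left' keeps every row of df, in df's order.
# validate='many_to_one' raises if a state is listed twice in `states`.
merged = df.merge(states, on='NAME', how='left', validate='many_to_one')

# Overwrite VALUE with the quotient, then drop the helper column.
merged['VALUE'] = merged['VALUE'] / merged['DIVISOR']
exact = merged.drop(columns='DIVISOR')
print(exact)
```

Note that VALUE comes back as floats, because `/` is true division.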
Upvotes: 3
Reputation: 19
I feel like there must be a more elegant way to accomplish what you are looking for, but this is the route that I usually take.
myresult = df.copy()
for i in range(len(df['NAME'])):
    for j in range(len(states['NAME'])):
        if df['NAME'][i] == states['NAME'][j]:
            # assign through .loc rather than chained indexing
            myresult.loc[i, 'VALUE'] = df['VALUE'][i] / states['DIVISOR'][j]
myresult.head()
Out[10]>>
NAME NUM VALUE
0 CA 1 1
1 CA 2 2
2 CA 3 3
3 AZ 1 8
4 AZ 2 10
This is a very brute force method. You start by looping through each value in the data frame df, then you loop through each element in the data frame states. Then, for each comparison, you check whether the NAME columns match. If they do, you compute VALUE / DIVISOR.
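Note that chained assignment of the form myresult['VALUE'][i] = ... raises a pandas warning (the .copy() call is not the cause); assigning through myresult.loc[i, 'VALUE'] avoids it.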
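If you like the explicit row-by-row style but want to avoid the inner loop, one option is to build a NAME to DIVISOR lookup once and use it for every row. This is only a sketch: `divisor_by_name` and `new_values` are names made up here, and it assumes every NAME in df also appears in states (otherwise the dictionary lookup raises a KeyError).

```python
import pandas as pd

df = pd.DataFrame({
    'NAME': ['CA', 'CA', 'CA', 'AZ', 'AZ', 'AZ', 'TX', 'TX', 'TX'],
    'NUM': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    'VALUE': [10, 20, 30, 40, 50, 60, 70, 80, 90],
})
states = pd.DataFrame({'NAME': ['CA', 'AZ', 'TX'], 'DIVISOR': [10, 5, 1]})

# Build the lookup once instead of rescanning `states` for every row of df.
divisor_by_name = dict(zip(states['NAME'], states['DIVISOR']))

# One pass over df: divide each VALUE by the divisor of its state.
new_values = [
    value / divisor_by_name[name]
    for name, value in zip(df['NAME'], df['VALUE'])
]

# assign() returns a new frame, so df itself is left untouched.
myresult = df.assign(VALUE=new_values)
print(myresult.head())
```

This does one pass over df rather than one pass over states per row, and it never writes into the frame cell by cell.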
Upvotes: 0
Reputation: 323226
Let's use map:
df['NEW VALUE'] = df['VALUE'].div(df['NAME'].map(states.set_index('NAME')['DIVISOR']))
df
Out[129]:
NAME NUM VALUE NEW VALUE
0 CA 1 10 1.0
1 CA 2 20 2.0
2 CA 3 30 3.0
3 AZ 1 40 8.0
4 AZ 2 50 10.0
5 AZ 3 60 12.0
6 TX 1 70 70.0
7 TX 2 80 80.0
8 TX 3 90 90.0
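One thing to keep in mind with map: a NAME that has no row in states maps to NaN, so the quotient is NaN as well. The sketch below shows that on a smaller frame. The 'NV' row is hypothetical and not part of the question's data, and `divisors` and `missing` are names chosen here for illustration.

```python
import pandas as pd

# 'NV' is a made-up state with no entry in `states`.
df = pd.DataFrame({'NAME': ['CA', 'AZ', 'NV'], 'VALUE': [10, 40, 7]})
states = pd.DataFrame({'NAME': ['CA', 'AZ', 'TX'], 'DIVISOR': [10, 5, 1]})

# Look up each row's divisor; unmatched names become NaN.
divisors = df['NAME'].map(states.set_index('NAME')['DIVISOR'])

# Flag the rows that had no matching state before dividing.
missing = divisors.isna()
print(df.loc[missing, 'NAME'].tolist())

df['NEW VALUE'] = df['VALUE'].div(divisors)
print(df)
```

Checking `missing` before dividing lets you decide whether to drop those rows, fill in a default divisor, or raise an error.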
Upvotes: 4