cubecubed
cubecubed

Reputation: 238

Pandas Divide Dataframe by Another Based on Column Values

I want to divide a pandas dataframe by another based on the column values. For example let's say I have:

>>> df = pd.DataFrame({'NAME': [ 'CA', 'CA', 'CA', 'AZ', 'AZ', 'AZ', 'TX', 'TX', 'TX'], 'NUM':[1, 2, 3, 1, 2, 3, 1, 2, 3], 'VALUE': [10, 20, 30, 40, 50, 60, 70, 80, 90]})
>>> df
  NAME  NUM  VALUE
0   CA    1     10
1   CA    2     20
2   CA    3     30
3   AZ    1     40
4   AZ    2     50
5   AZ    3     60
6   TX    1     70
7   TX    2     80
8   TX    3     90
>>> states = pd.DataFrame({'NAME': ['CA', "AZ", "TX"], 'DIVISOR': [10, 5, 1]})
>>> states
  NAME  DIVISOR
0   CA       10
1   AZ        5
2   TX        1

For each STATE and NUM I want to divide the VALUE column in df by the divisor COLUMN of the respective state.

Which would give a result of

>>> result = pd.DataFrame({'NAME': [ 'CA', 'CA', 'CA', 'AZ', 'AZ', 'AZ', 'TX', 'TX', 'TX'], 'NUM':[1, 2, 3, 1, 2, 3, 1, 2, 3], 'VALUE': [1, 2, 3, 8, 10, 12, 70, 80, 90]})
>>> result
  NAME  NUM  VALUE
0   CA    1      1
1   CA    2      2
2   CA    3      3
3   AZ    1      8
4   AZ    2     10
5   AZ    3     12
6   TX    1     70
7   TX    2     80
8   TX    3     90

Upvotes: 0

Views: 1948

Answers (3)

Arun Joy Thekkiniyath
Arun Joy Thekkiniyath

Reputation: 884

You can use merge as well

result = df.merge(states,on=['NAME'])
result['NEW VALUE'] = result.VALUE/result.DIVISOR
print(result)

NAME  NUM  VALUE  NEW VALUE  DIVISOR
0   CA    1     10        1.0       10
1   CA    2     20        2.0       10
2   CA    3     30        3.0       10
3   AZ    1     40        8.0        5
4   AZ    2     50       10.0        5
5   AZ    3     60       12.0        5
6   TX    1     70       70.0        1
7   TX    2     80       80.0        1
8   TX    3     90       90.0        1

Upvotes: 3

Lemon
Lemon

Reputation: 19

I feel like there must be a more eloquent way to accomplish what you are looking for, but this is the rout that I usually take.

myresult = df.copy()
for i in range(len(df['NAME'])):
    for j in range(len(states['NAME'])):
        if  df['NAME'][i] == states['NAME'][j]:
            myresult['VALUE'][i] = df['VALUE'][i]/states['DIVISOR'][j]
    
myresult.head()

Out[10]>>
  NAME  NUM VALUE
0   CA  1   1
1   CA  2   2
2   CA  3   3
3   AZ  1   8
4   AZ  2   10

This is a very brute force method. You start by looping through each value in the data frame df, then you loop through each element in the data frame states. Then for each comparison, you look to see if the NAME columns match. If they do, you do the VALUE / DIVISOR.

You will get a warring for using the .copy() method

Upvotes: 0

BENY
BENY

Reputation: 323226

Let us do map

df['NEW VALUE'] = df['VALUE'].div(df['NAME'].map(states.set_index('NAME')['DIVISOR']))
df
Out[129]: 
  NAME  NUM  VALUE  NEW VALUE
0   CA    1     10        1.0
1   CA    2     20        2.0
2   CA    3     30        3.0
3   AZ    1     40        8.0
4   AZ    2     50       10.0
5   AZ    3     60       12.0
6   TX    1     70       70.0
7   TX    2     80       80.0
8   TX    3     90       90.0

Upvotes: 4

Related Questions