Reputation: 109
I have a dataframe:
df = pd.DataFrame({'date': ['2013-04-01','2013-04-01','2013-04-01','2013-04-02', '2013-04-02'],
'month': ['1','1','3','3','5'],
'pmonth': ['1', '1', '2', '5', '5'],
'duration': [30, 15, 20, 15, 30],
'pduration': ['10', '20', '30', '40', '50']})
I have to divide duration
and pduration
by value column of second dataframe where date and month of two df
match. The second df
is:
df = pd.DataFrame({'date': ['2013-04-01','2013-04-02','2013-04-03','2013-04-04', '2013-04-05'],
'month': ['1','1','3','3','5'],
'value': ['1', '1', '2', '5', '5'],
})
The second df
is grouped by date and month, so duplicate combination of date month won't be present in the second df
.
Upvotes: 3
Views: 2503
Reputation: 862591
First is necessary check if same dtypes
of column date
and month
in both DataFrames
and if numeric for columns for divide:
#convert to numeric
df1['pduration'] = df1['pduration'].astype(int)
df2['value'] = df2['value'].astype(int)
print (df1.dtypes)
date object
month object
pmonth object
duration int64
pduration int32
print (df2.dtypes)
date object
month object
value int32
dtype: object
Then merge
with left join and divide by DataFrame.div
df = df1.merge(df2, on=['date', 'month'], how='left')
df[['duration_new','pduration_new']] = df[['duration','pduration']].div(df['value'], axis=0)
print (df)
date month pmonth duration pduration value duration_new \
0 2013-04-01 1 1 30 10 1.0 30.0
1 2013-04-01 1 1 15 20 1.0 15.0
2 2013-04-01 3 2 20 30 NaN NaN
3 2013-04-02 3 5 15 40 NaN NaN
4 2013-04-02 5 5 30 50 NaN NaN
pduration_new
0 10.0
1 20.0
2 NaN
3 NaN
4 NaN
For remove value
column use pop
:
df[['duration_new','pduration_new']] = (df[['duration','pduration']]
.div(df.pop('value'), axis=0))
print (df)
date month pmonth duration pduration duration_new pduration_new
0 2013-04-01 1 1 30 10 30.0 10.0
1 2013-04-01 1 1 15 20 15.0 20.0
2 2013-04-01 3 2 20 30 NaN NaN
3 2013-04-02 3 5 15 40 NaN NaN
4 2013-04-02 5 5 30 50 NaN NaN
Upvotes: 3
Reputation: 4199
you can merge the two dataframes, where the date and month match the value column will be added to the first data frame. If there is no match it will represented by NaN. You can then do division operation. see code below
Assuming your second dataframe is df2, then
df3 = df2.merge(df, how = 'right')
for col in ['duration','pduration']:
df3['new_'+col] = df3[col].astype(float)/df3['value'].astype(float)
df3
results in
date month value pmonth duration pduration newduration newpduration
0 2013-04-01 1 1 1 30 10 30.0 10.0
1 2013-04-01 1 1 1 15 20 15.0 20.0
2 2013-04-01 3 NaN 2 20 30 NaN NaN
3 2013-04-02 3 NaN 5 15 40 NaN NaN
4 2013-04-02 5 NaN 5 30 50 NaN NaN
Upvotes: 0
Reputation: 143
You can merge the second df into the first df and then divide.
Consider the first df as df1
and second df as df2
df1 = df1.merge(df2, on=['date', 'month'], how='left').fillna(1)
df1
date month pmonth duration pduration value
0 2013-04-01 1 1 30 10 1
1 2013-04-01 1 1 15 20 1
2 2013-04-01 3 2 20 30 1
3 2013-04-02 3 5 15 40 1
4 2013-04-02 5 5 30 50 1
df1['duration'] = df1['duration'] / df1['value']
df1['pduration'] = df1['pduration'] / df1['value']
df1.drop('value', axis=1, inplace=True)
Upvotes: 0