Reputation: 330
I'm currently working on a dataframe named df
that requires first to filter it to a new dataframe named df1
based on date
and num_posts column
, next step is set the value of num_posts column to 10
and increment date column by one month
of the filtered df1
.
the logic to filter df
to df1
is:
if date == today & num_posts == 4
.
The last step once i update the selected columns for df1
mentioned above, is to update df
from df1
with this line of code df.update(df1)
What is working.
I'm able to filter df
to df1
based on the logic and update date and num_posts
columns within the filtered df1
.
What is not working
When i try to update df
from df1
, it only updates the date column
, but cannot update num_posts
values.
My code:
import pandas as pd
import datetime
df = pd.DataFrame({'num_posts': [5, 4, 4, 4, 1, 14],
'date': ['2022-06-10', '2022-06-14',
'2022-06-14', '2020-09-12',
'2020-09-29', '2020-10-15'],
'user': ['user4', 'user1', 'user1', 'user3', 'user4', 'user4']})
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d')
# Logic one start
# Get current date
new_date = datetime.datetime.now()
current_date = new_date.strftime("%Y-%m-%d")
# filter posts that equal 4 and date equal today
df1 = df.loc[(df['num_posts'] == 4) & (df['date'] == current_date)].copy()
# # overwrite the num_posts column with 10
df1.loc[df1['num_posts'] == 4, 'num_posts'] = 10
# df1.replace({'num_posts': {4: 10}}, inplace=True)
# Increment date by one month
plus_month_period = 1
df1 = df1['date'] + pd.DateOffset(months=plus_month_period)
df1
# updating the old dataframe
df.update(df1)
df
When i run my code i get the following output which is not the intended output.
num_posts date user
0 5 2022-06-10 user4
1 4 2022-07-14 user1
2 4 2022-07-14 user1
3 4 2020-09-12 user3
4 1 2020-09-29 user4
5 14 2020-10-15 user4
What i Expect as output once i run my code above(have modified it manually).
num_posts date user
0 5 2022-06-10 user4
1 10 2022-07-14 user1
2 10 2022-07-14 user1
3 4 2020-09-12 user3
4 1 2020-09-29 user4
5 14 2020-10-15 user4
What am i doing wrong?
Upvotes: 1
Views: 171
Reputation: 862406
Assign column date
:
df1['date'] = df1['date'] + pd.DateOffset(months=plus_month_period)
Or:
df1['date'] += pd.DateOffset(months=plus_month_period)
df.update(df1)
print (df)
num_posts date user
0 5.0 2022-06-10 user4
1 10.0 2022-07-14 user1
2 10.0 2022-07-14 user1
3 4.0 2020-09-12 user3
4 1.0 2020-09-29 user4
5 14.0 2020-10-15 user4
Alternative solution is assign to original DataFrame by mask and DataFrame.loc
:
new_date = datetime.datetime.now()
current_date = new_date.strftime("%Y-%m-%d")
plus_month_period = 1
m = (df['num_posts'] == 4) & (df['date'] == current_date)
df.loc[m, 'num_posts'] = 10
df.loc[m, 'date'] += pd.DateOffset(months=plus_month_period)
print (df)
num_posts date user
0 5 2022-06-10 user4
1 10 2022-07-14 user1
2 10 2022-07-14 user1
3 4 2020-09-12 user3
4 1 2020-09-29 user4
5 14 2020-10-15 user4
Upvotes: 1