ruby
ruby

Reputation: 71

Addition of days to all the date columns in pandas using for loop

I have a dataframe of date strings which is a subset of big data frame.The first 5 rows of this dataframe are

dates_df = pd.DataFrame(
    {'date1':['2020-06-16','2020-06-16','2020-06-17','2020-06-21','2020-06-11'],
     'date2':['2020-06-10','2020-06-10','2020-06-10','2020-06-10','2020-06-10'],
     'date3':['2018-12-20','2018-12-19','2019-09-20','2019-05-08','2020-06-13'],
     'date4':['2020-07-07','2020-06-30','2020-07-01','2020-07-03','2020-07-24']})

I have declared a variable called new_date='2020-06-29'

Define a variable called diff_date such that diff_date=new_date-dates_df['date2'] Then this diff_date should be added to all the date values in each date columnwhich should be my final_df. The steps I have taken so far are converting these values to pd.to_datetime() and got the difference also. Got final df with the given answer given. ############################ Modified the above data frame a bit and wrote a function like this.

dates_df=pd.DataFrame(
 {'date1':['2020-06-16','2020-06-16','2020-06-17','2020-06-21','2020-06-    
 11'],'date2':['2020-06-10','2020-08-10','2020-06-10','2020-09-13','2020-06- 
 10'],'date3':['2018-12-20','2018-12-19','2019-09-20','2019-05-08','2020-06- 
 13'],'date4':['2020-07-07','2020-06-30','2020-07-01','2020-07-03','2020-07- 
 24']}) 

def datechange_func(dataset,variablelist,new_run_date,temp_run_date=None):
   dataset[variablelist]=dataset[variablelist].apply(pd.to_datetime)
   li= list(dataset['date2'].unique())
   if dataset.loc[dataset['date2'].any().isin(li)]:
       dataset['date2'] = max(li)
   else:
       dataset['date2'] = temp_run_date

   diff = new_run_date - dataset['date2']

   dataset[variablelist]=dataset[variablelist].add(diff,axis=0)

   return dataset

variablelist1=['date1', 'date2', 'date3', 'date4'] final_df=datechange_func(dates_df,variablelist1,new_run_date,temp_run_date=None)

It should update dates_df['date2'] to get the maximum unique value in all the rows of date2 column But I get this error TypeError: invalid_op() got an unexpected keyword argument 'skipna' How to rectify this?

Upvotes: 2

Views: 78

Answers (1)

Shubham Sharma
Shubham Sharma

Reputation: 71689

Using pd.to_datetime convert the date like columns to pandas datetime series, then use DataFrame.add along axis=0 to add diff to the dataframe:

dates_df = dates_df.apply(pd.to_datetime)
diff = pd.Timestamp('2020-06-29') - dates_df['date2']
final_df = dates_df.add(diff, axis=0)

Result:

print(final_df)
      date1      date2      date3      date4
0 2020-07-05 2020-06-29 2019-01-08 2020-07-26
1 2020-07-05 2020-06-29 2019-01-07 2020-07-19
2 2020-07-06 2020-06-29 2019-10-09 2020-07-20
3 2020-07-10 2020-06-29 2019-05-27 2020-07-22
4 2020-06-30 2020-06-29 2020-07-02 2020-08-12

Upvotes: 2

Related Questions