Reputation: 89
How can I parser all the sales and date column in the dataframe below
import pandas as pd
data = [['30-06-2021', 43578, '31-01-2022', 5000, '28-02-2022', 78564, '31-03-2022', 52353, '30-04-2022'],
['14-06-2021', 4475, '14-01-2022', 2546, '05-02-2022', 5757, '28-03-2022', 2352, '01-04-2022']]
ds = pd.DataFrame(data, columns = ['Start', 'Sales_Jan-22', 'Jan-22', 'Sales_Feb-22', 'Feb-22', 'Sales_Mar-22', 'Mar-22',
'Sales_Apr-22', 'Apr-22'])
Start Sales_Jan-22 Jan-22 Sales_Feb-22 Feb-22 Sales_Mar-22 Mar-22 Sales_Apr-22 Apr-22
0 30-06-2021 43578 31-01-2022 5000 28-02-2022 78564 31-03-2022 52353 30-04-2022
1 14-06-2021 4475 14-01-2022 2546 05-02-2022 5757 28-03-2022 2352 01-04-2022
So far, this is what I have done. However I received an error for the Sales_%b-%y
and %b-%y
from dateutil import parser
gross_date = parser.parse(ds['Start'][0])
sales_col = parser.parse(ds['Sales_%b-%y'][0])
monthly_date = parser.parse(ds['%b-%y'][0])
How should I do the parser for date format? My desired output should be like this
Start Sales_Jan-22 Jan-22 Sales_Feb-22 Feb-22 Sales_Mar-22 Mar-22 Sales_Apr-22 Apr-22 Total
0 30-06-2021 43578 31-01-2022 5000 28-02-2022 78564 31-03-2022 52353 30-04-2022 3525
1 14-06-2021 4475 14-01-2022 2546 05-02-2022 5757 28-03-2022 2352 01-04-2022 63747
Because later I'm going to use is create a formula to get the Total
column as shown above.
Upvotes: 0
Views: 64
Reputation: 23166
IIUC, you can use pd.to_datetime
to force convert all the date values and leave the numbers unchanged:
ds = ds.apply(pd.to_datetime, errors="ignore", format="%d-%m-%Y")
To check the values have been converted:
>>> ds["Start"].iat[1]
Timestamp('2021-06-14 00:00:00')
>>> ds["Jan-22"].iat[0]
Timestamp('2022-01-31 00:00:00')
Upvotes: 1