Jason
Jason

Reputation: 65

How can I group by a part of timestamp value in pandas?

I try to split and output the csv file. I must use the date to be the file name but don't need the time. So I want to split the Order_Date, which is a timestamp that has both date and time.
How can I group by a part of value in pandas? There is my code:

import csv
import re
import pandas as pd
import os

df = pd.read_csv('test.csv',delimiter='|')

for i,x in df.groupby('Order_Date'):
    p = os.path.join(r'~/Desktop/',("data_{}.csv").format(i.lower()))
    x.to_csv(p,sep = '|', index=False)

Now I can get this:

data_2019-07-23 00:06:00.csv
data_2019-07-23 00:06:50.csv
data_2019-07-23 00:06:55.csv
data_2019-07-28 12:31:00.csv

Example test.csv data:

Channel|Store_ID|Store_Code|Store_Type|Order_ID|Order_Date|Member_ID|Member_Tier|Coupon_ID|Order_Total|Material_No|Material_Name|Size|Quantity|Unit_Price|Line_Total|Discount_Amount
ECOM|ECOM|ECOM|ECOM|A190700|2019-07-23 00:06:00||||1064.00|7564|Full Zip|750|1.00|399.00|168.00|231.00
ECOM|ECOM|ECOM|ECOM|A190700|2019-07-23 00:06:00||||1064.00|1361|COOL TEE|200|1.00|199.00|84.00|115.00
ECOM|ECOM|ECOM|ECOM|A190700|2019-07-23 00:06:00||||1064.00|7699|PANT|690|1.00|499.00|210.00|289.00
ECOM|ECOM|ECOM|ECOM|A190700|2019-07-23 00:06:00||||1064.00|8700|AI DRESS|690|1.00|399.00|196.00|203.00
ECOM|ECOM|ECOM|ECOM|A190700|2019-07-23 00:06:50||||1064.00|8438|COPA|690|1.00|229.00|112.00|117.00
ECOM|ECOM|ECOM|ECOM|A190700|2019-07-23 00:06:55||||1064.00|8324|CLASS|350|1.00|599.00|294.00|305.00
ECOM|ECOM|ECOM|ECOM|A190701|2019-07-28 12:31:00||||798.00|3689|DRESS|500|1.00|699.00|294.00|405.00

Expect I get this:

data_2019-07-23.csv
data_2019-07-28.csv

Any help would be very much appreciated.

Upvotes: 0

Views: 344

Answers (2)

jukeboX
jukeboX

Reputation: 36

You can try making the i a string and then using .split() and then using the 0 index:

str(i).split()[0]

so replaced in your code:

for i,x in df.groupby('Order_Date'):
    p = os.path.join(r'~/Desktop/',("data_{}.csv").format(str(i).split()[0]))
    x.to_csv(p,sep = '|', index=False)

Upvotes: 0

xyzjayne
xyzjayne

Reputation: 1387

You need to convert Order_Date to dates - stripping the time information. One quick way to do this is:

df['Order_Date1'] = pd.to_datetime(df['Order_Date']).dt.strftime('%Y-%m-%d')

Then proceed with a groupby using Order_Date1.

Upvotes: 2

Related Questions