How to parse CSV file using pandas?

Question

I have a .csv file with a time column whose values look like "20140203 00:00:03.132". How can I drop the seconds part (":03.132") efficiently? The data set is huge, and I tried preprocessing it with sed, but that was too slow!

I am now trying to parse the .csv file with pandas. Is there any way to handle this efficiently? Methods other than pandas are also welcome!

olofom · Accepted Answer

There is a handy standard-library module for parsing timestamps, datetime:

import datetime

x = '20140203 00:00:03.132'
# Parse the full timestamp, including the fractional seconds (%f)
timestamp = datetime.datetime.strptime(x, '%Y%m%d %H:%M:%S.%f')
# Format it again without the seconds part
print(timestamp.strftime('%Y%m%d %H:%M'))  # 20140203 00:00

Or, since that is a bit slow for a huge amount of data, you can split the string once from the right on the last : and take the first element of the resulting list:

print(x.rsplit(':', 1)[0])  # 20140203 00:00
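Since the question asks about pandas, the same right-split can probably be applied to a whole column at once with the vectorized string methods. This is only a minimal sketch: the column names time and price and the inline sample data are made up for illustration, and the real file would be passed to read_csv by path.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real .csv file
csv_text = (
    "time,price\n"
    "20140203 00:00:03.132,10.5\n"
    "20140203 00:01:59.001,10.7\n"
)

# Read the time column as plain strings so no date parsing is attempted
df = pd.read_csv(io.StringIO(csv_text), dtype={"time": str})

# Split each value once from the right on ':' and keep the left part
df["minute"] = df["time"].str.rsplit(":", n=1).str[0]

print(df["minute"].tolist())
```

The result stays a string column. If real datetime values are needed afterwards, pd.to_datetime with format='%Y%m%d %H:%M' can be applied to the shortened column.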
