Denzel
Denzel

Reputation: 358

Pandas read csv dataframe rows from specific date and time range

I have a dataframe from CSV file as follows,

         TimeStamp
0        12/7/2017 8:00
1        12/7/2017 7:00
2        12/7/2017 6:00
3        12/7/2017 5:00
4        12/7/2017 4:00
5        12/7/2017 3:00
6        12/7/2017 2:00
7        12/7/2017 1:00
8        12/7/2017 0:00
9       11/7/2017 23:00
10      11/7/2017 22:00
...
9996     3/12/2015 6:00
9997     3/12/2015 5:00
9998     3/12/2015 4:00
9999     3/12/2015 3:00
Name: TimeStamp, Length: 10000, dtype: object

I am trying to use Pandas to read the data from a specific date and time range for example, from 11/7/2017 8:00 to 12/7/2017 8:00.

I have tried using Boolean mask, DatetimeIndex and .Between methods and it read data out of that range and from 2016 and 2015 as well. Here are my codes,

import pandas as pd

eurusd = pd.read_csv('fxhistoricaldata_EURUSD_hour.csv')
eurusd = eurusd[(eurusd['TimeStamp'] >= '11/7/2017 8:00') & 
(eurusd['TimeStamp'] <= '12/7/2017 8:00')]

print(eurusd['TimeStamp'])

or using .between,

eurusd = eurusd[eurusd['TimeStamp'].between('11/7/2017 8:00', '12/7/2017 8:00')]

The results are as such,

2        12/7/2017 6:00
3        12/7/2017 5:00
4        12/7/2017 4:00
5        12/7/2017 3:00
6        12/7/2017 2:00
7        12/7/2017 1:00
8        12/7/2017 0:00
23       11/7/2017 9:00
24       11/7/2017 8:00
513     12/6/2017 23:00
514     12/6/2017 22:00
515     12/6/2017 21:00
516     12/6/2017 20:00
517     12/6/2017 19:00
518     12/6/2017 18:00
519     12/6/2017 17:00
520     12/6/2017 16:00
521     12/6/2017 15:00
522     12/6/2017 14:00
523     12/6/2017 13:00
524     12/6/2017 12:00
525     12/6/2017 11:00
         ...       
8827     12/2/2016 5:00
8828     12/2/2016 4:00
8829     12/2/2016 3:00
Name: TimeStamp, Length: 305, dtype: object

Can anyone help me rectify my problem or are there any function that can help me fulfill my task? Any help is greatly appreciated!

Upvotes: 3

Views: 4122

Answers (2)

snapcrack
snapcrack

Reputation: 1811

You can do this with the datetime and dateutil libraries:

from dateutil import parser
from datetime import datetime

df['TimeStamp'] = df['TimeStamp'].apply(lambda x: parser.parse(x))

df[(df['TimeStamp'] < datetime(2017, 12, 7)) & (df['TimeStamp']> datetime(2015, 3, 12))] #or whichever dates you're looking for

Upvotes: 1

Gal Dreiman
Gal Dreiman

Reputation: 4009

You need to Ensure eurusd['TimeStamp'] is a series with dtype datetime64[ns]:

eurusd['TimeStamp'] = pd.to_datetime(eurusd['TimeStamp'])

Create a boolean mask:

mask = (eurusd['TimeStamp'] > start_date) & (eurusd['TimeStamp'] <= end_date)

Now, Re-assign (or whatever you want to do with the output):

eurusd = eurusd.loc[mask]

Upvotes: 3

Related Questions