Reputation: 34
I have a dataframe with a timestamp column in seconds-since-epoch format. Its dtype is float.
I want to filter the dataframe by a specific time window.
Approach:
zombieData[(zombieData['record-ts'] > period_one_start) & (zombieData['record-ts'] < period_one_end)]
This returns an empty dataframe. I can confirm that the data contains timestamps before, after, and within my time window. I calculate my bounds with the following method:
period_one_start = datetime.strptime('2020-12-06 03:30:00', '%Y-%m-%d %H:%M:%S').timestamp()
I'd be glad of any help. I guess my filtering logic is wrong, which confuses me, because filtering on a single condition (e.g. everything after the start time) works.
Thanks for your help!
Upvotes: 0
Views: 553
Reputation: 1016
This looks messy, but I highly recommend it. Converting to pd.Timestamp before comparing is the most robust way to get a correct comparison, and calling the .gt and .lt methods instead of the > and < operators may compute a little quicker in most situations (especially for larger dataframes).
zombieData[zombieData['record-ts'].gt(pd.Timestamp('2020-12-06')) &
zombieData['record-ts'].lt(pd.Timestamp('2020-12-09'))]
New Option: I learned of the between method. I think this is easier to read.
zombieData[zombieData['record-ts'].between(left=pd.Timestamp('2020-12-06'),
right=pd.Timestamp('2020-12-09'),
inclusive="neither")]
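Since the question's column holds float seconds rather than datetimes, here is a minimal sketch of the conversion step that the snippets above rely on. The sample values are made up for illustration, and I am assuming the floats are UTC seconds since the epoch. Note that inclusive="neither" needs pandas 1.3 or newer.

```python
import pandas as pd

# Hypothetical sample data: float seconds since the epoch (assumed UTC),
# one row per day starting at 2020-12-01 00:00:00.
zombieData = pd.DataFrame(
    {"record-ts": [1606780800.0 + 86400.0 * i for i in range(12)]}
)

# unit="s" tells pandas the numbers are seconds, not nanoseconds.
ts = pd.to_datetime(zombieData["record-ts"], unit="s")

# Filter with a boolean mask built from the converted series;
# the original float column is left untouched.
window = zombieData[ts.between(pd.Timestamp("2020-12-06 03:30:00"),
                               pd.Timestamp("2020-12-09 03:30:00"),
                               inclusive="neither")]
print(window)
```

The converted series is only used as a mask here, so the dataframe keeps its float column. If you would rather work with datetimes everywhere, you could assign the converted series back to the column instead.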
Upvotes: 1
Reputation: 2697
import pandas as pd
from datetime import datetime
import numpy as np
# Build twelve consecutive daily timestamps starting at 2020-12-01.
date = np.array('2020-12-01', dtype=np.datetime64)
dates = date + np.arange(12)
# Epoch-second bounds as computed in the question (not used by the filter below).
period_one_start = datetime.strptime('2020-12-06 03:30:00', '%Y-%m-%d %H:%M:%S').timestamp()
period_one_end = datetime.strptime('2020-12-09 03:30:00', '%Y-%m-%d %H:%M:%S').timestamp()
zombieData = pd.DataFrame(data={"record-ts": dates})
# The column is datetime64, so it can be compared directly against date strings.
zombieData[(zombieData['record-ts'] > '2020-12-06') & (zombieData['record-ts'] < '2020-12-09')]
(if you want to keep your format)
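If you would rather keep the float column from the question as it is, the sketch below compares it against float bounds instead. One thing worth checking: calling .timestamp() on a naive datetime interprets it as local time, so the bounds shift with the machine's timezone. The helper to_epoch_utc is a hypothetical name of mine, the sample data is made up, and I am assuming the column stores UTC seconds. If the values were actually in milliseconds, for example, no bound computed this way would match.

```python
from datetime import datetime, timezone

import pandas as pd

# Hypothetical sample data: float seconds since the epoch (assumed UTC),
# one row per day starting at 2020-12-01 00:00:00.
zombieData = pd.DataFrame(
    {"record-ts": [1606780800.0 + 86400.0 * i for i in range(12)]}
)


def to_epoch_utc(text):
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC and return seconds since the epoch."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    # Attach UTC explicitly so the result does not depend on the local timezone.
    return parsed.replace(tzinfo=timezone.utc).timestamp()


period_one_start = to_epoch_utc("2020-12-06 03:30:00")
period_one_end = to_epoch_utc("2020-12-09 03:30:00")

# Each condition needs its own parentheses because & binds tighter than > and <.
mask = (zombieData["record-ts"] > period_one_start) & (zombieData["record-ts"] < period_one_end)
window = zombieData[mask]
print(window)
```

Printing zombieData["record-ts"].min() and .max() next to the two bounds is a quick way to see whether they are on the same scale.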
Upvotes: 0