Reputation: 109
I am having trouble selecting rows from a dataframe with a datetime column from a CSV file. I have my code and my sample data. There are no rows returned.
import pandas as pd
import numpy as np
col_names = ['date','msft','nok','aapl','ibm','amzn']
stockprices = pd.read_csv('./stockdata.csv', skiprows=1, names=col_names)
stockprices.loc['2018-01-01 9:00:00':'2018-01-01 11:00:00']
CSV file data:
date msft nok aapl ibm amzn
2018-01-01 08:00:00 107 2 161 140 1295
2018-01-01 09:00:00 112 1 143 130 1298
2018-01-01 10:00:00 109 10 185 137 1647
2018-01-01 11:00:00 98 11 146 105 1331
2018-01-01 12:00:00 83 3 214 131 1355
Basically, trying to select the 3 rows with 9:00:00, 10:00:00 and 11:00:00 times in the dataframe.
Is using the .loc the best way to do this?
Thank you.
Upvotes: 3
Views: 7825
Reputation: 6159
If you have multiple dates
you can get it by converting
to DateTime
and filter
df["date"]=pd.to_datetime(df["date"])
df[df["date"].between('2018-01-01 09:00:00','2018-01-01 11:00:00')]
Upvotes: 5
Reputation: 1531
df.loc is used for label based rows filtering. when your dataframe have labels you can use for filtering or by default it starts 0 to n.
dt_range = pd.date_range('2018-01-01 09:00:00', periods=3, freq='1H')
stockprices['dt'] = pd.to_datetime(stockprices['date'])
result = stockprices[stockprices['dt'].apply(lambda x: dt_range[0] <= x and dt_range[len(dt_range)-1] >= x)]
print("result df" , result )
Upvotes: 0
Reputation: 11192
try this,
df['date']=pd.to_datetime(df['date'])
df[df['date'].dt.hour.between(9,11)]
Output:
date msft nok aapl ibm amzn
1 2018-01-01 09:00:00 112 1 143 130 1298
2 2018-01-01 10:00:00 109 10 185 137 1647
3 2018-01-01 11:00:00 98 11 146 105 1331
use between
and hour
to slice your desired dataframe.
Upvotes: 1
Reputation: 51165
pd.date_range
i = pd.date_range('2018-01-01 09:00:00', periods=3, freq='1H')
# Result
DatetimeIndex(['2018-01-01 09:00:00', '2018-01-01 10:00:00',
'2018-01-01 11:00:00'],
dtype='datetime64[ns]', freq='H')
to_datetime
df['date'] = pd.to_datetime(df.date)
loc
df.loc[(df.date >= i[0]) & (df.date <= i[-1])]
date msft nok aapl ibm amzn
1 2018-01-01 09:00:00 112 1 143 130 1298
2 2018-01-01 10:00:00 109 10 185 137 1647
3 2018-01-01 11:00:00 98 11 146 105 1331
Upvotes: 1