Ann Kriebel
Ann Kriebel

Reputation: 109

How to select dataframe rows between two datetimes?

I am having trouble selecting rows from a dataframe with a datetime column from a CSV file. I have my code and my sample data. There are no rows returned.

import pandas as pd
import numpy as np
col_names = ['date','msft','nok','aapl','ibm','amzn']
stockprices = pd.read_csv('./stockdata.csv', skiprows=1, names=col_names)
stockprices.loc['2018-01-01 9:00:00':'2018-01-01 11:00:00']

CSV file data:

        date               msft   nok   aapl   ibm  amzn 
    2018-01-01 08:00:00     107     2   161    140  1295
    2018-01-01 09:00:00     112     1   143    130  1298 
    2018-01-01 10:00:00     109    10   185    137  1647 
    2018-01-01 11:00:00      98    11   146    105  1331
    2018-01-01 12:00:00      83    3    214    131  1355

Basically, trying to select the 3 rows with 9:00:00, 10:00:00 and 11:00:00 times in the dataframe.

Is using the .loc the best way to do this?

Thank you.

Upvotes: 3

Views: 7825

Answers (4)

Pyd
Pyd

Reputation: 6159

If you have multiple dates

you can get it by converting to DateTime and filter

df["date"]=pd.to_datetime(df["date"])
df[df["date"].between('2018-01-01 09:00:00','2018-01-01 11:00:00')]

Upvotes: 5

Narendra Prasath
Narendra Prasath

Reputation: 1531

df.loc is used for label based rows filtering. when your dataframe have labels you can use for filtering or by default it starts 0 to n.

dt_range = pd.date_range('2018-01-01 09:00:00', periods=3, freq='1H')
stockprices['dt'] = pd.to_datetime(stockprices['date'])

result = stockprices[stockprices['dt'].apply(lambda x: dt_range[0] <= x  and  dt_range[len(dt_range)-1] >= x)]
print("result df" , result )

Upvotes: 0

Mohamed Thasin ah
Mohamed Thasin ah

Reputation: 11192

try this,

df['date']=pd.to_datetime(df['date'])
df[df['date'].dt.hour.between(9,11)]

Output:

                 date  msft  nok  aapl  ibm  amzn 
1 2018-01-01 09:00:00   112    1   143  130   1298
2 2018-01-01 10:00:00   109   10   185  137   1647
3 2018-01-01 11:00:00    98   11   146  105   1331

use between and hour to slice your desired dataframe.

Upvotes: 1

user3483203
user3483203

Reputation: 51165

pd.date_range

i = pd.date_range('2018-01-01 09:00:00', periods=3, freq='1H')

# Result
DatetimeIndex(['2018-01-01 09:00:00', '2018-01-01 10:00:00',
           '2018-01-01 11:00:00'],
          dtype='datetime64[ns]', freq='H')

to_datetime

df['date'] = pd.to_datetime(df.date)

loc

df.loc[(df.date >= i[0]) & (df.date <= i[-1])]

                 date  msft  nok  aapl  ibm  amzn
1 2018-01-01 09:00:00   112    1   143  130  1298
2 2018-01-01 10:00:00   109   10   185  137  1647
3 2018-01-01 11:00:00    98   11   146  105  1331

Upvotes: 1

Related Questions