Neil
Neil

Reputation: 8247

subsetting pandas dataframe on specific date value

I have a pandas dataframe like this

order_id buyer_id item_id                time        
537       79      93 2016-01-04 10:20:00    
540      191      93 2016-01-04 10:30:00    
556      251      82 2016-01-04 13:39:00  
589      191     104 2016-01-05 10:59:00   
596      251      99 2016-01-05 13:48:00    
609       79     106 2016-01-06 10:39:00    
611      261      97 2016-01-06 10:50:00   
680       64     135 2016-01-11 11:58:00  
681      261     133 2016-01-11 12:03:00    
682      309     135 2016-01-11 12:08:00 

I want to subset this dataframe on date == '2016-01-04.Datatypes of df dataframe are

df.dtypes Out[1264]: order_id object buyer_id object item_id object time datetime64[ns]

This is what I am doing in python

df[df['time'] == '2016-01-04']

But it returns me an empty dataframe. But,when I do df[df['time'] < '2016-01-05'] it works. Please help

Upvotes: 1

Views: 392

Answers (2)

EdChum
EdChum

Reputation: 394389

The problem here is that the comparison is being performed for an exact match, as none of the times are '00:00:00' then no matches occur, you'd have to compare just the date components in order for this to work:

In [20]:
df[df['time'].dt.date == pd.to_datetime('2016-01-04').date()]

Out[20]:
   order_id  buyer_id  item_id                time
0       537        79       93 2016-01-04 10:20:00
1       540       191       93 2016-01-04 10:30:00
2       556       251       82 2016-01-04 13:39:00

Upvotes: 1

jezrael
jezrael

Reputation: 863501

IIUC you can use DatetimeIndex Partial String Indexing:

print df
   order_id  buyer_id  item_id                time
0       537        79       93 2016-01-04 10:20:00
1       540       191       93 2016-01-04 10:30:00
2       556       251       82 2016-01-04 13:39:00
3       589       191      104 2016-01-05 10:59:00
4       596       251       99 2016-01-05 13:48:00
5       609        79      106 2016-01-06 10:39:00
6       611       261       97 2016-01-06 10:50:00
7       680        64      135 2016-01-11 11:58:00
8       681       261      133 2016-01-11 12:03:00
9       682       309      135 2016-01-11 12:08:00

df = df.set_index('time')
print df['2016-01-04']
                     order_id  buyer_id  item_id
time                                            
2016-01-04 10:20:00       537        79       93
2016-01-04 10:30:00       540       191       93
2016-01-04 13:39:00       556       251       82

Upvotes: 1

Related Questions