Reputation: 129
I have a very large dataset: https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption
It contains around 2.5M rows. The Pandas dataframe index is a timestamp and then it has several columns.
I want to filter the dataset so that I see only the rows for a given time of day, for instance 9 AM (09:00:00), across all years (around 1400 rows, approximately 365*4).
I have tried this:
dataset.groupby(dataset.index.hour == '09:00:00')
But it doesn't work. I have also tried this, without success:
dataset['09:00:00']
Thanks
Upvotes: 3
Views: 5320
Reputation: 11105
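Your two attempts are close! The index's hour attribute holds integers (0 to 23), so compare it with 9 rather than with the string '09:00:00', and use the result as a boolean mask to select the desired rows: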
dataset[dataset.index.hour == 9]
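One thing to watch for: the hour mask keeps every row whose hour is 9, so if the data is sampled more often than once per hour (one reading per minute is assumed here), it returns the whole 09:00 to 09:59 block rather than one row per day. Below is a minimal sketch on a made-up two-day frame; the "power" column and the dates are placeholders, not the real dataset. It compares the mask with pandas' at_time, which matches an exact time of day.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real dataset: two days of one-minute readings.
index = pd.date_range("2007-01-01", periods=2 * 24 * 60, freq="min")
dataset = pd.DataFrame({"power": np.arange(len(index), dtype=float)}, index=index)

# Boolean mask on the hour: keeps every row from 09:00:00 through 09:59:00.
nine_oclock_hour = dataset[dataset.index.hour == 9]

# at_time keeps only the rows whose time of day is exactly 09:00:00.
exactly_nine = dataset.at_time("09:00")

print(len(nine_oclock_hour), len(exactly_nine))
```

If one row per day is what you are after, at_time is probably the closer fit; the hour mask is handy when you want the full hour.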
Upvotes: 4