Reputation: 516
I have a dataframe with MultiIndex columns. I want to filter the data using the columns of the dataset. When I try df.columns
I get this information:
MultiIndex(levels=[['power'], ['active']],
codes=[[0], [0]],
names=['physical_quantity', 'type'])
A short description of the dataset is:
physical_quantity power
type active
2011-04-18 09:22:13-04:00 6.0
2011-04-18 09:22:16-04:00 6.0
2011-04-18 09:22:20-04:00 6.0
2011-04-18 09:22:23-04:00 6.0
2011-04-18 09:22:26-04:00 6.0
The first thing I found is that although you see two columns there, the dataframe really says that it is a dataset of [529757 rows x 1 columns]
.
What I want to do is to filter the data selecting an interval of time, choosing the first column called physical_quantity type.
On the other hand, the data of that first column (physical_quantity type) it is unknown:
physical_quantity type
power active float32
dtype: object
Checking with df.index
I managed to see this information about the dataframe:
DatetimeIndex(['2011-04-18 09:22:13-04:00', '2011-04-18 09:22:16-04:00',
'2011-04-18 09:22:20-04:00', '2011-04-18 09:22:23-04:00',
'2011-04-18 09:22:26-04:00', '2011-04-18 09:22:30-04:00',
'2011-04-18 09:22:33-04:00', '2011-04-18 09:22:37-04:00',
'2011-04-18 09:22:40-04:00', '2011-04-18 09:22:44-04:00',
...
'2011-05-14 23:59:26-04:00', '2011-05-14 23:59:29-04:00',
'2011-05-14 23:59:33-04:00', '2011-05-14 23:59:36-04:00',
'2011-05-14 23:59:40-04:00', '2011-05-14 23:59:43-04:00',
'2011-05-14 23:59:46-04:00', '2011-05-14 23:59:50-04:00',
'2011-05-14 23:59:53-04:00', '2011-05-14 23:59:57-04:00'],
dtype='datetime64[ns, US/Eastern]', length=529757, freq=None)
So I understood that the data of that column is something like dtype='datetime64[ns, US/Eastern]
So I aim to slice the data, from an specific day and hour to another day and hour.
from 2011-05-10 19:44:51-04:00 to 2011-05-10 23:17:59-04:00
I tried doing something like this:
df[df['physical_quantity', 'type']] > 2011-05-10 19:44:51-04:00
&
df[df['physical_quantity', 'type']] < 2011-05-10 23:17:59-04:00
df[df['physical_quantity', 'type']] > 2011-05-10 19:44:51-04:00
File "<ipython-input-133-27848c7d6afc>", line 1
df[df['physical_quantity', 'type']] > 2011-05-10 19:44:51-04:00
^
SyntaxError: invalid token
How can I solve my problem?
Upvotes: 0
Views: 36
Reputation: 356
try this
df['ts'] = pd.to_datetime(df["ts"], unit="ms")```
Upvotes: 1