Reputation: 308
I am trying to convert a dataframe I have into a timeseries, which holds the frequency of True
values for each column. The dataframe looks something like this:
CREATED A B C
0 2018-05-15 02:44:30 True False False
1 2018-05-15 06:57:18 True False False
2 2018-05-21 19:05:14 False False True
3 2018-05-21 10:15:07 False True False
4 2018-06-15 08:07:05 False False True
5 2018-06-15 14:07:05 True False True
For each row, only one of the columns is True
, all other are False
. I don't need the times, just the dates. The column CREATED is of type datetime64[ns]
.
What I want is to prepare the dataframe to be able to plot a timeseries with lines for each column representing the frequency for each day/week/month.
The result I am have in my mind looks like this:
CREATED A B C
2018-05-15 2 0 0
2018-05-21 0 1 1
2018-06-15 1 0 1
I tried to set the CREATED column as the index and resample as days, but I can't figure out how to concatenate everything to get the frequencies.
Upvotes: 0
Views: 29
Reputation: 15872
Considering CREATED
is a datetime
column, you can use dt
, the datetime accessor, to access date
like series.dt.date
:
>>> df.groupby(df['CREATED'].dt.date).sum().reset_index()
CREATED A B C
0 2018-05-15 2 0 0
1 2018-05-21 0 1 1
2 2018-06-15 1 0 2
Otherwise, you can first convert CREATED
to datetime column:
>>> df['CREATED'] = pd.to_datetime(df['CREATED'])
>>> df['CREATED'].dt.date
0 2018-05-15
1 2018-05-15
2 2018-05-21
3 2018-05-21
4 2018-06-15
5 2018-06-15
Name: CREATED, dtype: object
Upvotes: 3