Michael
Michael

Reputation: 308

Create pandas timeseries with column frequencies

I am trying to convert a dataframe I have into a timeseries, which holds the frequency of True values for each column. The dataframe looks something like this:

    CREATED             A       B       C
0   2018-05-15 02:44:30 True    False   False
1   2018-05-15 06:57:18 True    False   False
2   2018-05-21 19:05:14 False   False   True
3   2018-05-21 10:15:07 False   True    False
4   2018-06-15 08:07:05 False   False   True
5   2018-06-15 14:07:05 True    False   True

For each row, only one of the columns is True, all other are False. I don't need the times, just the dates. The column CREATED is of type datetime64[ns]. What I want is to prepare the dataframe to be able to plot a timeseries with lines for each column representing the frequency for each day/week/month. The result I am have in my mind looks like this:

CREATED     A   B   C
2018-05-15  2   0   0
2018-05-21  0   1   1
2018-06-15  1   0   1

I tried to set the CREATED column as the index and resample as days, but I can't figure out how to concatenate everything to get the frequencies.

Upvotes: 0

Views: 29

Answers (1)

Sayandip Dutta
Sayandip Dutta

Reputation: 15872

Considering CREATED is a datetime column, you can use dt, the datetime accessor, to access date like series.dt.date:

>>> df.groupby(df['CREATED'].dt.date).sum().reset_index()

      CREATED  A  B  C
0  2018-05-15  2  0  0
1  2018-05-21  0  1  1
2  2018-06-15  1  0  2

Otherwise, you can first convert CREATED to datetime column:

>>> df['CREATED'] = pd.to_datetime(df['CREATED'])
>>> df['CREATED'].dt.date

0    2018-05-15
1    2018-05-15
2    2018-05-21
3    2018-05-21
4    2018-06-15
5    2018-06-15
Name: CREATED, dtype: object

Upvotes: 3

Related Questions