Aggregate time series with group by and create chart with multiple series

Question

I have time series data and want to create a line chart of the monthly record counts (months on the x-axis), with one line per sentiment value.

The data looks like this:

                       created_at                   id  polarity  sentiment
0  Fri Nov 02 11:22:47 +0000 2018  1058318498663870464  0.000000    neutral
1  Fri Nov 02 11:20:54 +0000 2018  1058318026758598656  0.011905    neutral
2  Fri Nov 02 09:41:37 +0000 2018  1058293038739607552  0.800000   positive
3  Fri Nov 02 09:40:48 +0000 2018  1058292834699231233  0.800000   positive
4  Thu Nov 01 18:23:17 +0000 2018  1058061933243518976  0.233333    neutral
5  Thu Nov 01 17:50:39 +0000 2018  1058053723157618690  0.400000   positive
6  Wed Oct 31 18:57:53 +0000 2018  1057708251758903296  0.566667   positive
7  Sun Oct 28 17:21:24 +0000 2018  1056596810570100736  0.000000    neutral
8  Sun Oct 21 13:00:53 +0000 2018  1053994531845296128  0.136364    neutral
9  Sun Oct 21 12:55:12 +0000 2018  1053993101205868544  0.083333    neutral

So far I have managed to aggregate to the monthly totals, with the following code:

import pandas as pd

tweets = process_twitter_json(file_name) 
#print(tweets[:10])

df = pd.DataFrame.from_records(tweets)
print(df.head(10))

#make the string date into a date field    
df['tweet_datetime'] = pd.to_datetime(df['created_at'])
df.index = df['tweet_datetime']

#print('Monthly counts')
monthly_sentiment = df.groupby('sentiment')['tweet_datetime'].resample('M').count()

I'm struggling with how to chart the data.

Do I need to pivot the data so that each discrete value in the sentiment field becomes its own column?
I've tried .unstack(), which is almost there, but it leaves the sentiment values as rows and turns the dates into string column headers, which is no good for charting.
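
For reference, here is a minimal sketch of one way the pivot could be done. It is not a definitive answer: it rebuilds a small frame from the ten sample rows above, groups by a monthly period instead of calling resample('M'), and unstacks the sentiment level so that the months stay in the index and each sentiment becomes a column. The names month, monthly and plot_monthly_counts are made up for this illustration, and the plotting helper assumes matplotlib is installed.

```python
import pandas as pd

# Sample rows copied from the question (only the columns needed here).
rows = [
    ("Fri Nov 02 11:22:47 +0000 2018", "neutral"),
    ("Fri Nov 02 11:20:54 +0000 2018", "neutral"),
    ("Fri Nov 02 09:41:37 +0000 2018", "positive"),
    ("Fri Nov 02 09:40:48 +0000 2018", "positive"),
    ("Thu Nov 01 18:23:17 +0000 2018", "neutral"),
    ("Thu Nov 01 17:50:39 +0000 2018", "positive"),
    ("Wed Oct 31 18:57:53 +0000 2018", "positive"),
    ("Sun Oct 28 17:21:24 +0000 2018", "neutral"),
    ("Sun Oct 21 13:00:53 +0000 2018", "neutral"),
    ("Sun Oct 21 12:55:12 +0000 2018", "neutral"),
]
df = pd.DataFrame(rows, columns=["created_at", "sentiment"])

# Parse the Twitter-style timestamp as UTC, then drop the timezone so
# that to_period() does not warn about discarding it.
df["tweet_datetime"] = pd.to_datetime(
    df["created_at"], format="%a %b %d %H:%M:%S %z %Y", utc=True
).dt.tz_localize(None)

# Hypothetical helper column: the calendar month of each tweet.
df["month"] = df["tweet_datetime"].dt.to_period("M")

# Count rows per (month, sentiment), then move the sentiment level into
# the columns. Months remain the row index; missing pairs become 0.
monthly = (
    df.groupby(["month", "sentiment"])
      .size()
      .unstack("sentiment", fill_value=0)
)


def plot_monthly_counts(counts):
    """Draw one line per sentiment column (assumes matplotlib is available)."""
    import matplotlib.pyplot as plt

    plot_data = counts.copy()
    # Convert the monthly periods to timestamps for a plain datetime x-axis.
    plot_data.index = plot_data.index.to_timestamp()
    ax = plot_data.plot(marker="o")
    ax.set_xlabel("Month")
    ax.set_ylabel("Number of tweets")
    plt.show()
```

Calling plot_monthly_counts(monthly) should then draw one line per sentiment. With the monthly_sentiment series from the code in the question, the equivalent change would be monthly_sentiment.unstack(level='sentiment'): a bare .unstack() pivots the innermost index level, which in that series is the date, and that is why the dates ended up as column headers.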
