La_haine

Reputation: 349

Pandas timestamp

I'd like to group my data per day and calculate the daily mean of the sentiment.

I have a problem with the pandas DataFrame: I am not able to convert my date column to timestamps so that I can use the groupby() function. Here is a sample of my data:

   sentiment                       date
0          1  2018-01-01 07:37:07+00:00
1          0  2018-02-12 06:57:27+00:00
2         -1  2018-09-18 06:23:07+00:00
3          1  2018-09-18 07:23:10+00:00
4          0  2018-02-12 06:21:08+00:00

Upvotes: 2

Views: 45

Answers (1)

jezrael

Reputation: 863731

I think you need resample. It creates a full DatetimeIndex, with one row for every calendar day between the first and last date:

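import pandas as pd

df['date'] = pd.to_datetime(df['date'])

# daily mean; days without any rows show up as NaN
df1 = df.resample('D', on='date')['sentiment'].mean()
# if you want to remove the NaN rows (days without data)
df1 = df.resample('D', on='date')['sentiment'].mean().dropna()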
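
To see what resample does with the gaps, here is a small self-contained sketch built from the five rows in the question. The names daily and daily_nonempty are only illustrative, and the DataFrame is typed in by hand, so treat it as an approximation of the real data:

```python
import pandas as pd

# Sample rows copied by hand from the question
df = pd.DataFrame({
    "sentiment": [1, 0, -1, 1, 0],
    "date": [
        "2018-01-01 07:37:07+00:00",
        "2018-02-12 06:57:27+00:00",
        "2018-09-18 06:23:07+00:00",
        "2018-09-18 07:23:10+00:00",
        "2018-02-12 06:21:08+00:00",
    ],
})
# Parse the strings into timezone-aware timestamps
df["date"] = pd.to_datetime(df["date"])

# One row per calendar day from the first to the last date;
# days with no observations get NaN
daily = df.resample("D", on="date")["sentiment"].mean()

# Keep only the days that actually have data
daily_nonempty = daily.dropna()

print(len(daily), len(daily_nonempty))
print(daily_nonempty)
```

With this sample the full result spans every day from 2018-01-01 to 2018-09-18, while only three days survive dropna().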

Or use groupby and aggregate with mean, grouping either by the dates or by the timestamps floored to the day, which removes the times:

# index of plain Python date objects in output
df2 = df.groupby(df['date'].dt.date)['sentiment'].mean()
# DatetimeIndex in output
df2 = df.groupby(df['date'].dt.floor('D'))['sentiment'].mean()
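
The two groupby variants give the same means and differ only in the index they return. A minimal sketch on the rows from the question (by_date and by_floor are illustrative names, and the data is typed in by hand):

```python
import pandas as pd

# Sample rows copied by hand from the question
df = pd.DataFrame({
    "sentiment": [1, 0, -1, 1, 0],
    "date": [
        "2018-01-01 07:37:07+00:00",
        "2018-02-12 06:57:27+00:00",
        "2018-09-18 06:23:07+00:00",
        "2018-09-18 07:23:10+00:00",
        "2018-02-12 06:21:08+00:00",
    ],
})
df["date"] = pd.to_datetime(df["date"])

# Group by datetime.date objects: the result has a generic object index
by_date = df.groupby(df["date"].dt.date)["sentiment"].mean()

# Group by timestamps floored to midnight: the result keeps a DatetimeIndex
by_floor = df.groupby(df["date"].dt.floor("D"))["sentiment"].mean()

print(type(by_date.index).__name__, by_date.tolist())
print(type(by_floor.index).__name__, by_floor.tolist())
```

Unlike resample, neither variant adds rows for days without data. The floor version is probably more convenient if you want to plot or resample the result later, because the index stays datetime-like.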

Upvotes: 1
