Student
Student

Reputation: 137

Combining datetime column in to a mean count by date column

I have a dataset that is formatted in the below manner, and I'm trying to reformat it in a way that instead of DateTime column, id get the mean order count for each day (Monday,Tuesday, etc.)

TIMESTAMP TEMPERATURE WINDSPEED
2020-08-01 13.2 4.9
2020-08-01 15 5
2020-08-02 16 2.4
2020-08-02 14.2 6.3
2020-09-10 17.5 2
2020-09-10 9 8.3

Heres my code so far, everything seems to work fine and I can print the mean order count by day, by itself, but when trying to import it into the data set the ordercount is Nan

df = pd.read_csv('orders_autumn_2020.csv')

df['TIMESTAMP']= pd.to_datetime(df['TIMESTAMP'])

df_mod = df.groupby(df['TIMESTAMP'].dt.weekday).mean()
datecount = df.resample('D', on='TIMESTAMP').count()
ORDCOUNT = (datecount['WINDSPEED'])
df_mod["ORDCOUNT"] = ORDCOUNT
df_mod = df_mod[["TEMPERATURE","WIND_SPEED","ORDCOUNT"]]

print(df_mod)
TIMESTAMP TEMPERATURE WINDSPEED ORDCOUNT
0 17.055038 4.027295 NaN
1 15.961699 2.951472 NaN
2 16.305026 3.600513 NaN
3 16.142084 4.051359 NaN
4 16.864189 3.131984 NaN
5 17.364454 4.230898 NaN
6 18.321807 4.310171 NaN

Upvotes: 1

Views: 66

Answers (1)

jezrael
jezrael

Reputation: 862731

In your solution aggregate by 2 different values - by weekday and by D for days, so indices are different so if assign column get NaNs.

Possible solution if need count per weekday with omit NaNs if exist in WINDSPEED column with GroupBy.count:

df1 = (df.groupby(df['TIMESTAMP'].dt.weekday)
         .agg(TEMPERATURE = ('TEMPERATURE','mean'),
              WINDSPEED = ('WINDSPEED','mean'),
              ORDCOUNT = ('WINDSPEED','count')))

Or if need count per days use Grouper:

df2 = (df.groupby(pd.Grouper(freq='d', key='TIMESTAMP'))
         .agg(TEMPERATURE = ('TEMPERATURE','mean'),
              WINDSPEED = ('WINDSPEED','mean'),
              ORDCOUNT = ('WINDSPEED','count')))

Your solution:

df_mod = df.groupby(df['TIMESTAMP'].dt.weekday).mean()
datecount = df.resample('D', on='TIMESTAMP')['WINDSPEED'].count()

df_mod["ORDCOUNT"] = datecount.groupby(datecount.index.weekday).sum()

Upvotes: 2

Related Questions