Reputation: 4810
I'm running Python 3.7.6.
I have a table as follows:
f_value desc time
1 f1 null 1583591256000
2 f2 null 1583591356000
3 f1 TBD1 1583591256000
4 f3 TBD2 1583591259000
5 f4 TBD4 1583591256000
6 f1 TBD5 1583592256000
7 f1 null 1583593256000
8 f1 TBD5 1583592256000
9 f2 TBD5 1583592252000
10 f1 TBD5 1583592956000
The time column holds the time in milliseconds since the Unix epoch (1970-01-01).
I want to group by f_value and get the average number of rows per day.
How can I extract the day from the time field and compute, for each f_value group, the average number of rows per day?
I want the result in a new column holding the average count per day.
Upvotes: 0
Views: 56
Reputation: 862511
If you want the average datetime per f_value as a new column, use GroupBy.transform with mean and then convert the result to datetimes:
df['new'] = pd.to_datetime(df.groupby('f_value')['time'].transform('mean'), unit='ms')
print (df)
f_value desc time new
1 f1 NaN 1583591256000 2020-03-07 14:43:26
2 f2 NaN 1583591356000 2020-03-07 14:36:44
3 f1 TBD1 1583591256000 2020-03-07 14:43:26
4 f3 TBD2 1583591259000 2020-03-07 14:27:39
5 f4 TBD4 1583591256000 2020-03-07 14:27:36
6 f1 TBD5 1583592256000 2020-03-07 14:43:26
7 f1 NaN 1583593256000 2020-03-07 14:43:26
8 f1 TBD5 1583592256000 2020-03-07 14:43:26
9 f2 TBD5 1583592252000 2020-03-07 14:36:44
10 f1 TBD5 1583592956000 2020-03-07 14:43:26
If you need the number of rows per day in a new column, group by f_value together with the dates from Series.dt.date and use GroupBy.transform with size:
g = pd.to_datetime(df['time'], unit='ms').dt.date
df['new'] = df.groupby(['f_value', g])['time'].transform('size')
print (df)
f_value desc time new
1 f1 NaN 1583591256000 6
2 f2 NaN 1583591356000 2
3 f1 TBD1 1583591256000 6
4 f3 TBD2 1583591259000 1
5 f4 TBD4 1583591256000 1
6 f1 TBD5 1583592256000 6
7 f1 NaN 1583593256000 6
8 f1 TBD5 1583592256000 6
9 f2 TBD5 1583592252000 2
10 f1 TBD5 1583592956000 6
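To go from these per-day counts to the average per day that the question asks for, one option is to average the counts across days for each f_value and map the result back onto the rows. This is only a sketch: the column name avg_per_day is made up for illustration, the frame is rebuilt from the question's sample data (without the desc column), and days on which an f_value has no rows are not counted as zero.

```python
import pandas as pd

# Sample data from the question (desc column omitted for brevity)
df = pd.DataFrame(
    {
        "f_value": ["f1", "f2", "f1", "f3", "f4", "f1", "f1", "f1", "f2", "f1"],
        "time": [
            1583591256000, 1583591356000, 1583591256000, 1583591259000,
            1583591256000, 1583592256000, 1583593256000, 1583592256000,
            1583592252000, 1583592956000,
        ],
    },
    index=range(1, 11),
)

# Calendar day of each row, derived from the millisecond timestamps
day = pd.to_datetime(df["time"], unit="ms").dt.date

# Number of rows per (f_value, day) ...
per_day = df.groupby(["f_value", day]).size()
# ... averaged over the days on which each f_value occurs
avg_per_day = per_day.groupby(level="f_value").mean()

# Broadcast the per-group average back to every row (hypothetical column name)
df["avg_per_day"] = df["f_value"].map(avg_per_day)
print(df)
```

All ten sample rows fall on the same day (2020-03-07), so here the average equals the per-day count for every group.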
Upvotes: 2
Reputation: 150735
IIUC:
# extract the dates
dates = pd.to_datetime(df['time'], unit='ms').dt.floor('D')
(df.groupby(dates) # groupby dates
.f_value.value_counts() # count the f_value by date and type
.groupby('f_value').mean() # average counts across dates
)
Output:
f_value
f1 6.0
f2 2.0
f3 1.0
f4 1.0
Name: f_value, dtype: float64
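Note that averaging the value_counts result only covers days on which an f_value actually appears. If days without any rows for a group should count as zero, one possible variation is to unstack the counts with fill_value=0 before averaging. The sketch below uses a small made-up frame (not the question's data) spanning two days, and avg_incl_zero is an illustrative name; days that are missing from the data entirely are still not counted.

```python
import pandas as pd

DAY_MS = 24 * 60 * 60 * 1000
t0 = 1583591256000  # first timestamp from the question, used as a base

# Hypothetical rows: f1 appears on both days, f2 only on the first day
df = pd.DataFrame(
    {
        "f_value": ["f1", "f1", "f1", "f2"],
        "time": [t0, t0 + 1000, t0 + DAY_MS, t0],
    }
)

# Truncate each timestamp to its day
dates = pd.to_datetime(df["time"], unit="ms").dt.floor("D")

# Rows per (date, f_value); unstack moves f_value into columns and
# fills (date, f_value) pairs with no rows with 0
counts = df.groupby([dates, "f_value"]).size().unstack(fill_value=0)

# Column-wise mean = average rows per day, with empty days counted as 0
avg_incl_zero = counts.mean()
print(avg_incl_zero)
```

With this data f2 averages 0.5 rows per day, whereas averaging only over the days on which f2 occurs would give 1.0.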
Upvotes: 0