user3668129
Reputation: 4810

How to get the number of elements per hour

I'm running Python 3.7.6.

I have a table as follows:

     f_value    desc    time
1    f1         null    1583591256000
2    f2         null    1583591356000
3    f1         TBD1    1583591256000
4    f3         TBD2    1583591259000
5    f4         TBD4    1583591256000
6    f1         TBD5    1583592256000
7    f1         null    1583593256000
8    f1         TBD5    1583592256000
9    f2         TBD5    1583592252000
10   f1         TBD5    1583592956000

The time column holds timestamps in milliseconds since the Unix epoch (1970-01-01).

I want to group by f_value and get the average number of rows per day.

How can I extract the day from the time field and compute, for each f_value group, the average number of rows per day?

I want to see the result in a new column holding the average count per day.

Upvotes: 0

Views: 56

Answers (2)

jezrael
Reputation: 862511

If you want the average datetime per f_value group in a new column, use GroupBy.transform with mean and then convert the result to datetimes:

# average the millisecond timestamps within each f_value group, then convert to datetimes
df['new'] = pd.to_datetime(df.groupby('f_value')['time'].transform('mean'), unit='ms')
print(df)
   f_value  desc           time                 new
1       f1   NaN  1583591256000 2020-03-07 14:43:26
2       f2   NaN  1583591356000 2020-03-07 14:36:44
3       f1  TBD1  1583591256000 2020-03-07 14:43:26
4       f3  TBD2  1583591259000 2020-03-07 14:27:39
5       f4  TBD4  1583591256000 2020-03-07 14:27:36
6       f1  TBD5  1583592256000 2020-03-07 14:43:26
7       f1   NaN  1583593256000 2020-03-07 14:43:26
8       f1  TBD5  1583592256000 2020-03-07 14:43:26
9       f2  TBD5  1583592252000 2020-03-07 14:36:44
10      f1  TBD5  1583592956000 2020-03-07 14:43:26

If you need the number of rows per f_value and day in a new column, use GroupBy.transform with 'size', grouping by f_value together with the dates from Series.dt.date:

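# calendar date of each row
g = pd.to_datetime(df['time'], unit='ms').dt.date
# number of rows in each (f_value, date) group, broadcast back to every row
df['new'] = df.groupby(['f_value', g])['time'].transform('size')
print(df)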
   f_value  desc           time  new
1       f1   NaN  1583591256000    6
2       f2   NaN  1583591356000    2
3       f1  TBD1  1583591256000    6
4       f3  TBD2  1583591259000    1
5       f4  TBD4  1583591256000    1
6       f1  TBD5  1583592256000    6
7       f1   NaN  1583593256000    6
8       f1  TBD5  1583592256000    6
9       f2  TBD5  1583592252000    2
10      f1  TBD5  1583592956000    6
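
The counts above are per f_value and day. To get the average count per day that the question asks for, one option is to average those daily counts within each f_value and map the result back onto the rows. The following is only a sketch: it rebuilds the sample table without the desc column, and the names day, per_day, avg_per_day and the column name avg_per_day are illustrative, not part of the answer above. All sample rows fall on 2020-03-07, so here the average equals the single day's count.

```python
import pandas as pd

# Sample data from the question (desc column omitted for brevity)
df = pd.DataFrame({
    "f_value": ["f1", "f2", "f1", "f3", "f4", "f1", "f1", "f1", "f2", "f1"],
    "time": [1583591256000, 1583591356000, 1583591256000, 1583591259000,
             1583591256000, 1583592256000, 1583593256000, 1583592256000,
             1583592252000, 1583592956000],
})

# Truncate each millisecond timestamp to midnight of its day
day = pd.to_datetime(df["time"], unit="ms").dt.floor("D")

# Number of rows for every (f_value, day) pair that occurs in the data
per_day = df.groupby(["f_value", day]).size()

# Average of those daily counts for each f_value
avg_per_day = per_day.groupby(level="f_value").mean()

# Broadcast the per-group average back to every row as a new column
df["avg_per_day"] = df["f_value"].map(avg_per_day)
print(df)
```

Days on which an f_value has no rows never show up as a group, so they are not part of this average.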

Upvotes: 2

Quang Hoang
Reputation: 150735

IIUC:

# extract the dates
dates = pd.to_datetime(df['time'], unit='ms').dt.floor('D')

(df.groupby(dates)               # groupby dates
   .f_value.value_counts()       # count the f_value by date and type
   .groupby('f_value').mean()    # average counts across dates
)

Output:

f_value
f1    6.0
f2    2.0
f3    1.0
f4    1.0
Name: f_value, dtype: float64
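
Note that an average taken this way only covers the days on which an f_value actually occurs. If days without any rows for an f_value should count as zero, one possible variation is to pivot the daily counts into a day by f_value table first. The sketch below uses small hypothetical two-day data (not the question's table) so the difference is visible; daily, avg_all_days and avg_seen_days are illustrative names.

```python
import pandas as pd

# Hypothetical data: f1 appears twice on 2020-03-07 and not at all on 2020-03-08,
# f2 appears once on each of the two days
df = pd.DataFrame({
    "f_value": ["f1", "f1", "f2", "f2"],
    "time": [1583591256000, 1583592256000, 1583591356000, 1583677656000],
})
dates = pd.to_datetime(df["time"], unit="ms").dt.floor("D")

# One row per day, one column per f_value; missing combinations become 0
daily = df.groupby([dates, "f_value"]).size().unstack(fill_value=0)

avg_all_days = daily.mean()                    # days without rows count as 0
avg_seen_days = daily.where(daily > 0).mean()  # days without rows are skipped

print(avg_all_days)
print(avg_seen_days)
```

Calendar days with no rows at all are still absent from daily; including them would need something like reindexing daily over the full date range.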

Upvotes: 0
