Reputation: 21
I have a dataframe like this:
   dT_sampleTime  steps
0          0.002  0.001
1          0.004  0.002
2          0.004  0.003
3          0.004  0.004
4          0.003  0.005
5          0.007  0.006
6          0.001  0.007
and I want to count how often each value of steps occurs in the dT_sampleTime column and write that count to a new column, absolute frequency:
   dT_sampleTime  steps  absolute frequency
0          0.002  0.001                   1
1          0.004  0.002                   1
2          0.004  0.003                   1
3          0.004  0.004                   3
4          0.003  0.005                   0
5          0.007  0.006                   0
6          0.001  0.007                   1
My idea was something like this:
df['absolute frequency'] = df.groupby(df['steps'],df['dT_sampleTime']).count
Upvotes: 1
Views: 132
Reputation: 1557
This iterates over the rows of df, using the steps value of each row as a filter applied to the dT_sampleTime column: the number of times that steps value occurs within the dT_sampleTime column is written to the absolute frequency field.

for i, row in df.iterrows():
    # count rows whose dT_sampleTime equals this row's steps value
    df.loc[i, 'absolute frequency'] = len(df[df['dT_sampleTime'] == row['steps']])
Resulting df based on the example given in your original question:
   dT_sampleTime  steps  absolute frequency
0          0.002  0.001                 1.0
1          0.004  0.002                 1.0
2          0.004  0.003                 1.0
3          0.004  0.004                 3.0
4          0.003  0.005                 0.0
5          0.007  0.006                 0.0
6          0.001  0.007                 1.0
I'm not sure that this is the most efficient way to achieve your ends, but it does work quite well and should be suitable for your purpose. Happy to take feedback on this from anybody who knows better and would be so kind.
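One note on the 1.0/0.0 values above: assigning into a brand-new column row by row makes pandas store it as float. A minimal follow-up sketch, assuming the loop above has already run:

# The column was created with NaN placeholders during the loop, so its
# dtype is float; once every row has been filled it can be cast to int.
df['absolute frequency'] = df['absolute frequency'].astype(int)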
Upvotes: 0
Reputation: 59549
map the 'steps' column with the value_counts of the 'dT_sampleTime' column, then fill the missing values with 0:
df['absolute frequency'] = (df['steps'].map(df['dT_sampleTime'].value_counts())
                                       .fillna(0, downcast='infer'))
# dT_sampleTime steps absolute frequency
#0 0.002 0.001 1
#1 0.004 0.002 1
#2 0.004 0.003 1
#3 0.004 0.004 3
#4 0.003 0.005 0
#5 0.007 0.006 0
#6 0.001 0.007 1
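For reference, a self-contained sketch of this approach that rebuilds the example frame from the question; on recent pandas versions, where the downcast argument of fillna is deprecated, ending with .fillna(0).astype(int) is equivalent:

import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    'dT_sampleTime': [0.002, 0.004, 0.004, 0.004, 0.003, 0.007, 0.001],
    'steps':         [0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007],
})

# Tally dT_sampleTime, look each steps value up in that tally, and
# replace the NaN produced by unmatched steps values with 0.
df['absolute frequency'] = (df['steps'].map(df['dT_sampleTime'].value_counts())
                                       .fillna(0, downcast='infer'))
print(df)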
When mapping with a Series, pandas uses the index of that Series to look up the appropriate value. The value_counts Series is
df['dT_sampleTime'].value_counts()
#0.004 3
#0.007 1
#0.001 1
#0.002 1
#0.003 1
#Name: dT_sampleTime, dtype: int64
so 0.004 in the steps column goes to 3, for instance.
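A minimal sketch of that index lookup in isolation; the two Series below are invented for illustration and are not part of the question's data:

import pandas as pd

counts = pd.Series({0.004: 3, 0.001: 1})  # index -> count, shaped like value_counts() output
s = pd.Series([0.004, 0.005])

print(s.map(counts))
#0    3.0
#1    NaN   <- 0.005 is absent from the index; this is what fillna(0) handles
#dtype: float64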
Upvotes: 1