OfficerTryhard

Reputation: 21

Count values in column and assign to row

I have a dataframe like this:

       dT_sampleTime  steps
    0          0.002  0.001
    1          0.004  0.002
    2          0.004  0.003
    3          0.004  0.004
    4          0.003  0.005
    5          0.007  0.006
    6          0.001  0.007

and I want to count how often each value of steps occurs in the dT_sampleTime column and store the result in a new column, absolute frequency.

       dT_sampleTime  steps  absolute frequency
    0          0.002  0.001                   1
    1          0.004  0.002                   1
    2          0.004  0.003                   1
    3          0.004  0.004                   3
    4          0.003  0.005                   0
    5          0.007  0.006                   0
    6          0.001  0.007                   1
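
For reference, the example frame can be rebuilt like this (a minimal sketch, assuming pandas is imported as pd):

    import pandas as pd

    df = pd.DataFrame({
        'dT_sampleTime': [0.002, 0.004, 0.004, 0.004, 0.003, 0.007, 0.001],
        'steps':         [0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007],
    })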

My idea was something like this:

df['absolute frequency'] = df.groupby(df['steps'],df['dT_sampleTime']).count

Upvotes: 1

Views: 132

Answers (2)

JPI93

Reputation: 1557

  • Loop across df
  • Use the steps value of each row as a filter applied to the dT_sampleTime column
  • The number of rows in the resulting DataFrame is the absolute frequency of the current steps value within the dT_sampleTime column
  • Append this value to the current row under the absolute frequency field

for i, row in df.iterrows():
    # count how many rows of dT_sampleTime equal this row's steps value
    df.loc[i, 'absolute frequency'] = len(df[df['dT_sampleTime'] == row['steps']])

Resulting df based on the example given in your original question:

   dT_sampleTime  steps  absolute frequency
0          0.002  0.001                 1.0
1          0.004  0.002                 1.0
2          0.004  0.003                 1.0
3          0.004  0.004                 3.0
4          0.003  0.005                 0.0
5          0.007  0.006                 0.0
6          0.001  0.007                 1.0
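
The new column comes out as float because the values are written one row at a time into a column that doesn't exist yet, so the remaining rows start as NaN. If integer counts are preferred, one option (a sketch) is to cast afterwards:

df['absolute frequency'] = df['absolute frequency'].astype(int)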

I'm not sure that this is the most efficient way to achieve your ends; however, it works and should be suitable for your purpose. Happy to take feedback from anybody who knows a better approach.

Upvotes: 0

ALollz

Reputation: 59549

map the 'steps' column with the value_counts of the 'dT_sampleTime' column. Then fill the missing values with 0.

df['absolute frequency'] = (df['steps'].map(df['dT_sampleTime'].value_counts())
                                       .fillna(0, downcast='infer'))

#   dT_sampleTime  steps  absolute frequency
#0          0.002  0.001                   1
#1          0.004  0.002                   1
#2          0.004  0.003                   1
#3          0.004  0.004                   3
#4          0.003  0.005                   0
#5          0.007  0.006                   0
#6          0.001  0.007                   1

When mapping with a Series, pandas uses the index to look up the appropriate value. The value_counts Series is

df['dT_sampleTime'].value_counts()
#0.004    3
#0.007    1
#0.001    1
#0.002    1
#0.003    1
#Name: dT_sampleTime, dtype: int64

so 0.004 in the steps column maps to 3, for instance.
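
If the downcast keyword of fillna warns or is unavailable in your pandas version, an equivalent sketch is to fill and then cast explicitly:

df['absolute frequency'] = (df['steps'].map(df['dT_sampleTime'].value_counts())
                                       .fillna(0)
                                       .astype(int))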

Upvotes: 1
