OfficerTryhard

Reputation: 21

Count values in column and assign to row

I have a dataframe like this:

       dT_sampleTime  steps
    0          0.002  0.001
    1          0.004  0.002
    2          0.004  0.003
    3          0.004  0.004
    4          0.003  0.005
    5          0.007  0.006
    6          0.001  0.007

and I want to count how often each value of steps occurs in the dT_sampleTime column and store the result in a new column, absolute frequency.

       dT_sampleTime  steps  absolute frequency
    0          0.002  0.001                   1
    1          0.004  0.002                   1
    2          0.004  0.003                   1
    3          0.004  0.004                   3
    4          0.003  0.005                   0
    5          0.007  0.006                   0
    6          0.001  0.007                   1
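
For reference, the example frame can be rebuilt like this (a minimal sketch, assuming pandas is imported as pd):

    import pandas as pd

    df = pd.DataFrame({
        'dT_sampleTime': [0.002, 0.004, 0.004, 0.004, 0.003, 0.007, 0.001],
        'steps':         [0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007],
    })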

My idea was something like this:

df['absolute frequency'] = df.groupby(df['steps'],df['dT_sampleTime']).count

Upvotes: 1

Views: 132

Answers (2)

JPI93

Reputation: 1557

  • Loop across df
  • Use the steps value of each row as a filter applied to the dT_sampleTime column
  • The number of rows in the resulting DataFrame is the absolute frequency of the current steps value within the dT_sampleTime column
  • Append this value to the current row under the absolute frequency field

for i, row in df.iterrows():
    # count how many rows of dT_sampleTime equal this row's steps value
    df.loc[i, 'absolute frequency'] = len(df[df['dT_sampleTime'] == row['steps']])

Resulting df based on the example given in your original question:

   dT_sampleTime  steps  absolute frequency
0          0.002  0.001                 1.0
1          0.004  0.002                 1.0
2          0.004  0.003                 1.0
3          0.004  0.004                 3.0
4          0.003  0.005                 0.0
5          0.007  0.006                 0.0
6          0.001  0.007                 1.0
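
The new column comes out as float because the values are written one row at a time into a column that doesn't exist yet, so the remaining rows start as NaN. If integer counts are preferred, one option (a sketch) is to cast afterwards:

df['absolute frequency'] = df['absolute frequency'].astype(int)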

I'm not sure that this is the most efficient way to achieve your ends; however, it works and should be suitable for your purpose. Happy to take feedback from anybody who knows a better approach.

Upvotes: 0

ALollz

Reputation: 59549

map the 'steps' column with the value_counts of the 'dT_sampleTime' column. Then fill the missing values with 0.

df['absolute frequency'] = (df['steps'].map(df['dT_sampleTime'].value_counts())
                                       .fillna(0, downcast='infer'))

#   dT_sampleTime  steps  absolute frequency
#0          0.002  0.001                   1
#1          0.004  0.002                   1
#2          0.004  0.003                   1
#3          0.004  0.004                   3
#4          0.003  0.005                   0
#5          0.007  0.006                   0
#6          0.001  0.007                   1

When mapping with a Series, pandas uses the index to look up the appropriate value. The value_counts Series is

df['dT_sampleTime'].value_counts()
#0.004    3
#0.007    1
#0.001    1
#0.002    1
#0.003    1
#Name: dT_sampleTime, dtype: int64

so 0.004 in the steps column maps to 3, for instance.
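
If the downcast keyword of fillna warns or is unavailable in your pandas version, an equivalent sketch is to fill and then cast explicitly:

df['absolute frequency'] = (df['steps'].map(df['dT_sampleTime'].value_counts())
                                       .fillna(0)
                                       .astype(int))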

Upvotes: 1
