Alicia_2024

Reputation: 531

How to calculate harmonic mean in pandas

I have a data frame that looks like the one below. The words column is the number of words in each email sent.

sender receiver words 
a        b       10
a        c       5
a        c       15
b        a       50
b        a       30

I'm relatively new to Pandas. For each sender/receiver pair, I'd like to calculate the harmonic mean of 1) the number of emails sent between the pair and 2) the number of words sent between the two people. How do I use hmean() from scipy.stats to obtain the desired output?

sender  receiver  total_emails  total_words
   a        b                   hmean([10])
   a        c                   hmean([5,15])
   b        a                   hmean([50,30])

For the total number of emails, I am not sure what the correct formula should be. Any help would be appreciated!

Upvotes: 4

Views: 1358

Answers (1)

Nk03

Reputation: 14949

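You can use groupby, select the words column, and aggregate it with hmean:

from scipy import stats

# Select 'words' so the result is a Series; reset_index(name=...) only works on a Series.
df = df.groupby(['sender', 'receiver'])['words'].agg(stats.hmean).reset_index(name='total_words')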

OUTPUT:

  sender receiver  total_words
0      a        b         10.0
1      a        c          7.5
2      b        a         37.5
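
The question also asks for a total_emails column, and the formula for it is left open. Here is a minimal sketch, assuming total_emails simply means the number of rows (emails) per sender/receiver pair; that reading is an assumption, not something the question confirms. The helper name harmonic_mean is made up for illustration, and the data frame is rebuilt from the sample in the question so the snippet runs on its own.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Sample data copied from the question.
df = pd.DataFrame({
    "sender": ["a", "a", "a", "b", "b"],
    "receiver": ["b", "c", "c", "a", "a"],
    "words": [10, 5, 15, 50, 30],
})


def harmonic_mean(values):
    """Hypothetical helper: harmonic mean of one group's word counts."""
    return stats.hmean(np.asarray(values, dtype=float))


# Named aggregation: one output column per keyword argument.
# Assumption: total_emails is the count of emails in each pair.
result = (
    df.groupby(["sender", "receiver"])
      .agg(
          total_emails=("words", "size"),
          total_words=("words", harmonic_mean),
      )
      .reset_index()
)

print(result)
```

For the sample data this gives 1, 2 and 2 emails for the three pairs, with harmonic means of 10.0, 7.5 and 37.5. Note that hmean raises an error or returns nan for negative values, depending on the scipy version, so this only makes sense for positive word counts.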

Upvotes: 5
