Calculating percentiles using pandas

Question

I have one dataframe I am looping through, grabbing information from it and then using that information to find some metrics. I have something like

dataframe 1:

|   student 1     |   student 2    |
|   kate          |   john         |
|   david         |   kelly        |

dataframe 2:

|   student       |       A      |       B      |
|   kate          |       17     |       8      |
|   david         |       20     |       15     |
|   john          |       17     |       40     |

Basically I would grab the names kate and john. I would then look through dataframe 2 for those two students and find the percentile at which each of them sits for columns A and B. I have done something like:

perc = stats.percentileofscore(student1Info[1],data['A'] , 'rank')

where student1Info[1] holds 17 (Kate's value in column A)

but it results in the error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I would appreciate any advice. Also, can I use something similar to find the percentile of a datetime? For example, I have a bunch of submission times for each student, and I want to find the percentile at which a student's submission time sits.

Thanks!!

Mykola Zotko · Accepted Answer

In the function scipy.stats.percentileofscore, you need to pass the array as the first argument and the score as the second:

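from scipy import stats

# assumes data is indexed by student name, e.g. data = data.set_index('student')
perc = stats.percentileofscore(data['A'], data.loc['kate', 'A'])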

scipy.stats.percentileofscore(a, score, kind='rank')
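
To tie this back to the question, here is a minimal sketch of the whole lookup, assuming dataframe 2 is indexed by student name. The helper name percentile_of and the "submitted" column with its timestamps are made up for illustration; they are not in the original data. For the datetime part, the sketch converts the timestamps to integers before calling percentileofscore, which is one way to handle it rather than the only one.

```python
import pandas as pd
from scipy import stats

# Data from the question; the "submitted" times are invented for illustration.
pairs = pd.DataFrame({"student 1": ["kate", "david"],
                      "student 2": ["john", "kelly"]})
data = pd.DataFrame({
    "student": ["kate", "david", "john"],
    "A": [17, 20, 17],
    "B": [8, 15, 40],
    "submitted": pd.to_datetime(
        ["2021-03-01 09:00", "2021-03-01 12:30", "2021-03-02 08:15"]
    ),
}).set_index("student")  # index by name so label lookups return a scalar


def percentile_of(name, column):
    """Percentile rank of one student's value within a column (hypothetical helper).

    Returns None when the student is not present in `data`.
    """
    if name not in data.index:
        return None
    values = data[column]
    if pd.api.types.is_datetime64_any_dtype(values):
        # Assumption: compare timestamps as integer ticks since the epoch,
        # which preserves their ordering.
        values = values.astype("int64")
    # Array first, score second.
    return stats.percentileofscore(values, values.loc[name], kind="rank")


# Walk the pairs from dataframe 1 and look each student up in dataframe 2.
for _, row in pairs.iterrows():
    for name in (row["student 1"], row["student 2"]):
        print(name, percentile_of(name, "A"), percentile_of(name, "B"),
              percentile_of(name, "submitted"))
```

With the sample data, Kate's 17 in column A comes out at the 50th percentile, and kelly yields None because she does not appear in dataframe 2. If you only need the percentile of every row within its own column, data['A'].rank(pct=True) * 100 is a pandas-only alternative that should give the same numbers as kind='rank' here.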
