yasgur99
yasgur99

Reputation: 858

Find percentile of all elements in a list

I have a sorted list, for example,

S = [0, 10.2, 345.9, ...]

If S is large (500k+ elements), what is the best way to find what percentile each element belongs to?

My goal is to store in database table that looks something like this:

Svalue | Percentile
-------------------
0      |     a
10.2.  |     b
345.9  |     c
...    |    ...

Upvotes: 0

Views: 275

Answers (2)

hello_friend
hello_friend

Reputation: 5788

Solution:

# Import and initialise pandas into session: 
import pandas as pd

# Store a scalar of the length of the list: list_length => list
list_length = len(S)

# Use a list comprehension to retrieve the indices of each element: idx => list
idx = [index for index, value in enumerate(S)]

# Divide each of the indices by the list_length scalar using a list 
# comprehension: percentile_rank => list
percentile_rank = [el / list_length for el in idx]

# Column bind separate lists into a single DataFrame in order to achieved desired format: df => pd.DataFrame
df = pd.DataFrame({"Svalue": S,  "Percentile": percentile_rank}) 

# Send the first 6 rows to console: stdout
df.head()

Data:

# Ensure list is sorted: S => list
S = sorted([0, 10.2, 345.9])

# Print the result: stdout
print(S)

Upvotes: 1

Vanshika
Vanshika

Reputation: 153

Try pandas rank

import pandas as pd

df = pd.DataFrame()
df["Svalue"] = S
df["Percentile"] = df["Svalue"].rank(pct=True)

Upvotes: 3

Related Questions