Reputation: 6149
I need to compute the relative percentage of each value within its category using pandas. I know I need to use groupby, but I am a bit lost.
Input:
ID | stringValue | FloatValue
A | 'string' | 2
A | 'string2' | 8
B | 'string' | 5
Expected Output:
ID | stringValue | FloatValue | Perc
A | 'string' | 2 | 20
A | 'string2' | 8 | 80
B | 'string' | 5 | 100
The expected output groups the values by their ID and calculates each value's percentage of its group total.
Here, A has two values, 2 and 8, so the percentages should be 100 * 2 / (2+8) and 100 * 8 / (2+8). For ID B there is only one value, so Perc should be 100.
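To make the example reproducible, the input could be built roughly like this. It is only a sketch: the variable names `df` and `expected_perc` are assumptions, and it treats the quotes in the table as plain string delimiters rather than part of the values.

```python
import pandas as pd

# Input table from the question (quotes treated as string delimiters)
df = pd.DataFrame({
    "ID": ["A", "A", "B"],
    "stringValue": ["string", "string2", "string"],
    "FloatValue": [2, 8, 5],
})

# Expected Perc values, written out by hand from the formula above:
# each value times 100, divided by the sum of its ID group
expected_perc = [100 * 2 / (2 + 8), 100 * 8 / (2 + 8), 100 * 5 / 5]
```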
Upvotes: 0
Views: 593
Reputation: 11
Considering your data is a pandas DataFrame named "data", the following code should do the trick:
data["Perc"] = data.apply(lambda x: x["FloatValue"] * 100 / data.groupby(["ID"]).sum()["FloatValue"][x["ID"]], axis=1)
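It groups your items by ID and computes the total sum of FloatValue for each group. The apply method of the DataFrame then creates a new Series by dividing each row's FloatValue by the corresponding group sum. Note that the groupby is re-evaluated for every row, so this can be slow on large frames.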
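The same idea can be sketched without the per-row apply by computing the group sums once and mapping them back onto the ID column. The name `group_sums` is an assumption introduced for illustration, and the frame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

data = pd.DataFrame({
    "ID": ["A", "A", "B"],
    "stringValue": ["string", "string2", "string"],
    "FloatValue": [2, 8, 5],
})

# Sum FloatValue once per ID; the result is a Series indexed by ID
group_sums = data.groupby("ID")["FloatValue"].sum()

# Look up each row's group total through its ID, then divide element-wise
data["Perc"] = data["FloatValue"] * 100 / data["ID"].map(group_sums)
```

Because the sums are computed a single time, this should scale better than calling groupby inside the lambda.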
Upvotes: 1
Reputation: 4233
IIUC try:
df['Perc'] = df.groupby('ID')['FloatValue'].transform(lambda x: (x/x.sum()) * 100)
# Output (Perc is a float column, since it comes from a division)
  ID stringValue  FloatValue   Perc
0  A    'string'           2   20.0
1  A   'string2'           8   80.0
2  B    'string'           5  100.0
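A variant of the same approach, sketched under the assumption that the frame is called `df` as above: passing the string "sum" to transform gives a Series of group totals aligned with the original rows, so the division can be done without a lambda. The name `group_total` is an assumption introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["A", "A", "B"],
    "stringValue": ["string", "string2", "string"],
    "FloatValue": [2, 8, 5],
})

# transform("sum") returns one value per original row: that row's group total
group_total = df.groupby("ID")["FloatValue"].transform("sum")

# Element-wise division, aligned on the original index
df["Perc"] = df["FloatValue"] * 100 / group_total
```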
Upvotes: 1
Reputation: 1145
If the 5 -> 100% is a typo and you literally just mean "how can I make my number look like a percentage", you can do that easily.
If you want it stored as a true fraction (for example 0.2 for 20%), divide by 10, and multiply by 100 when you print it.
If you would rather store the percentage figure itself and keep in mind that it is a percentage, do the opposite and multiply by 10.
For the second option, you can simply do:
df["Perc"] = df["FloatValue"] * 10
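A small sketch of both options, assuming (as this answer does) that FloatValue is on a 0 to 10 scale. The column names `Frac` and `PercLabel` are made up for illustration. Note that under this reading B comes out as 50, not the 100 shown in the question's expected output.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["A", "A", "B"],
    "stringValue": ["string", "string2", "string"],
    "FloatValue": [2, 8, 5],
})

# Option 1: store a fraction, and format it as a percentage only for display
df["Frac"] = df["FloatValue"] / 10
df["PercLabel"] = df["Frac"].map("{:.0%}".format)

# Option 2: store the percentage figure directly
df["Perc"] = df["FloatValue"] * 10
```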
Upvotes: 0