Reputation: 222
I have numeric student marks and I would like to group them into three categories: A, B and C.
import pandas as pd

df = pd.DataFrame([('Adel', 3.5),
                   ('Betty', 2.75),
                   ('Djamel', 2.10),
                   ('Ramzi', 1.75),
                   ('Alexa', 3.15)],
                  columns=('Name', 'GPA'))
I tried pd.cut(), but it didn't give the result I wanted.
Upvotes: 0
Views: 418
Reputation: 21
In a recent study, particle swarm optimization (PSO) was used to classify students into an unknown number of groups. PSO showed improved capabilities compared to a genetic algorithm (GA). I think that study is all you need.
The paper is: Forming automatic groups of learners using particle swarm optimization for applications of differentiated instruction
You can find the paper here: https://doi.org/10.1002/cae.22191
Perhaps the researchers could guide you through ResearchGate: https://www.researchgate.net/publication/338078753
You would just need to drop the part of the technique that determines the number of groups automatically, since you already know you want three.
Upvotes: 1
Reputation: 222
I found this solution:
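import numpy as np
import pandas as pd

df = pd.DataFrame({'GPA': [99, 53, 71, 84, 84],
                   'Name': ['Betty', 'Djamel', 'Ramzi', 'Alexa', 'Adel']})
bins = [0, 50, 60, 70, 80, 100]    # bin edges; each bin is [low, high)
names = ['F', 'D', 'C', 'B', 'A']  # one label per bin, lowest bin first
d = dict(enumerate(names, 1))      # np.digitize numbers the bins from 1
# Note: a mark of exactly 100 falls outside the last bin [80, 100)
# and is not assigned one of these labels.
df['Rank'] = np.vectorize(d.get)(np.digitize(df['GPA'], bins))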
Thanks to this link here.
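For comparison, the same binning can probably be written with pd.cut directly. This is a minimal sketch, assuming the same bins and labels as above; the name via_digitize is only introduced here for the comparison. Passing right=False makes pd.cut use the same half-open [low, high) intervals that np.digitize uses by default, so a mark of exactly 100 would still fall outside the last interval in both versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"GPA": [99, 53, 71, 84, 84],
                   "Name": ["Betty", "Djamel", "Ramzi", "Alexa", "Adel"]})
bins = [0, 50, 60, 70, 80, 100]
names = ["F", "D", "C", "B", "A"]

# Reference result: np.digitize returns 1-based bin numbers for [low, high) bins
d = dict(enumerate(names, 1))
via_digitize = [d.get(i) for i in np.digitize(df["GPA"], bins)]

# pd.cut with right=False uses the same half-open intervals and attaches labels directly
df["Rank"] = pd.cut(df["GPA"], bins=bins, labels=names, right=False)

print(df)
```

The pd.cut version returns a categorical column, which keeps the grade order available for sorting or grouping later.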
Upvotes: 0
Reputation: 21749
Here's a way using pd.cut:
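df = df.sort_values('GPA')
# bins=3 splits the GPA range into three equal-width intervals;
# labels are assigned from the lowest interval to the highest
df['bins'] = pd.cut(df['GPA'], bins=3, labels=['A', 'B', 'C'])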
     Name   GPA bins
3   Ramzi  1.75    A
2  Djamel  2.10    A
1   Betty  2.75    B
4   Alexa  3.15    C
0    Adel  3.50    C
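Note that with bins=3 the intervals are equal-width slices of the observed GPA range, and the labels run from the lowest interval upward, so here the lowest GPAs receive A. If A is meant to be the top grade and the cut-offs should be fixed, explicit bin edges may be closer to what the question asks for. The following is a sketch only: the thresholds 2.5 and 3.0, the 0 to 4 scale, and the column name Grade are assumptions for illustration, not values from the question.

```python
import pandas as pd

df = pd.DataFrame(
    [("Adel", 3.5), ("Betty", 2.75), ("Djamel", 2.10), ("Ramzi", 1.75), ("Alexa", 3.15)],
    columns=("Name", "GPA"),
)

# Hypothetical cut-offs on an assumed 0-4 scale:
# C covers [0, 2.5], B covers (2.5, 3.0], A covers (3.0, 4.0]
edges = [0.0, 2.5, 3.0, 4.0]
labels = ["C", "B", "A"]  # listed lowest interval first, so A is the top grade

# include_lowest=True makes the first interval include a GPA of exactly 0
df["Grade"] = pd.cut(df["GPA"], bins=edges, labels=labels, include_lowest=True)

print(df.sort_values("GPA", ascending=False))
```

With these assumed edges, Adel and Alexa fall in A, Betty in B, and Djamel and Ramzi in C. Adjust the edges to whatever grading scheme actually applies.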
Upvotes: 1