Reputation: 1023
I am having trouble trying to find a simple way to rank the product's values grouped by Date and Product. For example, in June 2020, we have 6 products, they have different values, I want to rank them according to their market share out of the total, in any case, this is equivalent to ranking them according to their values.
I attempted to use this link's answer but it is not giving the correct input.
I could jolly well turn this df into a 2d list and sort using a for loop, but I would say since I am using pandas, I will try my best to stick with the functionalities like .rank()
.
df = pd.DataFrame()
df['Market'] = ['usa'] *2 + ['russia']*2 + ['china'] * 8
df["Product"] = ["A", "B", "C", "D", "E", "F"] * 2
df["Date"] = ["1/6/2020"] * 6 + ["1/9/2020"] * 6
df["Date"] = df["Date"].astype("datetime64[ns]")
df["Value"] = [100, 200, 300, 400, 500, 600, 1000, 2000, 800, 900, 999, 10000]
Product Date Value Expected Ranking
0 A 1/6/2020 100 1
1 B 1/6/2020 200 2
2 C 1/6/2020 300 3
3 D 1/6/2020 400 4
4 E 1/6/2020 500 5
5 F 1/6/2020 600 6
6 A 1/9/2020 1000 3
7 B 1/9/2020 2000 2
8 C 1/9/2020 800 6
9 D 1/9/2020 900 5
10 E 1/9/2020 999 4
11 F 1/9/2020 10000 1
Upvotes: 1
Views: 238
Reputation: 323376
Try pass dense
df.groupby('Date')['Value'].rank(method='dense')
Out[224]:
0 1.0
1 2.0
2 3.0
3 4.0
4 5.0
5 6.0
6 6.0
7 5.0
8 4.0
9 3.0
10 2.0
11 1.0
Name: Value, dtype: float64
Upvotes: 3