Reputation: 103
I haven't been able to figure this out.
Let's say I have a pandas dataframe (port_info
) that looks like this:
chass olt port BW
0 1 1 1 80000
1 1 1 2 212000
2 1 1 3 926600
3 1 1 4 50000
4 1 1 5 170000
5 1 1 6 840000
6 1 1 7 320000
7 1 1 8 500000
8 1 1 9 270000
9 1 1 10 100000
10 1 2 1 420000
11 1 2 2 60000
12 1 2 3 480000
13 1 2 4 90000
14 1 2 5 0
15 1 2 6 520000
16 1 2 7 840000
17 1 2 8 900000
18 1 2 9 110000
19 1 2 10 0
I want to add a column depending on how many ports per olt per chassis. If there are more than 8 ports per olt per chass, then add a value of 1 to every row for that olt for that chass. Otherwise, add a value of 10 to every row for that olt for that chass.
In the end, I need a new column (port_info.BW_cap
) that has a value for each port dependent on how many ports there are in that olt in that chass.
So far I have this to check the max port per olt:
test = pd.DataFrame(table.groupby(['chass','olt'])['port'].max()).reset_index()
That gets me a minimalist dataframe that looks like this:
chass olt
1 1 10
2 10
3 10
4 10
5 10
6 10
7 10
8 10
11 10
12 10
13 10
14 10
15 10
16 10
17 10
18 10
What's the best way to take all of the above and basically have pandas iterate through every row in the initial dataframe and compare to the appropriate row in the minimalist dataframe to check what the max port is for that olt for that chassis, and add a value to the row in the initial dataframe under a new column named 'BW_cap' dependant on the value in the minimalist dataframe for that same chass/olt combo?
So in the end, something that looks like this:
chass olt port BW BW_cap
0 1 1 1 80000 1
1 1 1 2 212000 1
2 1 1 3 926600 1
3 1 1 4 50000 1
4 1 1 5 170000 1
Upvotes: 0
Views: 93
Reputation: 3722
I think I get what you want. You just need the bottom 3 lines in this code. You were close, you can just join your groupby max result to the original dataframe.
One thing to note, saying "if there are more than 8 ports per chass/olt combination" is different than saying "the max port is > 8". If your ports aren't always number ascending 1 to 10. if there are chass/olt combinations that have 3, 6, 9 as the 3 ports, thats only 3 ports but the max is 9.
import random
random.seed(123)
df = pd.DataFrame({
'chass':[random.randint(1, 10) for x in range(200)],
'olt':[random.randint(1, 10) for x in range(200)],
'port':[random.randint(1, 10) for x in range(200)],
'BW':[random.randint(0, 1000000) for x in range(200)]})
g = df.groupby(['chass', 'olt']).apply(lambda x: 1 if x.port.max() > 8 else 10).reset_index()
g.columns = ['chass', 'olt', 'BW_cap']
df = pd.merge(df, g, on=['chass', 'olt'])
Upvotes: 1