Reputation: 701
Using Python, I'd like to label every row where the cumulative sum of the Miles column is <= 2.5 as "IN" and every remaining row as "OUT". Any suggestions on where to start?
Example Data set
Rank Name Miles
1 A 0.5
2 A 1
3 B 1
4 B 1
5 C 2
Desired Output
Rank Name Miles Assign
1 A 0.5 IN
2 A 1 IN
3 B 1 IN
4 B 1 OUT
5 C 2 OUT
Upvotes: 2
Views: 128
Reputation: 33179
It looks like you're using pandas. I'm not an expert in it, so take this with a grain of salt.
If you have a dataframe like this:
Rank Name Miles
0 1 A 0.5
1 2 A 1.0
2 3 B 1.0
3 4 B 1.0
4 5 C 2.0
Then you can simply create a new column where the values are based on the cumulative sum of the Miles column:
df['Assign'] = ['IN' if i <= 2.5 else 'OUT' for i in df['Miles'].cumsum()]
Or compare the whole cumulative sum against 2.5 first and then map the resulting booleans, which I think is more idiomatic:
df['Assign'] = ['IN' if i else 'OUT' for i in df['Miles'].cumsum() <= 2.5]
Which becomes:
Rank Name Miles Assign
0 1 A 0.5 IN
1 2 A 1.0 IN
2 3 B 1.0 IN
3 4 B 1.0 OUT
4 5 C 2.0 OUT
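For anyone who wants to run this end to end, here is a minimal, self-contained sketch. It rebuilds the example dataframe by hand and swaps the list comprehension for numpy.where, which is a vectorized alternative and not the approach shown above. The variable name running_total is only for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame({
    "Rank": [1, 2, 3, 4, 5],
    "Name": ["A", "A", "B", "B", "C"],
    "Miles": [0.5, 1.0, 1.0, 1.0, 2.0],
})

# Running total of the Miles column, row by row.
running_total = df["Miles"].cumsum()

# np.where picks "IN" where the condition holds and "OUT" everywhere else.
df["Assign"] = np.where(running_total <= 2.5, "IN", "OUT")

print(df)
```

The values in this example are exactly representable as floats, so the comparison at 2.5 behaves as expected. With values such as 0.1 or 0.7, the running total can land slightly above or below the threshold, so a comparison right at the boundary may need a small tolerance.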
Upvotes: 1