Reputation: 21
I am new to python and I faced a hard problem that I could not figure out how to do it. I need to compare values in two columns and then reorder the values based on this comparison. For example, let's assume we have the following dataframe. Now I need to check the values in B if they are smaller than the values in the cumulative sum of A. If a value in B higher than the value in cum of A then we stop at that row. And we reorder the observation based on the numerical value of column B until that row (the row is also included).
For instance, when we look at the values in B we see that all the values are smaller than the values in A until part 4.
Part No | A | B | cum of A |
---|---|---|---|
1 | 2 | 13 | 2 |
2 | 4 | 17 | 6 |
3 | 7 | 15 | 13 |
4 | 5 | 16 | 18 |
5 | 10 | 19 | 28 |
6 | 9 | 16 | 37 |
7 | 8 | 12 | 45 |
So that I need to reorder all values based on numerical values in B until part 4 (part 4 is also included). Then the new table should be
Part No | A | B | cum of A |
---|---|---|---|
1 | 2 | 13 | 2 |
3 | 7 | 15 | 9 |
4 | 5 | 16 | 14 |
2 | 4 | 17 | 18 |
5 | 10 | 19 | 28 |
6 | 9 | 16 | 37 |
7 | 8 | 12 | 45 |
And if there is still a higher value in cum of A than the values in B among these reordered observations (first four parts), then part 4 will be taken out from the list and put in a separate list.
This process should be repeated until all parts will be ordered.
Could anyone help me to get the code for this part?
Upvotes: 2
Views: 96
Reputation: 128
I know it is a little rough but if you want, you can use below:
#order the values in list
def order(a):
for i in range(len(a)):
# print(i)
b = a[i-1] - a[i]
if b>0:
temp = a[i-1]
a[i-1] = a[i]
a[i] = temp
for j in range(i,0,-1):
b = a[j-1] - a[j]
if b>0:
temp = a[j-1]
a[j-1] = a[j]
a[j] = temp
return a
b = [13,15,16,17,19,16,12]
a = [2, 7, 5, 4, 10, 9, 8]
#other list to append the last value if cum is greater than value of "b" after ordering b
otherList = []
# isRecursive and indexToTake is for:
# "And if there is still a higher value in cum of A than the values in B
# among these reordered observations (first four parts), then part 4 will be
# taken out from the list and put in a separate list."
def solve(isRecursive, indexToTake):
c = 0
for i in range(len(a)):
c += a[i] #cum sum
if c> b[i]: #check if cum is greater than the value of "b" at that index
if isRecursive:
if i >indexToTake:
return
otherList.append(b[indexToTake])
return
b[0:i+1] = order(b[0:i+1])
solve(True, i) #check if the values at "b" are greater than cum after ordered until the indexToTake
break
solve(False, 0) #start solving
Upvotes: 1
Reputation: 9619
This should work:
import pandas as pd
data = [ { "Part No": 1, "A": 2, "B": 13, "cum of A": 2 }, { "Part No": 2, "A": 4, "B": 17, "cum of A": 6 }, { "Part No": 3, "A": 7, "B": 15, "cum of A": 13 }, { "Part No": 4, "A": 5, "B": 16, "cum of A": 18 }, { "Part No": 5, "A": 10, "B": 19, "cum of A": 28 }, { "Part No": 6, "A": 9, "B": 16, "cum of A": 37 }, { "Part No": 7, "A": 8, "B": 12, "cum of A": 45 } ]
df = pd.DataFrame(data)
smaller_indexes = len(df[df['B'] > df['cum of A']]) + 1 # check where B is larger than 'cum of A', take the length of that part of the dataframe and add 1 to get the index number of the last value to use for sorting
df[:smaller_indexes] = df[:smaller_indexes].sort_values(by=['B']) # slice the dataframe by the found index number, and sort only that slice of the dataframe
df['cum of A'] = df['A'].cumsum() #recount the cumsum of the entire dataframe
This will output:
Part No | A | B | cum of A | |
---|---|---|---|---|
0 | 1 | 2 | 13 | 2 |
1 | 3 | 7 | 15 | 9 |
2 | 4 | 5 | 16 | 14 |
3 | 2 | 4 | 17 | 18 |
4 | 5 | 10 | 19 | 28 |
5 | 6 | 9 | 16 | 37 |
6 | 7 | 8 | 12 | 45 |
Upvotes: 1