Reputation: 21
I have the dataframe below. If the value in "Volume" is greater than 1000, I would like to divide the volume by two and duplicate the entire row. Any advice on how to do this in Python?
*Current state*
s_1, w_1, d_1, w_2, Volume
Source_1, A1, Dest_1, A1, 1800
Source_1, A2, Dest_1, A2, 999
Source_1, A3, Dest_1, A3, 850
*Desired outcome*
s_1, w_1, d_1, w_2, Volume
Source_1, A1, Dest_1, A1, 900
Source_1, A1, Dest_1, A1, 900
Source_1, A2, Dest_1, A2, 999
Source_1, A3, Dest_1, A3, 850
My attempt so far, which does not work:
for x in df:
    if x df["Volume'] >= 1000:
        print(df.loc[["s1","w_1", "d_1", "Volume"/2]] * 2
Upvotes: 2
Views: 47
Reputation: 126
Do you need a one-liner? If not, I would split the data into two subsets: the rows that need duplication and the rows that don't. Then I would apply the division to the first subset and use pandas.concat to put the data back together, like this:
import pandas as pd
df = pd.DataFrame({
    's1': ['Source_1', 'Source_1', 'Source_1'],
    'w1': ['A1', 'A2', 'A3'],
    'd_1': ['Dest_1', 'Dest_1', 'Dest_1'],
    'w_2': ['A1', 'A2', 'A3'],
    'Volume': [1800, 999, 850]
})
dupes = df[df['Volume'] >= 1000].copy()  # Subset the rows needing duplication and division
remainders = df[df['Volume'] < 1000].copy()  # Hold aside the rows not needing changes
dupes['Volume'] = dupes['Volume'] / 2  # Apply the division to Volume
output = pd.concat([dupes, dupes, remainders]).reset_index(drop=True)  # Concatenate the dupes (twice) and the unchanged data, then reset the index
print(output)
         s1  w1     d_1 w_2  Volume
0  Source_1  A1  Dest_1  A1   900.0
1  Source_1  A1  Dest_1  A1   900.0
2  Source_1  A2  Dest_1  A2   999.0
3  Source_1  A3  Dest_1  A3   850.0
Here I also reset the index afterwards, so that the index of the result isn't duplicated.
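Note that the concat approach places all duplicated rows ahead of the unchanged ones, so the original row order is only preserved when the large-volume rows happen to come first. If order matters, one possible variant is to repeat rows in place instead. The following is a minimal sketch, not the only way to do it: the helper name split_large_volumes and its threshold parameter are made up for illustration, the column names follow the question (s_1, w_1), and the cutoff is assumed to be >= 1000 as in the code above.

```python
import numpy as np
import pandas as pd


def split_large_volumes(df, threshold=1000):
    """Hypothetical helper: duplicate rows with Volume >= threshold and
    halve their Volume, keeping rows in their original order."""
    needs_split = df["Volume"] >= threshold
    # Flagged rows are repeated twice, all other rows once.
    repeat_counts = np.where(needs_split, 2, 1)
    # Build row positions such as [0, 0, 1, 2] and select them in order.
    positions = np.repeat(np.arange(len(df)), repeat_counts)
    out = df.iloc[positions].reset_index(drop=True)
    # Keep Volume where it is below the threshold, otherwise halve it once.
    out["Volume"] = out["Volume"].where(out["Volume"] < threshold, out["Volume"] / 2)
    return out


df = pd.DataFrame({
    "s_1": ["Source_1", "Source_1", "Source_1"],
    "w_1": ["A1", "A2", "A3"],
    "d_1": ["Dest_1", "Dest_1", "Dest_1"],
    "w_2": ["A1", "A2", "A3"],
    "Volume": [1800, 999, 850],
})

result = split_large_volumes(df)
print(result)
```

Because the rows are selected by position, this also works when the input index is not unique. As with the concat version, Volume comes back as a float column because of the division.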
Upvotes: 0