EJD

Reputation: 21

Dividing and duplicating rows in a DataFrame above a certain value

I have the dataframe below. If the value in "Volume" is greater than 1000, I would like to divide the volume by two and duplicate the entire row. Any advice on how to do this in Python?

Current state:

s_1, w_1, d_1, w_2, Volume
Source_1, A1, Dest_1, A1, 1800
Source_1, A2, Dest_1, A2, 999
Source_1, A3, Dest_1, A3, 850

Desired outcome:

s_1, w_1, d_1, w_2, Volume
Source_1, A1, Dest_1, A1, 900
Source_1, A1, Dest_1, A1, 900
Source_1, A2, Dest_1, A2, 999
Source_1, A3, Dest_1, A3, 850

for _, row in df.iterrows():
    if row["Volume"] >= 1000:
        row["Volume"] = row["Volume"] / 2  # halve the volume on this row copy
        print(row)  # print the halved row twice
        print(row)



  

Upvotes: 2

Views: 47

Answers (1)

Jay Livingston

Reputation: 126

Do you need a one-liner? I would just split the data into the rows that need duplication and the rows that don't. Then I would apply the division to the first subset and use pandas.concat to put the data back together. Like this:

import pandas as pd

df = pd.DataFrame({
    's1':['Source_1','Source_1','Source_1'],
    'w1':['A1','A2','A3'],
    'd_1':['Dest_1','Dest_1','Dest_1'],
    'w_2':['A1','A2','A3'],
    'Volume':[1800,999,850]
})

dupes = df[df['Volume']>=1000].copy() # Subset the data needing duplication and division
remainders = df[df['Volume']<1000].copy() # Hold aside the data not needing changes
dupes['Volume']=dupes['Volume']/2 # Apply the division to Volume
output = pd.concat([dupes,dupes,remainders]).reset_index(drop=True) # Concatenate the dupes (twice) and the unchanged data, also resetting the index

print(output)


         s1  w1     d_1 w_2  Volume
0  Source_1  A1  Dest_1  A1   900.0
1  Source_1  A1  Dest_1  A1   900.0
2  Source_1  A2  Dest_1  A2   999.0
3  Source_1  A3  Dest_1  A3   850.0

Here I also reset the index afterwards, so that the index of the result isn't duplicated.
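One thing to note about the concat approach is that the duplicated rows all end up at the top of the result, which only matches the original order when the large rows happen to come first. If row order matters, a possible alternative (just a sketch, not the only way) is to repeat the index labels with Index.repeat and then halve the large volumes. The names `mask`, `repeats` and `out` are mine, and this assumes the original index is unique:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    's1': ['Source_1', 'Source_1', 'Source_1'],
    'w1': ['A1', 'A2', 'A3'],
    'd_1': ['Dest_1', 'Dest_1', 'Dest_1'],
    'w_2': ['A1', 'A2', 'A3'],
    'Volume': [1800, 999, 850]
})

# Rows at or above the threshold appear twice, all others once
mask = df['Volume'] >= 1000
repeats = np.where(mask, 2, 1)

# Repeating the index labels keeps each duplicate next to its original row
out = df.loc[df.index.repeat(repeats)].reset_index(drop=True)

# Halve the volume on the duplicated rows only (the column becomes float)
big = out['Volume'] >= 1000
out['Volume'] = np.where(big, out['Volume'] / 2, out['Volume'])

print(out)
```

As with the code above, the division turns Volume into floats. If the volumes are always even, casting back with astype(int) afterwards should give whole numbers again.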

Upvotes: 0
