move rows of pandas dataframe base on conditions

Question

I want to move every row that has the value in before_id column into the below of the row with same value as that before_id in its node_id column. I've been struggling with it and I already tried .reindex but it was a not good solution for me. So is it possible to change the position of each row ?? or should I do it without pandas ?? Could you anyone give me tips ?? Thank you so much .

client_data = """node_id,option,before_id
    1,A,
    5,A,4
    4,C,1
    6,A,
    2,A,1
    7,C,6
    """

    df = pd.read_csv(io.StringIO(client_data), error_bad_lines=False)
    
    before = pd.notnull(df['before_id'])
    for ind,row in df.iterrows():
        if pd.notnull(row[2]):
            before_id = row[2] # identifying before_id column 
            before_ind = df.loc[df['node_id'] == before_id].index 
            next_ind = before_ind[0] + 1 # the goal index 

            row.reindex(index=next_ind)
        else:
            pass
    df.to_csv('./result3.csv', index=False)

ideal output 

node_id,option,before_id

1,A,
2,A,1
8,C,2
6,A,
7,C,6

Brendan · Accepted Answer

If I understand correctly:

df = pd.DataFrame({'node_id': {0: 1, 1: 5, 2: 4, 3: 6, 4: 2, 5: 7}, 
                   'option': {0: 'A', 1: 'A', 2: 'C', 3: 'A', 4: 'A', 5: 'C'}, 
                   'before_id': {0: np.nan, 1: 4.0, 2: 1.0, 3: np.nan, 4: 1.0, 5: 6.0}})

# sort so 'option' is ordered alphabetically when 'before_id' is same (?)
df.sort_values('option', inplace=True)
# count occurrences of each 'before_id' & add to create counter
df['x'] = .01
df['y'] = df.groupby('before_id')['x'].transform(np.cumsum) + df['before_id']
# create sorting column
df['z'] = df['y'].fillna(df['node_id'])

df.sort_values(['z'], inplace=True)

returns

   node_id option  before_id     x     y     z
0        1      A        NaN  0.01   NaN  1.00
4        2      A        1.0  0.01  1.01  1.01
2        4      C        1.0  0.01  1.02  1.02
1        5      A        4.0  0.01  4.01  4.01
3        6      A        NaN  0.01   NaN  6.00
5        7      C        6.0  0.01  6.01  6.01

and you can then drop x, y,and z.

move rows of pandas dataframe base on conditions

Answers (1)

Related Questions