reeeeeeeeeeee
reeeeeeeeeeee

Reputation: 139

How to compare multiple rows in a DataFrame and edit certain values on ID match

Say I have a df:

df = pd.DataFrame({'src':['LV','LA','NC','NY','ABC','XYZ'], 'dest':['NC','NY','LV','LA','XYZ','ABC'], 'dummy':[1,3,6,7,8,10], 'pair':[1,2,1,2,3,3]})

src   dest   dummy  pair
LV      NC       1     1
LA      NY       3     2
NC      LV       6     1
NY      LA       7     2
ABC     XYZ      8     3
XYZ     ABC     10     3

Where the 'pair' column is an id for matching src/dest combos like (a->b, b->a).

Assume there will only be 2 of the same ID per pair since there is only one path forward and one path back (so 1,1 ... 2,2 ... etc)

How can I compare the 'dummy' values between pairs and if the dummy value from one row is greater than the other row of the same pair, 'g' gets appended to the 'dest' value and 'g' would then get appended to 'src' value in the row with the lesser dummy value to arrive at this.

src    dest   dummy  pair
LVg      NC       1     1
LAg      NY       3     2
NC      LVg       6     1
NY      LAg       7     2
ABCg     XYZ      8     3
XYZ     ABCg     10     3

For example in pair '2' the second row has a higher dummy value of 7 so the dest gets 'g' appended and therefore the lower dummy value row gets 'g' appended to src to mirror the change.

Upvotes: 1

Views: 191

Answers (1)

Scott Boston
Scott Boston

Reputation: 153460

You can try something like this:

df = pd.DataFrame({'src':['LV','LA','NC','NY','ABC','XYZ'], 'dest':['NC','NY','LV','LA','XYZ','ABC'], 'dummy':[1,3,6,7,8,10], 'pair':[1,2,1,2,3,3]})

def f(x):
    x.iloc[0, x.columns.get_loc('src')] += 'g'
    x.iloc[1, x.columns.get_loc('dest')] += 'g'
    return x

df.sort_values('dummy').groupby('pair').apply(f)

Output:

    src  dest  dummy  pair
0   LVg    NC      1     1
1   LAg    NY      3     2
2    NC   LVg      6     1
3    NY   LAg      7     2
4  ABCg   XYZ      8     3
5   XYZ  ABCg     10     3

Details:

With a presorted dataframe, we take groups of record (2 records in this case) and pass them to the helper function, f, as a dataframe with apply. In f, using integer location, iloc append 'g' to the first record and to the 'src' determine by position using get_loc from column header of x and also append 'g' to the second record and the column that match 'dest' then return that update dataframe.

Upvotes: 1

Related Questions