jamo
jamo

Reputation: 91

Replace values in one df based on specific indexes in another df

I have a data frame like below:

df1:
ix a  b   c  d  e 
s1 AA AG AG GG AA
s2 DI DD GG GG GG
S3 TT CC TC TT TC
S4 AA AA AA AA AA
S5 CC CC DD CC CC

and another dataframe like this:

df2
ix col1 col2 
s1 2     A
s3 4     G
S4 7     A

I want to replace all values in the row to "XX" in df1 for the indexes (s1,s4) which are "A" in col2 of df2.

so the final dataframe is like this:

df1:
ix a  b   c  d  e 
s1 XX XX XX XX XX
s2 DI DD GG GG GG
S3 TT CC TC TT TC
S4 XX XX XX XX XX
S5 CC CC DD CC CC

Any suggestions is appreciated. Thank you in advance.

df = pd.DataFrame({'ix': {0: 's1', 1: 's2', 2: 'S3', 3: 'S4', 4: 'S5'}, 
                   'a': {0: 'AA', 1: 'AG', 2: 'AG', 3: 'GG', 4: 'AA'}, 
                   'b': {0: 'DI', 1: 'DD', 2: 'GG', 3: 'GG', 4: 'GG'}, 
                   'c': {0: 'TT', 1: 'CC', 2: 'TC', 3: 'TT', 4: 'TC'}, 
                   'd': {0: 'AA', 1: 'AA', 2: 'AA', 3: 'AA', 4: 'AA'}, 
                   'e': {0: 'CC', 1: 'DD', 2: 'CC', 3: 'CC', 4: 'CC'}})

Upvotes: 1

Views: 46

Answers (1)

jezrael
jezrael

Reputation: 862551

Get index values of df2 if col2 is A and set rows matched with XX in df1:

df1.loc[df1.index.isin(df2.index[df2['col2'].eq('A')])] = 'XX'
print (df1)
     a   b   c   d   e
ix                    
s1  XX  XX  XX  XX  XX
s2  AG  DD  CC  AA  DD
S3  AG  GG  TC  AA  CC
S4  XX  XX  XX  XX  XX
S5  AA  GG  TC  AA  CC

Setup:

df1 = pd.DataFrame({'ix': {0: 's1', 1: 's2', 2: 'S3', 3: 'S4', 4: 'S5'}, 
                   'a': {0: 'AA', 1: 'AG', 2: 'AG', 3: 'GG', 4: 'AA'}, 
                   'b': {0: 'DI', 1: 'DD', 2: 'GG', 3: 'GG', 4: 'GG'}, 
                   'c': {0: 'TT', 1: 'CC', 2: 'TC', 3: 'TT', 4: 'TC'}, 
                   'd': {0: 'AA', 1: 'AA', 2: 'AA', 3: 'AA', 4: 'AA'}, 
                   'e': {0: 'CC', 1: 'DD', 2: 'CC', 3: 'CC', 4: 'CC'}}).set_index('ix')

print (df1)
     a   b   c   d   e
ix                    
s1  AA  DI  TT  AA  CC
s2  AG  DD  CC  AA  DD
S3  AG  GG  TC  AA  CC
S4  GG  GG  TT  AA  CC
S5  AA  GG  TC  AA  CC

print (df2)
    col1 col2
ix           
s1     2    A
s3     4    G
S4     7    A

Upvotes: 2

Related Questions