soosa
soosa

Reputation: 139

Check if multiple columns exist in another columns pandas

I want to run query on multiple columns (Data1, Data2) to find the values in other columns (Column1, Column2, Column3) if it exists in any of these columns (considering these columns have empty cells also)

My data looks like this:

Column1  Column2  Column3  Data1    Data2
sunday   monday            sunday   friday
sunday   friday   tuesday  friday   sunday
sunday            monday   thursday tuesday

I want the output like:

Column1  Column2  Column3  Data1    Exist_Data1   Data2      Exist_Data2
sunday   monday            sunday   Yes           friday     Yes
sunday   friday   tuesday  friday   Yes           wednesday  No
sunday            monday   thursday  No           tuesday    Yes

I have tried the command below:

pd.Series(['A', 'B']).isin(df.columns).all() (as mentioned in the Check if multiple columns exist in a df

but it check in all the columns. I have modified the df.columns but doesn't seem to work either.

*******UPDATE ********

Editing after trying the code:

As can be seen in the image below, there are some entries in Data1 and Data2 that exists in the other columns but still says "No". (There are no whitespace characters - all are removed using strip()" Double checked after writing the CSV and checking for spaces.

enter image description here

Upvotes: 1

Views: 1295

Answers (1)

jezrael
jezrael

Reputation: 862406

You need Series.isin with flatten all values of columns to one long list:

cols = ['Column1','Column2','Column3']
v = np.ravel(df[cols])

df['Exist_Data1'] = np.where(df['Data1'].isin(v),'Yes', 'No')
df['Exist_Data2'] = np.where(df['Data2'].isin(v),'Yes', 'No')

#alternative
df[['Exist_Data1','Exist_Data2']] = np.where(df[['Data1', 'Data2']].isin(v),'Yes', 'No')


print (df)
  Column1 Column2  Column3     Data1      Data2 Exist_Data1 Exist_Data2
0  sunday  monday      NaN    sunday     friday         Yes         Yes
1  sunday  friday  tuesday    friday  wednesday         Yes          No
2  sunday     NaN   monday  thursday    tuesday          No         Yes

Upvotes: 1

Related Questions