Pyd

Reputation: 6159

Wrong result when comparing two DataFrame columns in Python

These are my dataframes.

df contains no values, only column names:

P1  |P2 |P3



df4:

    Names   Std
0   Kumar   10
1   Ravi    5



mask=df4["Names"].str.contains(('|').join(df["P1"].values.tolist()),na=False)

Out[30]:
 0    True
 1    True
Name: Names, dtype: bool

Why does it return True when the "P1" column does not contain any values?

Upvotes: 4

Views: 445

Answers (1)

jezrael

Reputation: 862581

EDIT: If you need False returned when the column is empty, you can add a condition that checks the column is not empty:

import pandas as pd

df = pd.DataFrame(columns=['P1','P2','P3'])
print (df)
Empty DataFrame
Columns: [P1, P2, P3]
Index: []

df4 = pd.DataFrame({'Names':['Kumar','Ravi']})

mask=df4["Names"].str.contains(('|').join(df["P1"].values.tolist()),na=False)
mask = mask & (not df['P1'].empty)
print (mask)
0    False
1    False
Name: Names, dtype: bool

df = pd.DataFrame({'P1':['Kumar']}, columns=['P1','P2','P3'])
print (df)
      P1   P2   P3
0  Kumar  NaN  NaN

df4 = pd.DataFrame({'Names':['Kumar','Ravi']})

mask=df4["Names"].str.contains(('|').join(df["P1"].values.tolist()),na=False)
mask = mask & (not df['P1'].empty)
print (mask)
0     True
1    False
Name: Names, dtype: bool
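
As for why the original mask is all True: joining an empty list gives an empty string, and an empty regex pattern matches at the start of every string, so str.contains reports a match for every row. The sketch below shows that, and wraps the guard in a small helper. The helper name contains_any is hypothetical, not part of pandas, and it assumes the values in P1 are strings meant to be used as regex alternatives, as in the question.

```python
import re

import pandas as pd


def contains_any(names, patterns):
    """Hypothetical helper: True where a name matches any of the patterns.

    Returns an all-False mask when there are no patterns, instead of
    letting an empty regex match every row.
    """
    patterns = list(patterns)
    if not patterns:
        return pd.Series(False, index=names.index, name=names.name)
    return names.str.contains("|".join(patterns), na=False)


df = pd.DataFrame(columns=["P1", "P2", "P3"])
df4 = pd.DataFrame({"Names": ["Kumar", "Ravi"]})

# Joining an empty list yields the empty string.
pattern = "|".join(df["P1"].values.tolist())
print(repr(pattern))

# An empty pattern matches any string, so this is truthy.
print(bool(re.search(pattern, "Kumar")))

# Same effect inside pandas: every row is reported as a match.
unguarded = df4["Names"].str.contains(pattern, na=False)
print(unguarded)

# With the guard, an empty column gives no matches.
guarded = contains_any(df4["Names"], df["P1"])
print(guarded)
```

This does the same thing as the `mask & (not df['P1'].empty)` line above, only the check happens before the pattern is built.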

Upvotes: 1
