Reputation: 47
I have a df:
col1 col2 col3 col4 col5
bat cell val val
cat ribo val val
rat dna val val
dog rna val val val
If I compare col4 and col5, I want to get this output:
col1 col2 col3 col4 col5
dog rna val val val
because both col4 and col5 have a value.
If I compare col3 and col5, I should get this output:
col1 col2 col3 col4 col5
bat cell val val
rat dna val val
dog rna val val val
But when I use the following code:
dfn = df[df['col4'] != df['col5']]
I do not get the correct rows.
I also want to add the output to the DataFrame as:
col1 col2 col3 col5
dog rna val val
Upvotes: 0
Views: 57
Reputation: 42886
We can write a simple function that compares two columns and keeps only the rows where neither is empty, using boolean indexing with notnull:
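import numpy as np

# Treat empty strings as missing values
df.replace('', np.nan, inplace=True)

def compare_cols(dataframe, column1, column2):
    # Keep rows where both columns hold a value
    return dataframe[dataframe[column1].notnull() & dataframe[column2].notnull()]

print(compare_cols(df, 'col4', 'col5'))
print('\n')
print(compare_cols(df, 'col3', 'col5'))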
col1 col2 col3 col4 col5
3 dog rna val val val
col1 col2 col3 col4 col5
0 bat cell val NaN val
2 rat dna val NaN val
3 dog rna val val val
Edit after Jezrael's comment: we can use dropna with subset, which gives the same output:
def compare_cols2(dataframe, column1, column2):
    # Drop rows with a missing value in either column
    return dataframe.dropna(subset=[column1, column2])

print(compare_cols2(df, 'col4', 'col5'))
print('\n')
print(compare_cols2(df, 'col3', 'col5'))
col1 col2 col3 col4 col5
3 dog rna val val val
col1 col2 col3 col4 col5
0 bat cell val NaN val
2 rat dna val NaN val
3 dog rna val val val
Note: I replaced the empty strings ('') with NaN so that we can use the notnull() method.
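If the blank cells might hold spaces rather than truly empty strings, a regex replace covers both cases. The following is only a sketch: the sample frame is a hypothetical reconstruction of the question's data (the position of the blanks in the 'cat' row is an assumption), and rows_with_both is an illustrative name, not something from the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's frame; which cells are
# blank in the 'cat' row is an assumption made for illustration.
df = pd.DataFrame({
    'col1': ['bat', 'cat', 'rat', 'dog'],
    'col2': ['cell', 'ribo', 'dna', 'rna'],
    'col3': ['val', 'val', 'val', 'val'],
    'col4': ['', 'val', ' ', 'val'],   # one empty string, one single space
    'col5': ['val', '', 'val', 'val'],
})

# Turn empty and whitespace-only strings into NaN, without mutating df.
cleaned = df.replace(r'^\s*$', np.nan, regex=True)

def rows_with_both(frame, first, second):
    """Return the rows of frame where both named columns hold a value."""
    return frame.dropna(subset=[first, second])

both_4_5 = rows_with_both(cleaned, 'col4', 'col5')
both_3_5 = rows_with_both(cleaned, 'col3', 'col5')

print(both_4_5)
print(both_3_5)
```

Working on a cleaned copy instead of using inplace=True leaves the original frame untouched, which may or may not matter for your use case.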
Upvotes: 2
Reputation: 521
# Can you try the code below?
df1 = df.loc[df['col4'].notnull() & df['col5'].notnull(), :]
print(df1)
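For contrast with the != attempt in the question, here is a small sketch of why an inequality test and a presence test select different rows. The two-column frame is hypothetical and assumes the blanks have already been converted to NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame; NaN marks a missing cell.
df = pd.DataFrame({
    'col4': [np.nan, 'val', np.nan, 'val'],
    'col5': ['val', np.nan, 'val', 'val'],
})

# The question's attempt: True wherever the two values differ.
# NaN never compares equal to anything, so rows with a gap are kept
# and the fully populated row is dropped.
differs = df['col4'] != df['col5']

# Presence test: True only where every listed column holds a value.
present = df[['col4', 'col5']].notna().all(axis=1)

print(df[differs].index.tolist())
print(df[present].index.tolist())
```

The notna().all(axis=1) form also extends to more than two columns by lengthening the column list.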
Upvotes: 0