NewPy

Reputation: 663

Dataframe value comparison

I have a DataFrame like this:

[image: input DataFrame]

I want to compare a with c and b with d. A NaN or empty value should be treated as 0.

[image: expected output]

I tried to use a list comprehension but got this error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

df['bVsd']=["True" if df['b']==df['d'] else "False"] 

Upvotes: 0

Views: 42

Answers (1)

Toukenize

Reputation: 1420

ADDED ANSWER FOR YOUR 2ND QUESTION

To achieve what you want to do, simply compare the columns directly:

import pandas as pd
import numpy as np

df = pd.DataFrame({'a':[1,3,5,7,9],
                   'b':[0,0,0,0,0],
                   'c':[1,3,5,7,9],
                   'd':[0,np.nan,np.nan,0,np.nan]})

# Fill the nan and empty cells with 0
df = df.fillna(0)

# To do the comparison you desire
df['aVsc'] = (df['a'] == df['c'])
df['bVsd'] = (df['b'] == df['d'])

You get the error because df['b'] == df['d'] returns a boolean Series:

    0
0   True
1   True
2   True
3   True
4   True

Evaluating a whole Series as a single boolean is ambiguous unless you reduce it with any() or all(), and neither of those gives the row-by-row result you want.
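A minimal sketch of that ambiguity, using two small made-up Series (s_b and s_d are illustrative names and values, not the data from the question):

```python
import pandas as pd

# Hypothetical sample data, chosen so that one row differs
s_b = pd.Series([0, 0, 0])
s_d = pd.Series([0, 1, 0])

# Element-wise comparison: one boolean per row
mask = s_b == s_d

# Using the Series itself as a condition raises ValueError
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True

# Reductions collapse the Series to a single boolean
all_equal = bool(mask.all())  # True only if every row matches
any_equal = bool(mask.any())  # True if at least one row matches
```

Both reductions throw away the per-row information, which is why assigning the comparison result directly to a new column is the better fit here.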

On a separate note, that is not a valid list comprehension. A list comprehension needs an iterable to loop over, something like this: [True if i == 'something' else False for i in iterator].
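If you do want a list comprehension, one possible form pairs up the two columns with zip and compares them row by row. This is only a sketch using the same sample b and d columns as above; the direct column comparison is simpler and usually faster:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'b': [0, 0, 0, 0, 0],
                   'd': [0, np.nan, np.nan, 0, np.nan]})

# Treat NaN as 0 before comparing
df = df.fillna(0)

# zip yields one (b, d) pair per row; the comparison gives one boolean per row
df['bVsd'] = [x == y for x, y in zip(df['b'], df['d'])]
```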

2nd Question

If you want df['aVsc'] to be 0 when df['a'] == df['c'], and equal to df['a'] otherwise, you can use np.where:

df['aVsc'] = np.where(df['a'] == df['c'], 0, df['a'])

Here np.where checks the condition df['a'] == df['c'] element by element: where it is True it assigns 0, otherwise it assigns the corresponding value of df['a'].
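A small worked example, with made-up values in which the second row differs so that both branches of np.where are exercised:

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows 0 and 2 match, row 1 does not
df = pd.DataFrame({'a': [1, 3, 5],
                   'c': [1, 4, 5]})

# 0 where a equals c, otherwise keep the value from a
df['aVsc'] = np.where(df['a'] == df['c'], 0, df['a'])
```

With this data, aVsc should come out as 0, 3, 0.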

Upvotes: 3
