Python Pandas: Comparison of elements in Dataframe/series

Question

I have a DataFrame in a variable called "myDataFrame" that looks like this:

+------+-------+--------+
| Type | Count | Status |
+------+-------+--------+
| a    |    70 |      0 |
| a    |    70 |      0 |
| b    |    70 |      0 |
| c    |    74 |      3 |
| c    |    74 |      2 |
| c    |    74 |      0 |
+------+-------+--------+

I am using a vectorized approach to process the rows in this DataFrame, since it has about 116 million rows.

So I wrote something like this:

myDataFrame['result'] = processDataFrame(myDataFrame['Status'], myDataFrame['Count'])

In my function, I am trying to do this:

def processDataFrame(status, count):
    resultsList = list()
    if status == 0:
       resultsList.append(count + 10000)
    else:
       resultsList.append(count - 10000)

    return resultsList

But I get this error when comparing the status values:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What am I missing?

BENY · Accepted Answer

We can do this without a user-defined function, using np.where:

import numpy as np

# Count + 10000 where Status is 0, otherwise Count - 10000
myDataFrame['result'] = np.where(myDataFrame['Status'] == 0,
                                 myDataFrame['Count'] + 10000,
                                 myDataFrame['Count'] - 10000)
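
For a self-contained check, here is a minimal sketch that rebuilds the sample data from the question and applies the same np.where call. The column names come from the table above; the frame is constructed here only for illustration and is an assumption about how the real data is laid out.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question (illustrative only).
myDataFrame = pd.DataFrame({
    "Type": ["a", "a", "b", "c", "c", "c"],
    "Count": [70, 70, 70, 74, 74, 74],
    "Status": [0, 0, 0, 3, 2, 0],
})

# Element-wise choice: Count + 10000 where Status is 0, otherwise Count - 10000.
myDataFrame["result"] = np.where(
    myDataFrame["Status"] == 0,
    myDataFrame["Count"] + 10000,
    myDataFrame["Count"] - 10000,
)

print(myDataFrame["result"].tolist())
# [10070, 10070, 10070, -9926, -9926, 10074]
```

The whole column is computed in one pass, with no Python-level loop over rows, which is what matters at around 116 million rows.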

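Update

If you want to keep your function as written, apply it row by row instead, so that each call receives scalar values. Expect this to be much slower than np.where on 116 million rows:

myDataFrame.apply(lambda x: processDataFrame(x['Status'], x['Count']), axis=1)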
0    [10070]
1    [10070]
2    [10070]
3    [-9926]
4    [-9926]
5    [10074]
dtype: object
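
As for why the original call fails: comparing a whole Series with == 0 gives a boolean Series (one value per row), and `if` needs a single True or False, so pandas raises the error. The sketch below reproduces that and shows one possible vectorized rewrite of processDataFrame that keeps its signature. The rewrite and the sample Series are assumptions for illustration, not code from the question.

```python
import numpy as np
import pandas as pd

# Sample columns matching the table in the question (illustrative only).
status = pd.Series([0, 0, 0, 3, 2, 0])
count = pd.Series([70, 70, 70, 74, 74, 74])

mask = status == 0  # boolean Series, one True/False per row

# `if` asks the Series for a single truth value, which pandas refuses to guess.
try:
    if mask:
        pass
    raised = False
except ValueError as exc:
    raised = True
    print(exc)


def processDataFrame(status, count):
    # Hypothetical vectorized rewrite: returns a Series aligned with `count`
    # instead of appending to a list inside an if/else.
    values = np.where(status == 0, count + 10000, count - 10000)
    return pd.Series(values, index=count.index)


result = processDataFrame(status, count)
print(result.tolist())
# [10070, 10070, 10070, -9926, -9926, 10074]
```

With this version the original call, passing the two columns directly, works unchanged and assigns plain integers rather than one-element lists.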
