Ryan Ball

Reputation: 29

Looping over a dataframe and referencing a series

I'm trying to iterate over a data frame in Python, and in my if statement I compare a couple of columns, each of which is a Series. When I run my code I get the following error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Data:
Taken from solution provided by @CypherX.

import pandas as pd

template = ['some', 'abra', 'cadabra', 'juju', 'detail page', 'lulu', 'boo', 'honolulu', 'detail page']
prev = ['home', 'abra', 'cacobra', 'juju', 'detail page', 'lulu', 'booboo', 'picabo', 'detail here']
df = pd.DataFrame({'Template': template, 'Prev': prev})

      Template         Prev
0         some         home
1         abra         abra
2      cadabra      cacobra
3         juju         juju
4  detail page  detail page
5         lulu         lulu
6          boo       booboo
7     honolulu       picabo
8  detail page  detail here

My code is the following:

for row in s:
    if (s['Template']=='detail page') and (s['Template']==s['Prev']):
        s['Swipe']=1
    else:
        s['Swipe']=0

where s is my dataframe.

What can I do to fix this? Any ideas?

Upvotes: 0

Views: 70

Answers (4)

Alex

Reputation: 1126

I think it would be something like this:

s['Swipe'] = (s['Template'] == 'detail page') & (s['Template'] == s['Prev'])

You can then convert the result from boolean to int if you need to.
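As a rough sketch of that conversion (the three-row frame below is a shortened stand-in for the question's data, and the name mask is only illustrative):

```python
import pandas as pd

# Shortened stand-in for the question's data (assumption: same column names)
s = pd.DataFrame({
    'Template': ['some', 'detail page', 'detail page'],
    'Prev': ['home', 'detail page', 'detail here'],
})

# Element-wise comparisons; & combines the two boolean Series row by row
mask = (s['Template'] == 'detail page') & (s['Template'] == s['Prev'])

# Cast True/False to 1/0
s['Swipe'] = mask.astype(int)
print(s['Swipe'].tolist())
```

Only the middle row satisfies both conditions, so it is the only one that gets a 1.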

Upvotes: 0

CypherX

Reputation: 7353

Since you did not provide any reproducible data, I made my own. Here is the solution.

Short Solution

condition = ((df.Template==df.Prev) & (df.Template=='detail page'))
df['Swipe'] = condition.astype(int)

Solution in Detail

Evaluate the condition to a boolean Series. Since you want 1 for True and 0 for False, converting that Series from boolean to int does the job.

import pandas as pd

# Prepare Dummy Data
template = ['some', 'abra', 'cadabra', 'juju', 'detail page', 'lulu', 'boo', 'honolulu', 'detail page']
prev = ['home', 'abra', 'cacobra', 'juju', 'detail page', 'lulu', 'booboo', 'picabo', 'detail here']
df = pd.DataFrame({'Template': template, 'Prev': prev})

# Evaluate Condition
condition = ((df.Template==df.Prev) & (df.Template=='detail page'))
df['Swipe'] = condition.astype(int)

print(df)

Output:

      Template         Prev  Swipe
0         some         home      0
1         abra         abra      0
2      cadabra      cacobra      0
3         juju         juju      0
4  detail page  detail page      1
5         lulu         lulu      0
6          boo       booboo      0
7     honolulu       picabo      0
8  detail page  detail here      0

What was the problem in your solution?

  1. Your code iterates over the dataframe s (note: by convention, s is usually used for a series and df for a dataframe). Iterating over a dataframe yields its column names, so row is never an actual row of the dataframe.
  2. Even if you had the row, you never use it anywhere inside the for loop:

for row in s:
    if (s['Template']=='detail page') and (s['Template']==s['Prev']):
        s['Swipe']=1
    else:
        s['Swipe']=0

To make the point, here is what iterating over the dataframe df prints:

for row in df:
    print(row)

Output:

Template
Prev
Swipe
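For completeness, here is a hedged sketch of where the error itself comes from and what an explicit row-wise loop could look like. It assumes a shortened version of the dummy data. Using itertuples is one option among several and not something the original code used. The vectorized solution above is still the simpler route.

```python
import pandas as pd

# Shortened stand-in for the dummy data (assumption: same column names)
df = pd.DataFrame({
    'Template': ['some', 'detail page', 'detail page'],
    'Prev': ['home', 'detail page', 'detail here'],
})

# Comparing whole columns yields a boolean Series. `and` needs a single
# True/False, so Python asks the Series for its truth value and pandas raises.
try:
    if (df['Template'] == 'detail page') and (df['Template'] == df['Prev']):
        pass
except ValueError as err:
    print(type(err).__name__)

# Explicit row-wise loop: itertuples yields one namedtuple per row,
# so each comparison is between plain strings and `and` works.
swipe = []
for row in df.itertuples(index=False):
    swipe.append(int(row.Template == 'detail page' and row.Template == row.Prev))
df['Swipe'] = swipe
```

Inside the loop each comparison involves two scalars, which is why the plain `and` is fine there but not on whole columns.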

Upvotes: 0

san

Reputation: 1515

Two quick ways I can think of:

  1. Without using numpy (set a default of 0, then overwrite the matching rows with a single .loc call):
    s['Swipe'] = 0
    s.loc[(s['Template']=='detail page') & (s['Template']==s['Prev']), 'Swipe'] = 1
  2. Using numpy (as another answer has already suggested):
    import numpy as np
    s['Swipe'] = np.where((s['Template'] == 'detail page') & (s['Template'] == s['Prev']), 1, 0)
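A small sketch to check that the two approaches agree, assuming a shortened version of the question's data. The column names Swipe_loc and Swipe_np are made up here so both results can sit side by side.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the question's data (assumption: same column names)
s = pd.DataFrame({
    'Template': ['some', 'detail page', 'detail page'],
    'Prev': ['home', 'detail page', 'detail here'],
})
mask = (s['Template'] == 'detail page') & (s['Template'] == s['Prev'])

# Approach 1: default of 0, then set matching rows through one .loc call
s['Swipe_loc'] = 0
s.loc[mask, 'Swipe_loc'] = 1

# Approach 2: np.where picks 1 where mask is True, otherwise 0
s['Swipe_np'] = np.where(mask, 1, 0)

print((s['Swipe_loc'] == s['Swipe_np']).all())
```

Selecting rows and column in one .loc call avoids assigning through an intermediate s['Swipe'] object, which pandas may treat as a copy.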

Upvotes: 0

kbfreder

Reputation: 51

You could try setting the value of s['Swipe'] using np.where instead:

import numpy as np

s['Swipe'] = np.where((s['Template'] == 'detail page') & (s['Template'] == s['Prev']), 1, 0)

Upvotes: 2
