Reputation: 23

Python IF IN statement with list

Could I get a little help with what I'm sure is a simple problem? I've looked around here and online and haven't been able to solve it. I'm less than a week into Python and have set myself a small task to learn some of the basics.

import pandas as pd

df1 = pd.read_csv('booking.csv', names=['snum','booked','name'])
df1.drop(['booked', 'name'], axis=1, inplace=True)

df2 = df1.values.tolist()

print('The following tickets are available; %s' % df2)
tic = input('Which ticket would you like to buy? ')

if tic in df2:
    print('Ok')
else:
    print('Ticket Unavailable')

The problem I'm having is with the if statement. No matter what value I enter, I always get the message 'Ticket Unavailable'. I believe the error must lie with the list that has been converted from the DataFrame.

So far I've:

Tested the if statement against a list that hasn't been converted or imported, and that worked as expected
Printed the type of the df2 variable to confirm it is a list
Checked that the df2 values appear in the printed question, so I know they've been imported and converted OK
Copied and pasted the code into a different Python file, with the same result

The values are basic seat numbers: A1, A2, A3, A4, A5, B1, B2, B3, B4, B5. I'm aware an input of 'A', 'B', etc. would also return 'Ok'; practicality isn't as important as functionality.

Upvotes: 2

Answers (4)

Eli Korvigo

Reputation: 10513

It looks like pandas.DataFrame.values.tolist does not produce what you think it does:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame.from_records([dict(a=1), dict(a=2)])

In [3]: df
Out[3]: 
   a
0  1
1  2

In [4]: df.values.tolist()
Out[4]: [[1], [2]]

In your case tic is a string, but df.values.tolist() is a nested list, so the in test compares the string against each inner list and never finds a match. I guess what you want is:

df2 = set(df1['snum'])

I've used a set because hash tables are better suited to lookups.
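The difference can be seen in a minimal sketch. The three seat numbers below are a hypothetical stand-in for the asker's file, not data from the question:

```python
import pandas as pd

# Hypothetical stand-in for the asker's single-column frame.
df1 = pd.DataFrame({"snum": ["A1", "A2", "B1"]})

nested = df1.values.tolist()   # one inner list per row
seats = set(df1["snum"])       # flat set of the column's values

tic = "A1"
in_nested = tic in nested      # compares "A1" with ["A1"], ["A2"], ...
in_seats = tic in seats        # compares "A1" with "A1", "A2", ...

print(nested)     # [['A1'], ['A2'], ['B1']]
print(in_nested)  # False
print(in_seats)   # True
```

The string "A1" is never equal to the list ['A1'], which is why the original test always falls through to the else branch.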

Upvotes: 2

Pierre

Reputation: 1099

It looks like you are not using the right DataFrame method to get the list of tickets. Use DataFrame.to_dict('records') to simplify extraction of the ticket names.

booking.csv

92747,true,Harry Potter
28479,false,Sherlock Holmes

Python code

import pandas as pd

# Load ticket list from CSV file
booking_df = pd.read_csv('booking.csv', names=['snum','booked','name'])

# Convert ticket list to a list of built-in Python dictionaries
ticket_list = booking_df.to_dict('records')

# Extract the ticket names from the list of tickets
ticket_name_set = {ticket["name"] for ticket in ticket_list}

print('The following tickets are available; %s' % ticket_name_set)
wanted_ticket_name = input('Which ticket would you like to buy? ')

if wanted_ticket_name in ticket_name_set:
    print('Ok')
else:
    print('Ticket Unavailable')

Output (a set prints in braces, and its element order may vary between runs):

➜ python tickets.py
The following tickets are available; {'Harry Potter', 'Sherlock Holmes'}
Which ticket would you like to buy? Harry Potter
Ok
➜ python tickets.py
The following tickets are available; {'Harry Potter', 'Sherlock Holmes'}
Which ticket would you like to buy? Hamlet
Ticket Unavailable

Upvotes: 0

wsa1982

Reputation: 1

Change

if tic in df2:

to

if any(tic in s for s in df2):

You have a list of lists, not a pandas DataFrame, so the membership test has to look inside each inner list.

booking.csv

snum    booked  name
a1      no  
a2      no  
a3      no  
a4      no  
a5      no  
a6      no  
a7      no  
a8      no  
a9      no  
a10     no  
a11     no
a12     no
a13     no
b1      no
b2      no
b3      no
b4      no
b5      no
b6      no
b7      no
b8      no
b9      no
b10     no
b11     no
b12     no
b13     no

Your example with modification

import pandas as pd

df1 = pd.read_csv('booking.csv', names=['snum','booked','name'])
df1.drop(['booked', 'name'], axis=1, inplace=True)

df2 = df1.values.tolist()

print('The following tickets are available; %s' % df2)
tic = input('Which ticket would you like to buy? ')

if any(tic in s for s in df2):
    print('Ok')
else:
    print('Ticket Unavailable')

Gives the following output

The following tickets are available; [['snum'], ['a1'], ['a2'], ['a3'], ['a4'], ['a5'], ['a6'], ['a7'], ['a8'], ['a9'], ['a10'], ['a11'], ['a12'], ['a13'], ['b1'], ['b2'], ['b3'], ['b4'], ['b5'], ['b6'], ['b7'], ['b8'], ['b9'], ['b10'], ['b11'], ['b12'], ['b13']]

Which ticket would you like to buy? a1
Ok

The following tickets are available; [['snum'], ['a1'], ['a2'], ['a3'], ['a4'], ['a5'], ['a6'], ['a7'], ['a8'], ['a9'], ['a10'], ['a11'], ['a12'], ['a13'], ['b1'], ['b2'], ['b3'], ['b4'], ['b5'], ['b6'], ['b7'], ['b8'], ['b9'], ['b10'], ['b11'], ['b12'], ['b13']]

Which ticket would you like to buy? c2
Ticket Unavailable

Upvotes: 0

jpp

Reputation: 164773

pd.DataFrame.values.tolist gives a nested list.

pd.Series.values.tolist gives a non-nested list, assuming your series elements are not themselves lists.

To understand what's happening here, you need to appreciate that pandas uses NumPy arrays internally. The values attribute of pd.DataFrame and pd.Series objects extracts the corresponding NumPy array. For a dataframe, this will always be 2-dimensional, even if the dataframe has a single column.

The allowance for nested lists is clear in the NumPy docs:

ndarray.tolist()

Return the array as a (possibly nested) list.
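As a quick check of the dimensionality point, here is a small sketch using a hypothetical one-column frame (the seat numbers are made up for illustration):

```python
import pandas as pd

# Hypothetical one-column frame standing in for df1.
df = pd.DataFrame({"snum": ["A1", "A2", "B1"]})

frame_values = df.values            # 2-D: rows x columns
series_values = df["snum"].values   # 1-D: one entry per row

print(frame_values.ndim, frame_values.shape)    # 2 (3, 1)
print(series_values.ndim, series_values.shape)  # 1 (3,)

print(frame_values.tolist())   # [['A1'], ['A2'], ['B1']]
print(series_values.tolist())  # ['A1', 'A2', 'B1']
```

The extra dimension on the dataframe side is what turns into the inner lists after tolist().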

You have a couple of options:

pd.Series.values

For an isolated membership check in a series, you can use pd.Series.values:

vals = df1['snum'].values

if tic in vals:
    # do something

set

This uses O(1) lookup and is recommended if you will be repeatedly checking for membership in a single series:

snum_set = set(df1['snum'])

if tic in snum_set:
    # do something
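To try the set approach without a file on disk, the sketch below reads a hypothetical in-memory CSV through io.StringIO. The sample rows, the is_listed helper, and the dtype=str argument are assumptions added for illustration; dtype=str is there because input() always returns a string, so purely numeric seat numbers would otherwise be parsed as integers and never match.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for booking.csv (no header row).
csv_text = "A1,no,\nA2,no,\nB1,yes,Harry Potter\n"

# dtype=str keeps every column as text so it compares cleanly with input().
df1 = pd.read_csv(
    io.StringIO(csv_text),
    names=["snum", "booked", "name"],
    dtype=str,
)

snum_set = set(df1["snum"])


def is_listed(tic, seats=snum_set):
    """Return True if tic exactly matches a seat number in the set."""
    return tic in seats


for tic in ["A1", "A", "C2"]:
    print(tic, "Ok" if is_listed(tic) else "Ticket Unavailable")
```

Note that membership in a set (or a flat list) is an exact match, so a partial input such as 'A' is reported as unavailable.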

Upvotes: 0
