Reputation: 23
Could I get a little help with what I'm sure is a simple problem? I've looked around here and online and haven't been able to solve it. I'm less than a week into this and have set myself a little task to learn some of the basics.
import pandas as pd
df1 = pd.read_csv('booking.csv', names=['snum','booked','name'])
df1.drop(['booked', 'name'], axis=1, inplace=True)
df2 = df1.values.tolist()
print('The following tickets are available; %s' % df2)
tic = input('Which ticket would you like to buy? ')
if tic in df2:
    print('Ok')
else:
    print('Ticket Unavilable')
The problem I'm having is with the if statement. No matter what value I enter, I always get the message 'Ticket Unavailable'. I believe the error must lie with the list that has been converted from the DataFrame.
So far I've;
The values are basic seat numbers: A1, A2, A3, A4, A5, B1, B2, B3, B4, B5. I'm aware an input of 'A', 'B', etc. would also return 'Ok'; the practicality isn't as important as the functionality.
Upvotes: 2
Views: 679
Reputation: 10513
It looks like pandas.DataFrame.values.tolist does not produce what you think it does:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame.from_records([dict(a=1), dict(a=2)])
In [3]: df
Out[3]:
   a
0  1
1  2
In [4]: df.values.tolist()
Out[4]: [[1], [2]]
In your case tic is a string, but df.values.tolist() is a nested list, so the membership test compares the string against one-element lists and never matches. I guess what you want is:
df2 = set(df1['snum'])
I've used a set because hash tables are better suited to lookups.
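A minimal sketch of the difference, assuming a small in-memory frame in place of booking.csv (the seat values are made up for illustration):

```python
import pandas as pd

# Hypothetical stand-in for booking.csv; only the 'snum' column matters here.
df1 = pd.DataFrame({'snum': ['A1', 'A2', 'B1']})

nested = df1.values.tolist()   # one inner list per row
seats = set(df1['snum'])       # flat set of seat numbers

tic = 'A1'
print(nested)          # [['A1'], ['A2'], ['B1']]
print(tic in nested)   # False: 'A1' is compared with ['A1'], ['A2'], ...
print(tic in seats)    # True
```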
Upvotes: 2
Reputation: 1099
It looks like you are not using the right DataFrame method to get the list of tickets. Use DataFrame.to_dict('records') to simplify extraction of the ticket names.
booking.csv
92747,true,Harry Potter
28479,false,Sherlock Holmes
Python code
import pandas as pd
# Load ticket list from CSV file
booking_df = pd.read_csv('booking.csv', names=['snum','booked','name'])
# Convert ticket list to a list of built-in Python dictionaries
ticket_list = booking_df.to_dict('records')
# Extract the ticket names from the list of tickets
ticket_name_set = {ticket["name"] for ticket in ticket_list}
print('The following tickets are available; %s' % ticket_name_set)
wanted_ticket_name = input('Which ticket would you like to buy? ')
if wanted_ticket_name in ticket_name_set:
    print('Ok')
else:
    print('Ticket Unavilable')
Output:
➜ python tickets.py
The following tickets are available; {'Harry Potter', 'Sherlock Holmes'}
Which ticket would you like to buy? Harry Potter
Ok
➜ python tickets.py
The following tickets are available; {'Harry Potter', 'Sherlock Holmes'}
Which ticket would you like to buy? Hamlet
Ticket Unavilable
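For reference, here is a rough sketch of what to_dict('records') returns, using an in-memory frame that mirrors the two CSV rows above instead of reading the file (snum is kept as a string here purely for illustration):

```python
import pandas as pd

# Assumed stand-in for the two rows of booking.csv shown above.
booking_df = pd.DataFrame({
    'snum': ['92747', '28479'],
    'booked': [True, False],
    'name': ['Harry Potter', 'Sherlock Holmes'],
})

# One dictionary per row, keyed by column name.
ticket_list = booking_df.to_dict('records')
print(ticket_list[0]['name'])   # Harry Potter

# Collect a single column into a set for membership checks.
ticket_name_set = {ticket['name'] for ticket in ticket_list}
print(sorted(ticket_name_set))  # ['Harry Potter', 'Sherlock Holmes']
```

To check seat numbers rather than names, the same comprehension could read ticket['snum'] instead.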
Upvotes: 0
Reputation: 1
Change
if tic in df2:
to
if any(tic in s for s in df2):
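df2 is a nested list (one single-element list per row), so tic in df2 compares the string against each inner list and never matches. You need to look inside each inner list instead.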
booking.csv
snum booked name
a1 no
a2 no
a3 no
a4 no
a5 no
a6 no
a7 no
a8 no
a9 no
a10 no
a11 no
a12 no
a13 no
b1 no
b2 no
b3 no
b4 no
b5 no
b6 no
b7 no
b8 no
b9 no
b10 no
b11 no
b12 no
b13 no
Your example with modification
import pandas as pd
df1 = pd.read_csv('booking.csv', names=['snum','booked','name'])
df1.drop(['booked', 'name'], axis=1, inplace=True)
df2 = df1.values.tolist()
print('The following tickets are available; %s' % df2)
tic = input('Which ticket would you like to buy? ')
if any(tic in s for s in df2):
    print('Ok')
else:
    print('Ticket Unavilable')
Gives the following output
The following tickets are available; [['snum'], ['a1'], ['a2'], ['a3'], ['a4'], ['a5'], ['a6'], ['a7'], ['a8'], ['a9'], ['a10'], ['a11'], ['a12'], ['a13'], ['b1'], ['b2'], ['b3'], ['b4'], ['b5'], ['b6'], ['b7'], ['b8'], ['b9'], ['b10'], ['b11'], ['b12'], ['b13']]
Which ticket would you like to buy? a1
Ok
or
The following tickets are available; [['snum'], ['a1'], ['a2'], ['a3'], ['a4'], ['a5'], ['a6'], ['a7'], ['a8'], ['a9'], ['a10'], ['a11'], ['a12'], ['a13'], ['b1'], ['b2'], ['b3'], ['b4'], ['b5'], ['b6'], ['b7'], ['b8'], ['b9'], ['b10'], ['b11'], ['b12'], ['b13']]
Which ticket would you like to buy? c2
Ticket Unavilable
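To see why the original test fails and the any(...) version passes without needing the CSV file, here is a small sketch with a hand-written nested list standing in for df2 (the values and the is_available helper are illustrative, not part of the original code). Since each s is a one-element list, tic in s is an exact match against the seat number, not a substring match.

```python
# Hand-written stand-in for df1.values.tolist(); one inner list per CSV row.
df2 = [['snum'], ['a1'], ['a2'], ['b1']]

def is_available(tic, rows):
    """Return True if tic equals an element of any inner row list."""
    return any(tic in row for row in rows)

print('a1' in df2)               # False: a string never equals a list
print(is_available('a1', df2))   # True
print(is_available('c2', df2))   # False
print(is_available('a', df2))    # False: list membership, not substring

# Alternative: flatten once, then use a plain membership test.
flat = [seat for row in df2 for seat in row]
print('a1' in flat)              # True
```

Because the header row was read as data in the output above, 'snum' itself also counts as an available ticket.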
Upvotes: 0
Reputation: 164773
pd.DataFrame.values.tolist gives a nested list.
pd.Series.values.tolist gives a non-nested list, assuming your series elements are not themselves lists.
To understand what's happening here, you need to appreciate that NumPy arrays are used internally by Pandas. The values attribute of pd.DataFrame and pd.Series objects extracts the corresponding NumPy array. For a dataframe, this will always be 2-dimensional, even if your dataframe has a single column.
The allowance for nested lists is clear in the NumPy docs:
ndarray.tolist()
Return the array as a (possibly nested) list.
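A short sketch of the dimensionality difference, assuming a one-column frame built in memory (the seat values are illustrative):

```python
import pandas as pd

# Hypothetical single-column frame standing in for the seat data.
df = pd.DataFrame({'snum': ['A1', 'A2', 'B1']})

frame_values = df.values            # 2-dimensional: one row per record
series_values = df['snum'].values   # 1-dimensional

print(frame_values.ndim, frame_values.shape)    # 2 (3, 1)
print(series_values.ndim, series_values.shape)  # 1 (3,)
print(frame_values.tolist())    # [['A1'], ['A2'], ['B1']]
print(series_values.tolist())   # ['A1', 'A2', 'B1']
```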
You have a couple of options:
For an isolated membership check in a series, you can use pd.Series.values:
vals = df1['snum'].values
if tic in vals:
    pass  # do something
Converting the series to a set gives O(1) lookup and is recommended if you will be repeatedly checking for membership in a single series:
snum_set = set(df1['snum'])
if tic in snum_set:
    pass  # do something
Upvotes: 0