Reputation: 706
I have data grabbed to Python that is in table form:
Name Sport Score
John Golf 100
Jill Rugby 55
John Hockey 100
Bob Golf 45
How can I format this table in Python that would make it easy to sort or group items. For example, if I wanted to see all the names of people that played Golf or all of the people that scored 100 on any sport. Or all of the data for just John.
Upvotes: 0
Views: 1134
Reputation: 11164
What about this one?
yourDS={"name":["John","Jill","John","Bob"],
"sport":["Golf","Rugby","Hockey","Golf"],
"score":[100,55,100,45]
}
This should hold the relation of each entry as list are ordered.
To avoid the effect of duplicate element in a list, first make a new set
from the list.
For your expected query, you can do something like that.
for index,value in enumerate(yourDS["score"]):
if value=="x":
print yourDS["name"][index]
It's better to use a list
to store the result and make it a set
, to avoid some cases for example, if a man has score of x
in two different games.
Upvotes: 1
Reputation: 1296
map
and filter
with namedtuple
s and lambda
s can be used for this task.
from collections import namedtuple
# Create a named tuple to store the rows
Row = namedtuple('Row', ('name', 'sport', 'score'))
data = '''Name Sport Score
John Golf 100
Jill Rugby 55
John Hockey 100
Bob Golf 45'''
# Read the data, skip the first line
lines = data.splitlines()[1:]
rows = []
for line in lines:
name, sport, score = line.strip().split()
rows.append(Row(name, sport, int(score)))
# People that played Golf
golf_filter = lambda row: row.sport == 'Golf'
golf_players = filter(golf_filter, rows)
# People that scored 100 on any sport
score_filter = lambda row: row.score == 100
scorers = filter(score_filter, rows)
# People named John
john_filter = lambda row: row.name == 'John'
john_data = filter(john_filter, rows)
# If you want a specific column than you can map the data
# Names of golf players
get_name = lambda row: row.name
golf_players_names = map(get_name, golf_players)
Results:
>>> golf_players
[Row(name='John', sport='Golf', score=100),
Row(name='Bob', sport='Golf', score=45)]
>>> john_data
[Row(name='John', sport='Golf', score=100),
Row(name='John', sport='Hockey', score=100)]
>>> scorers
[Row(name='John', sport='Golf', score=100),
Row(name='John', sport='Hockey', score=100)]
>>> golf_players_names
['John', 'Bob']
Upvotes: 1
Reputation: 2123
If you want to retrieve information based on your data, I'd go with SQL. It's well-suited to answering questions like these:
...to see all the names of people that played Golf...
...all of the people that scored 100 on any sport...
...all of the data for just John.
The most popular database language these days is SQL, and as it happens Python actually has built-in support for it through the sqlite3
module.
SQL, while not a monumental task to learn, is beyond the scope of this answer. To learn that, I'd recommend checking out Codecademy, Code School, or SQLZOO (they're all interactive).
Or, if you just want to read it in and write it out without caring about what it actually means, consider using the csv
module, which is also built-in.
Upvotes: -1
Reputation: 81684
pandas
' DataFrame
will be the way to go:
import pandas as pd
df = pd.DataFrame({'Name': ['John', 'Jill', 'John', 'Bob'],
'Sport' : ['Golf', 'Rugby', 'Hockey', 'Golf'],
'Score': [100, 50, 100, 45]})
# the names of people that played Golf
df[df['Sport'] == 'Golf']['Name'].unique()
>> ['John' 'Bob']
# all of the people that scored 100 on any sport.
df[df['Score'] == 100]['Name'].unique()
>> ['John']
# all of the data for just John.
df[df['Name'] == 'John']
>> Name Score Sport
0 John 100 Golf
2 John 100 Hockey
Upvotes: 1
Reputation: 72
You can create list of lists. each row will be a list inside a list.
lst1=[['John','Golf',100],['Jill','Rugby',55],['John','Hockey',100],['Bob','Golf',45]]
lst100=[]
for lst in lst1:
if lst[2]==100:
lst100.append(lst)
print lst100
Upvotes: -1