Reputation: 1
I have a file that looks like this:
LastName FirstName Age Gender Height Weight
Smith May 20 F 1500 55
Wilder Harry 25 M 1800 65
Potter Harry 50 M 1600 66
Lincoln Abram 100 M 1800 55
Reynolds Mary 55 F 1600 55
Anderson Jane 40 F 1700 60
Smith William 42 M 1520 60
I want to be able to search the data in memory, for example to find who has a height of 1800 or who has a last name of Smith, without having to read the file again.
I can read the file using:
import csv
filename = r'C:\Users\wsteve46\Documents\Python\People.csv'
reader = csv.DictReader(open(filename))
results = []
resdict = []
for row in reader:
    try:
        print 'Row = ',row
        results.append(row.values())
        resdict.append(row)
    except:
        break
        print 'break ',row
fieldnames = row.keys()
However, resdict is a list, not a dictionary. What is the best way to access this data by key/value?
Upvotes: 0
Views: 967
Reputation: 59604
Another option is to use sqlite3 with an in-memory database:
import sqlite3
con = sqlite3.connect(':memory:')
cur = con.cursor()
cur.execute('INSERT ...')
con.commit()
cur.execute('SELECT ... WHERE ...')
rows = cur.fetchall()
for row in rows:
    print(row)
This gives you a wide range of SQL functions to use without any external dependencies, since sqlite3 is part of the standard library.
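To make the outline above concrete, here is a minimal sketch with the elided statements filled in one possible way. The table name `people`, the column types, and the hard-coded rows (copied from the sample in the question) are assumptions for illustration; in practice you would load the rows from the CSV file.

```python
import sqlite3

# Sample rows copied from the question; in real use, read them from the CSV.
people = [
    ('Smith', 'May', 20, 'F', 1500, 55),
    ('Wilder', 'Harry', 25, 'M', 1800, 65),
    ('Potter', 'Harry', 50, 'M', 1600, 66),
    ('Lincoln', 'Abram', 100, 'M', 1800, 55),
    ('Reynolds', 'Mary', 55, 'F', 1600, 55),
    ('Anderson', 'Jane', 40, 'F', 1700, 60),
    ('Smith', 'William', 42, 'M', 1520, 60),
]

con = sqlite3.connect(':memory:')
cur = con.cursor()

# Hypothetical schema: the table name and column types are assumptions.
cur.execute('CREATE TABLE people (LastName TEXT, FirstName TEXT, Age INTEGER, '
            'Gender TEXT, Height INTEGER, Weight INTEGER)')
cur.executemany('INSERT INTO people VALUES (?, ?, ?, ?, ?, ?)', people)
con.commit()

# Parameterized queries for the two searches mentioned in the question.
cur.execute('SELECT FirstName, LastName FROM people '
            'WHERE Height = ? ORDER BY LastName', (1800,))
tall = cur.fetchall()

cur.execute('SELECT FirstName, LastName FROM people '
            'WHERE LastName = ? ORDER BY FirstName', ('Smith',))
smiths = cur.fetchall()

print(tall)
print(smiths)
```

Once the table is filled, every further search is just another SELECT against memory, so the file never has to be read again.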
Upvotes: 1
Reputation: 50970
While you could put the results of the CSV read into a dictionary, that will not directly address your problem, since a dictionary is keyed on only one element and you mention that you might want to search by different elements.
An alternative is to create one dictionary per type of search you plan on executing. This is effectively the same as creating an in-memory index for each type of search. The trick is that the payload on each dictionary key will need to be a list of people, not a single instance of a person.
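As a rough sketch of that idea, assuming rows shaped like the dictionaries `csv.DictReader` produces (so every value is a string), one index per searchable field might look like this. `build_index` is a hypothetical helper name, and the sample rows are trimmed to three columns for brevity.

```python
from collections import defaultdict

# Rows shaped like csv.DictReader output; values are strings, columns trimmed.
resdict = [
    {'LastName': 'Smith', 'FirstName': 'May', 'Height': '1500'},
    {'LastName': 'Wilder', 'FirstName': 'Harry', 'Height': '1800'},
    {'LastName': 'Potter', 'FirstName': 'Harry', 'Height': '1600'},
    {'LastName': 'Lincoln', 'FirstName': 'Abram', 'Height': '1800'},
    {'LastName': 'Smith', 'FirstName': 'William', 'Height': '1520'},
]

def build_index(records, field):
    """Map each distinct value of `field` to the list of records having it."""
    index = defaultdict(list)
    for record in records:
        index[record[field]].append(record)
    return index

# One dictionary per type of search; each key's payload is a list of people.
by_last_name = build_index(resdict, 'LastName')
by_height = build_index(resdict, 'Height')

smith_first_names = [p['FirstName'] for p in by_last_name['Smith']]
height_1800_last_names = [p['LastName'] for p in by_height['1800']]

print(smith_first_names)
print(height_1800_last_names)
```

Because the indexes are `defaultdict` objects, looking up a missing key with `[]` silently adds an empty list; use `by_height.get('9999', [])` if you would rather leave the index untouched.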
However, if your total number of people is of a "reasonable" size, you can search a list and return a subset of the list using a list comprehension. Here are two, one for each search you mention in your query; it's easy to construct others:
# csv.DictReader returns every value as a string, so compare Height to '1800'
smiths = [p for p in resdict if p['LastName'] == 'Smith']
height_1800s = [p for p in resdict if p['Height'] == '1800']
Upvotes: 0
Reputation: 9946
The easiest way to do this is with pandas:
import pandas as pd
# fn is the path to the CSV file
data = pd.read_csv(fn)
print(data[data.Height == 1800])
print(data[data.LastName == 'Smith'])
You'll have to do more research on your own, but that answers your first question.
Upvotes: 3