WillliamS

Reputation: 1

Python dictionary / database in memory

I have a file that looks like this:

LastName    FirstName   Age Gender  Height  Weight
Smith   May 20  F   1500    55
Wilder  Harry   25  M   1800    65
Potter  Harry   50  M   1600    66
Lincoln Abram   100 M   1800    55
Reynolds    Mary    55  F   1600    55
Anderson    Jane    40  F   1700    60
Smith   William 42  M   1520    60

I want to be able to search the data in memory, for example to find who has a height of 1800 or who has a last name of Smith, without having to read the file again.

I can read the file using:

import csv

filename = r'C:\Users\wsteve46\Documents\Python\People.csv'
reader = csv.DictReader(open(filename))

results = []
resdict = []

for row in reader:
    try:
        print('Row =', row)
        results.append(list(row.values()))
        resdict.append(row)
    except Exception:
        # Report the offending row before leaving the loop
        print('break', row)
        break
fieldnames = reader.fieldnames

However, resdict is a list, not a dictionary. What is the best way to access this data by key/value?

Upvotes: 0

Views: 967

Answers (3)

warvariuc

Reputation: 59604

Another option is to use sqlite3 with an in-memory database:

import sqlite3

con = sqlite3.connect(':memory:')

cur = con.cursor()
cur.execute('INSERT ...')
con.commit()

cur.execute('SELECT ... WHERE ...')
rows = cur.fetchall()
for row in rows:
    print(row)

This gives you the full range of SQL queries with no external dependencies, since sqlite3 ships with the Python standard library.
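To make the elided INSERT and SELECT statements concrete, here is a minimal sketch using the rows from the question. The table name `people`, the column types, and the hard-coded sample list are my assumptions for illustration; in practice the rows would come from the CSV reader.

```python
import sqlite3

# Sample rows copied from the question; normally these would be read from the CSV file.
people = [
    ("Smith", "May", 20, "F", 1500, 55),
    ("Wilder", "Harry", 25, "M", 1800, 65),
    ("Potter", "Harry", 50, "M", 1600, 66),
    ("Lincoln", "Abram", 100, "M", 1800, 55),
    ("Reynolds", "Mary", 55, "F", 1600, 55),
    ("Anderson", "Jane", 40, "F", 1700, 60),
    ("Smith", "William", 42, "M", 1520, 60),
]

con = sqlite3.connect(":memory:")

# Table name and column types are assumptions, chosen to match the file header.
con.execute(
    "CREATE TABLE people ("
    "LastName TEXT, FirstName TEXT, Age INTEGER, "
    "Gender TEXT, Height INTEGER, Weight INTEGER)"
)

# Parameterized inserts: one tuple per row.
con.executemany("INSERT INTO people VALUES (?, ?, ?, ?, ?, ?)", people)
con.commit()

# Who has a height of 1800?
tall = con.execute(
    "SELECT FirstName, LastName FROM people WHERE Height = ? ORDER BY LastName",
    (1800,),
).fetchall()

# Who has a last name of Smith?
smiths = con.execute(
    "SELECT FirstName FROM people WHERE LastName = ? ORDER BY FirstName",
    ("Smith",),
).fetchall()

print(tall)
print(smiths)
```

Using `?` placeholders rather than string formatting keeps the queries safe if the search values ever come from user input.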

Upvotes: 1

Larry Lustig

Reputation: 50970

While you could put the results of the CSV read into a dictionary, that will not directly address your problem, since a dictionary is keyed on only one element and you mention that you might want to search by different elements.

An alternative is to create one dictionary per type of search you plan on executing. This is effectively the same as creating an in-memory index for each type of search. The trick is that the payload on each dictionary key will need to be a list of people, not a single instance of a person.
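A rough sketch of that one-dictionary-per-search idea, assuming the rows look like what csv.DictReader produces (dicts with string values). The helper name `build_index` and the trimmed sample rows are made up for illustration:

```python
from collections import defaultdict

# A few rows from the question, shaped like csv.DictReader output (all values are strings).
resdict = [
    {"LastName": "Smith", "FirstName": "May", "Height": "1500"},
    {"LastName": "Wilder", "FirstName": "Harry", "Height": "1800"},
    {"LastName": "Lincoln", "FirstName": "Abram", "Height": "1800"},
    {"LastName": "Smith", "FirstName": "William", "Height": "1520"},
]


def build_index(rows, field):
    """Map each distinct value of `field` to the list of rows that have it."""
    index = defaultdict(list)
    for row in rows:
        index[row[field]].append(row)
    return index


# One index per kind of search.
by_last_name = build_index(resdict, "LastName")
by_height = build_index(resdict, "Height")

smith_first_names = [p["FirstName"] for p in by_last_name["Smith"]]
tall_last_names = [p["LastName"] for p in by_height["1800"]]

print(smith_first_names)
print(tall_last_names)
```

Each index holds references to the same row dicts, so the extra memory is mostly the keys and lists. Use `by_height.get("1800", [])` for lookups if you don't want a missing key to add an empty entry to the defaultdict.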

However, if your total number of people is of a "reasonable" size, you can search the list and return a subset of it using a list comprehension. Here are two, one for each search you mention in your question; it's easy to construct others:

# Each row from csv.DictReader is a dict, and every value is a string
smiths = [p for p in resdict if p['LastName'] == 'Smith']

height_1800s = [p for p in resdict if p['Height'] == '1800']

Upvotes: 0

acushner

Reputation: 9946

The easiest way to do this is with pandas:

import pandas as pd

data = pd.read_csv(filename)  # filename: path to the CSV file, as in the question
print(data[data.Height == 1800])
print(data[data.LastName == 'Smith'])

You'll have to do more research on your own, but that answers your first question.

Upvotes: 3
