Megool

Reputation: 999

DictReader: get a value when the column and row are known

I am given a CSV file that looks something like this:

ID, name, age, city
1, Andy, 25, Ann Arbor
2, Bella, 40, Los Angeles
3, Cathy, 13, Eureka
...
...

If I want to get the city of ID=3, which would be Eureka in this example, is there a way to do this efficiently instead of iterating over each row? My PHP code will execute this Python script every time it needs a value, and it feels very inefficient to loop through the whole CSV file on every call.

Upvotes: 0

Views: 5273

Answers (3)

Burhan Khalid

Reputation: 174672

If I want to get the city of ID=3, which would be Eureka in this example, is there a way to do this efficiently instead of iterating over each row? My PHP code will execute this Python script every time it needs a value, and it feels very inefficient to loop through the whole CSV file on every call.

Your ideal solution is to wrap this Python code in an API that you can call from your PHP code.

On startup, the Python code would load the file into a data structure, and then wait for your request.

If the file is very big, your Python script would load it into a database and read from there.

You can then choose to return either a string or a JSON object.

Here is a sample, using Flask:

import csv

from flask import Flask, abort, request

# Load the CSV once at startup; skipinitialspace strips the spaces
# that follow the commas in the sample data.
with open('somefile.txt') as f:
    reader = csv.DictReader(f, skipinitialspace=True)
    rows = list(reader)
    keys = rows[0].keys()

app = Flask(__name__)

@app.route('/<id>')
def get_item(id):
    key = request.args.get('key')
    if key not in keys:
        abort(400)  # not a valid column name
    try:
        result = next(i for i in rows if i['ID'] == id)
    except StopIteration:
        # the ID passed doesn't exist
        abort(400)
    return result[key]

if __name__ == '__main__':
    app.run()

You would call it like this:

http://localhost:5000/3?key=city

Upvotes: 0

Michael Blakeley

Reputation: 346

In a word: no.

As yurib mentioned, one method is to convert your files to JSON and go from there, or just to dump to a dict. This gives you the ability to do things like pickle if you need to serialize your dataset, or shelve if you want to stash it someplace for later use.

Another option is to dump your CSV into a queryable database using something like Python's built-in sqlite3 module. It depends on where you want the overhead to lie: pre-processing your data this way saves you from having to parse a large file every time your script runs.
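A minimal sketch of that approach, assuming the CSV layout from the question (the table name is illustrative, and an in-memory database is used here just for demonstration):

```python
import csv
import sqlite3

# Sample data matching the question (normally input.csv already exists).
with open('input.csv', 'w') as f:
    f.write('ID, name, age, city\n'
            '1, Andy, 25, Ann Arbor\n'
            '2, Bella, 40, Los Angeles\n'
            '3, Cathy, 13, Eureka\n')

# One-time pre-processing: load the CSV into SQLite. Use a file path
# such as 'people.db' instead of ':memory:' to persist between runs.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE people (id TEXT PRIMARY KEY, name TEXT, age TEXT, city TEXT)')
with open('input.csv') as f:
    reader = csv.DictReader(f, skipinitialspace=True)
    conn.executemany('INSERT INTO people VALUES (?, ?, ?, ?)',
                     [(r['ID'], r['name'], r['age'], r['city']) for r in reader])
conn.commit()

# Subsequent lookups are a single indexed query, not a full file scan.
row = conn.execute('SELECT city FROM people WHERE id = ?', ('3',)).fetchone()
print(row[0])  # Eureka
```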

Check out this answer for a quick rundown.

Upvotes: 0

yurib

Reputation: 8147

Iterate over the file once and save the data in a dictionary:

import csv

data = {}
with open('input.csv') as fin:
    reader = csv.DictReader(fin, skipinitialspace=True)
    for record in reader:
        data[record['ID']] = {k: v for k, v in record.items() if k != 'ID'}

Then just access the required key in the dictionary (note the ID is read from the file as a string):

print(data['3']['city'])  # Eureka

In case you want to persist the data in this key:value format, you can save it as a JSON file:

import csv
import json

j = {}
with open('input.csv') as fin:
    reader = csv.DictReader(fin, skipinitialspace=True)
    for record in reader:
        j[record['ID']] = {k: v for k, v in record.items() if k != 'ID'}
with open('output.json', 'w') as fout:
    json.dump(j, fout)

Upvotes: 3
