Reputation: 999
I am given a CSV file that looks something like this:
ID, name, age, city
1, Andy, 25, Ann Arbor
2, Bella, 40, Los Angeles
3, Cathy, 13, Eureka
...
...
I want to get the city of ID=3, which would be Eureka in this example. Is there a way to do this efficiently instead of iterating over each row? My PHP code will execute this Python script each time to get the value, and it feels very inefficient to loop through the whole CSV file on every call.
Upvotes: 0
Views: 5273
Reputation: 174672
I want to get the city of ID=3, which would be Eureka in this example. Is there a way to do this efficiently instead of iterating over each row? My PHP code will execute this Python script each time to get the value, and it feels very inefficient to loop through the whole CSV file on every call.
Your ideal solution is to wrap this Python code into an API that you can call from your PHP code.
On startup, the Python code would load the file into a data structure, and then wait for your request.
If the file is very big, your Python script would load it into a database and read from there.
You can then choose to return either a string or a JSON object.
Here is a sample, using Flask:
import csv
from flask import Flask, request, abort

# Load the CSV once at startup so every request is served from memory.
# skipinitialspace strips the spaces after the commas in the sample data.
with open('somefile.txt') as f:
    reader = csv.DictReader(f, delimiter=',', skipinitialspace=True)
    rows = list(reader)
keys = rows[0].keys()

app = Flask(__name__)

@app.route('/<id>')
def get_item(id):
    key = request.args.get('key')
    if key not in keys:
        abort(400)  # this is an invalid request
    try:
        result = next(i for i in rows if i['ID'] == id)
    except StopIteration:
        # the ID passed doesn't exist
        abort(400)
    return result[key]

if __name__ == '__main__':
    app.run()
You would call it like this:
http://localhost:5000/3?key=city
Upvotes: 0
Reputation: 346
In a word: no.
As yurib mentioned, one method is to convert your file to JSON and go from there, or just to dump it to a dict. This gives you the ability to use pickle if you need to serialize your dataset, or shelve if you want to stash it someplace for later use.
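For instance, a minimal sketch of the pickle route, assuming the question's sample data lives in a hypothetical input.csv (recreated inline here so the snippet is self-contained) and the serialized dict goes to a made-up lookup.pkl:

```python
import csv
import pickle

# Recreate the question's sample data so this sketch runs on its own;
# 'input.csv' and 'lookup.pkl' are hypothetical file names.
with open('input.csv', 'w', newline='') as f:
    f.write("ID, name, age, city\n"
            "1, Andy, 25, Ann Arbor\n"
            "2, Bella, 40, Los Angeles\n"
            "3, Cathy, 13, Eureka\n")

# One-time pre-processing: parse the CSV into a dict keyed by ID.
# skipinitialspace strips the spaces after the commas in the sample data.
with open('input.csv', newline='') as fin:
    reader = csv.DictReader(fin, skipinitialspace=True)
    data = {row['ID']: row for row in reader}

with open('lookup.pkl', 'wb') as fout:
    pickle.dump(data, fout)

# Every later run only unpickles the dict, which is far cheaper than
# re-parsing a large CSV.
with open('lookup.pkl', 'rb') as fin:
    data = pickle.load(fin)

print(data['3']['city'])  # Eureka
```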
Another option is to dump your CSV into a queryable database using Python's built-in sqlite3 support. It depends on where you want your overhead to lie: pre-processing the data this way saves you from parsing a large file every time your script runs.
Check out this answer for a quick rundown.
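A rough sketch of that sqlite3 approach; the file names and the table/column names are made up for illustration, and the question's sample data is recreated inline so the snippet is self-contained:

```python
import csv
import sqlite3

# Recreate the question's sample data; 'input.csv' and 'people.db'
# are hypothetical file names.
with open('input.csv', 'w', newline='') as f:
    f.write("ID, name, age, city\n"
            "1, Andy, 25, Ann Arbor\n"
            "2, Bella, 40, Los Angeles\n"
            "3, Cathy, 13, Eureka\n")

# One-time import: load the CSV into a table with an indexed primary key.
conn = sqlite3.connect('people.db')
conn.execute("DROP TABLE IF EXISTS people")
conn.execute("CREATE TABLE people (ID TEXT PRIMARY KEY, "
             "name TEXT, age TEXT, city TEXT)")
with open('input.csv', newline='') as fin:
    reader = csv.DictReader(fin, skipinitialspace=True)
    conn.executemany(
        "INSERT INTO people VALUES (:ID, :name, :age, :city)", reader)
conn.commit()

# Every later lookup is a single indexed query, not a full file scan.
row = conn.execute(
    "SELECT city FROM people WHERE ID = ?", ('3',)).fetchone()
print(row[0])  # Eureka
conn.close()
```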
Upvotes: 0
Reputation: 8147
Iterate over the file once and save the data into a dictionary:

import csv

data = {}
with open('input.csv') as fin:
    # skipinitialspace strips the spaces after the commas in the sample data
    reader = csv.DictReader(fin, skipinitialspace=True)
    for record in reader:
        data[record['ID']] = {k: v for k, v in record.items() if k != 'ID'}
Then just access the required key in the dictionary (note that DictReader yields strings, so the key is '3', not 3):

print(data['3']['city'])  # Eureka
In case you want to persist the data in the key:value format, you can save it as a JSON file:

import json
import csv

j = {}
with open('input.csv') as fin:
    reader = csv.DictReader(fin, skipinitialspace=True)
    for record in reader:
        j[record['ID']] = {k: v for k, v in record.items() if k != 'ID'}

with open('output.json', 'w') as fout:
    json.dump(j, fout)
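On subsequent runs the script can then load the JSON instead of re-parsing the CSV. A sketch of that read-back step, with the output.json contents recreated inline from the question's sample data so it runs on its own:

```python
import json

# Recreate what the snippet above would write for the sample data;
# 'output.json' is a hypothetical file name.
sample = {
    "1": {"name": "Andy", "age": "25", "city": "Ann Arbor"},
    "2": {"name": "Bella", "age": "40", "city": "Los Angeles"},
    "3": {"name": "Cathy", "age": "13", "city": "Eureka"},
}
with open('output.json', 'w') as fout:
    json.dump(sample, fout)

# Each later run loads the ready-made mapping and does a dict lookup.
with open('output.json') as fin:
    j = json.load(fin)

print(j['3']['city'])  # Eureka
```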
Upvotes: 3