user1825241

Reputation: 906

How to extract certain CSV data based on the header in Python

How would I extract specific data from a CSV file based on its header in Python? For example, say the CSV file contained this information:

Height,Weight,Age
6.0,78,25

How could I retrieve just the age in Python?

Upvotes: 5

Views: 13396

Answers (2)

mfitzp

Reputation: 15545

The process to follow is: read in the first line, find the index (location) on that line of the data you're looking for, then use that index to pull the data out of the remaining lines.

Python's csv module provides csv.reader, which handles all the parsing for you, so this is quite simple.

import csv

filename = 'yourfilenamehere'
column = 'Age'

data = []  # This will contain our data

# Open the file (Python 3 style) and create a csv reader to iterate through it
with open(filename, newline='') as f:
    reader = csv.reader(f)

    hrow = next(reader)       # Get the top (header) row
    idx = hrow.index(column)  # Find the column of the data you're looking for

    for row in reader:  # Iterate the remaining rows
        data.append(row[idx])

print(data)

Note that the values come out as strings. To get numbers, wrap row[idx] in a conversion, e.g. data.append(int(row[idx])).
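
As a rough sketch of the conversion step, the same header-index approach can be wrapped in a function that takes a converter. The name read_column and its convert parameter are made up for illustration, and the sample data from the question is inlined through io.StringIO so the snippet runs on its own under Python 3.

```python
import csv
import io


def read_column(fileobj, column, convert=str):
    """Return every value under `column`, passed through `convert`.

    Hypothetical helper for illustration; not part of the csv module.
    """
    reader = csv.reader(fileobj)
    header = next(reader)       # first row holds the column names
    idx = header.index(column)  # raises ValueError if the column is missing
    return [convert(row[idx]) for row in reader]


# Sample data from the question, inlined so no file is needed
sample = "Height,Weight,Age\n6.0,78,25\n"

ages = read_column(io.StringIO(sample), "Age", convert=int)
heights = read_column(io.StringIO(sample), "Height", convert=float)
print(ages)     # [25]
print(heights)  # [6.0]
```

Passing the converter in keeps the reading logic unchanged whether a column holds ints, floats, or plain strings.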

Upvotes: 2

DSM

Reputation: 353119

I second the csv recommendation, but I think here using csv.DictReader would be simpler:

(Python 2):

>>> import csv
>>> with open("hwa.csv", "rb") as fp:
...     reader = csv.DictReader(fp)
...     data = next(reader)
...     
>>> data
{'Age': '25', 'Weight': '78', 'Height': '6.0'}
>>> data["Age"]
'25'
>>> float(data["Age"])
25.0

Here I've used next just to get the first row, but you could loop over the rows and/or extract a full column of information if you liked.
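
For instance, a loop over the DictReader that collects one column might look like the sketch below. It targets Python 3 (text input rather than "rb"), and the data is inlined with io.StringIO in place of hwa.csv so it can run as-is. The second data row is invented purely to show more than one value.

```python
import csv
import io

# Stand-in for hwa.csv; the second data row is invented for illustration
sample = "Height,Weight,Age\n6.0,78,25\n5.5,64,31\n"

reader = csv.DictReader(io.StringIO(sample))

# Each row is a dict keyed by the header names, so pick the column by name
ages = [float(row["Age"]) for row in reader]
print(ages)  # [25.0, 31.0]
```

With a real file under Python 3, the StringIO object would be replaced by a file opened in text mode, e.g. open("hwa.csv", newline="").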

Upvotes: 6
