D3181

Reputation: 2092

Access column using dictreader

The problem:

I have been having trouble finding the average of a column in a CSV file using Python's csv.DictReader.

I have tried:

Accessing a column by its name, as shown here. This works, but the column name has to be hard-coded, and I'm unsure how to loop over reader.fieldnames so that each column ends up in its own list, rather than all the columns' data being mixed into the same list:

for r in reader:
    print(r.get("Price"))

Example of the loop:

for i in reader.fieldnames:
    for r in reader:
        print(r.get(i))

This runs, but it prints one element from each column for each row. That makes it difficult to assemble a list of, say, all prices or all names, as it would just rebuild the DictReader contents in list form.

Question

How can I read a single entire column from a DictReader, so that I can access each column individually as a list and perform operations on it?

Note: so far I have tried appending each element inside the loop, but that results in an N-sized array with 4 elements in each row.

Upvotes: 1

Views: 9764

Answers (3)

wwii

Reputation: 23773

data.csv:
'''
one, two, three
1,2,3
4,5,6
7,8,9
10,11,12
'''

Use a plain reader object: get the headers, transpose the data, then combine the headers with the data to create a dict.

import csv
with open('data.csv') as f:
    reader = csv.reader(f)
    headers = next(reader)
    # transpose the data
    # --> columns become rows and rows become columns
    data = zip(*reader)
    # create a dictionary by combining the headers with the data
    d = dict(zip(headers, data))

>>> from pprint import pprint
>>> pprint(d)
{' three': ('3', '6', '9', '12'),
 ' two': ('2', '5', '8', '11'),
 'one': ('1', '4', '7', '10')}
>>> 
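
Note that the keys above keep the leading space from the header row, and every value is still a string. As a rough sketch of one way to get from there to an average (the in-memory `sample` is a hypothetical stand-in for data.csv, and passing `skipinitialspace=True` is my addition, not part of the answer above):

```python
import csv
import io
from statistics import mean

# Hypothetical in-memory stand-in for data.csv, so the sketch runs without a file.
sample = io.StringIO("one, two, three\n1,2,3\n4,5,6\n7,8,9\n10,11,12\n")

# skipinitialspace=True drops the blank that follows each comma in the header.
reader = csv.reader(sample, skipinitialspace=True)
headers = next(reader)

# Transpose rows into columns, converting every cell to float on the way.
columns = {
    name: [float(cell) for cell in column]
    for name, column in zip(headers, zip(*reader))
}

# One average per column.
averages = {name: mean(values) for name, values in columns.items()}
print(averages)
```

This assumes every cell in the file is numeric; a column holding names or blanks would need to be skipped or handled before the `float()` call.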

Upvotes: 1

code monkey

Reputation: 2124

You could use the pandas module. It is very powerful and can deal with csv files.

import pandas as pd
# csv_file is the path to your CSV file
df = pd.read_csv(csv_file)
# 'column_name' is the header of the column you want
saved_column = df['column_name']

Upvotes: 3

Tore Eschliman

Reputation: 2507

If you're fine looping over your rows once for each column you want to read, just build a dict comprehension of list comprehensions. A DictReader can only be iterated once, so read the rows into a list first; otherwise every column after the first comes back empty:

rows = list(reader)
columns = {fieldname: [row.get(fieldname) for row in rows] for fieldname in reader.fieldnames}

There's not really a better way to do it, just based on the nature of the file: CSVs are a series of rows, so turning them into columns is going to be a little wasteful. You can tinker with this if you only want certain fieldnames extracted.

If you'd rather build the columns in a single pass without holding all the rows in a list first, though:

columns = {}
for row in reader:
    for fieldname in reader.fieldnames:
        columns.setdefault(fieldname, []).append(row.get(fieldname))
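
To tie this back to the averaging in the question, here is a minimal sketch of the single-pass idea wrapped in a function. The `read_columns` helper, its `wanted` parameter, and the sample Name/Price data are all made up for illustration; only the "Price" column name comes from the question.

```python
import csv
import io
from statistics import mean

def read_columns(f, wanted=None):
    """Read a CSV file object once and return {fieldname: [values, ...]}."""
    reader = csv.DictReader(f)
    # fieldnames is filled in from the header row on first access.
    names = list(wanted) if wanted is not None else list(reader.fieldnames)
    columns = {name: [] for name in names}
    for row in reader:
        for name in names:
            columns[name].append(row[name])
    return columns

# Hypothetical sample data standing in for the real file.
sample = io.StringIO("Name,Price\nApple,1.50\nPear,2.25\nPlum,0.75\n")

columns = read_columns(sample, wanted=["Price"])
# DictReader yields strings, so convert before doing arithmetic.
prices = [float(p) for p in columns["Price"]]
average_price = mean(prices)
print(average_price)
```

With a real file you would pass the object from `open(path, newline='')` instead of the `StringIO`. Leaving `wanted` as `None` collects every column in the header.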

Upvotes: 2
