DBWeinstein

Reputation: 9499

IndexError: list index out of range csv reader python

I have the following CSV called report.csv. It's an Excel file:

email   agent_id    misc
[email protected]  65483843154f35d54   blah1
[email protected] sldd989eu99ufj9ej9e blah 2

I have the following code:

import csv

data_file =  'report.csv'
def import_data(data_file):
    attendee_data = csv.reader(open(data_file, 'rU'), dialect=csv.excel_tab)
    for row in attendee_data:
        email = row[1]
        agent_id = row[2]
        pdf_file_name = agent_id + '_' + email + '.pdf'
        generate_certificate(email, agent_id, pdf_file_name)

I get the following error:

Traceback (most recent call last):
  File "report_test.py", line 56, in <module>
    import_data(data_file)
  File "report_test.py", line 25, in import_data
    email = row[1]
IndexError: list index out of range

I thought the index was the column number within each row. row[1] and row[2] should be within range, no?

Upvotes: 0

Views: 5757

Answers (2)

That1Guy

Reputation: 7233

You say you have an "Excel CSV", which I don't quite understand, so I'll answer assuming you have an actual .csv file.

If I'm loading a .csv into memory (and the file isn't enormous), I'll often have a load_file method on my class that doesn't care about indexes.

Assuming the file has a header row:

import csv

def load_file(filename):

    # Define data in case the file is empty.
    data = []
    with open(filename) as csvfile:
        reader = csv.reader(csvfile)
        # next() with a default avoids StopIteration on an empty file.
        headers = next(reader, None)
        if headers is not None:
            data = [dict(zip(headers, row)) for row in reader]

    return data

This returns a list of dictionaries that you can access by key instead of by index. If a row is short, say misc (index 2) is missing, zip stops at the shorter sequence and that key is simply absent from the dictionary, so use .get on the row, which returns None for a missing key. This is cleaner than a try...except.

for row in data:
    email = row.get('email')
    agent_id = row.get('agent_id')
    misc = row.get('misc')
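
To see why a short row gives a missing key rather than an error, here is a minimal sketch using in-memory data instead of a file. The sample contents and addresses are made up for illustration, and it assumes Python 3 and a comma-delimited file:

```python
import csv
import io

# Hypothetical in-memory sample: the second data row has no 'misc' column.
sample = "email,agent_id,misc\na@example.com,123,blah1\nb@example.com,456\n"

reader = csv.reader(io.StringIO(sample))
headers = next(reader)
data = [dict(zip(headers, row)) for row in reader]

# zip stops at the shorter sequence, so 'misc' is absent from the second dict.
short_row = data[1]
missing = short_row.get('misc')   # None, and no exception is raised
has_key = 'misc' in short_row     # False
```

Indexing with short_row['misc'] would raise a KeyError here, which is why .get is the safer choice.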

This way the order of the file columns doesn't matter, only the headers do. Also, if any of the columns has a blank value, your script won't error out with an IndexError. If you don't want to include blank values, simply handle them by checking:

if not email:
    do.something()
if not agent_id:
    do.something_else()
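
If the goal is just to skip incomplete rows, the checks could be folded into a small helper. This is only a sketch: complete_rows and its required parameter are hypothetical names, and the sample rows are invented in the shape that load_file produces:

```python
def complete_rows(data, required=('email', 'agent_id')):
    """Yield only the rows where every required field is present and non-empty."""
    for row in data:
        # .get returns None for a missing key; '' is also falsy.
        if all(row.get(key) for key in required):
            yield row


# Hypothetical rows in the shape produced by load_file.
rows = [
    {'email': 'a@example.com', 'agent_id': '123', 'misc': 'blah1'},
    {'email': '', 'agent_id': '456'},   # blank email
    {'email': 'c@example.com'},         # agent_id missing entirely
]
kept = list(complete_rows(rows))
```

Only the first row survives, because the other two fail the check on email or agent_id.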

Upvotes: 0

Luke

Reputation: 5708

There is most likely a blank line in your CSV file. Also, list indices start at 0, not 1.

import csv

data_file =  'report.csv'
def import_data(data_file):
    attendee_data = csv.reader(open(data_file, 'rU'), dialect=csv.excel_tab)
    for row in attendee_data:
        try:
            email = row[0]
            agent_id = row[1]
        except IndexError:
            # Blank or short row: nothing to process, so skip it.
            pass
        else:
            pdf_file_name = agent_id + '_' + email + '.pdf'
            generate_certificate(email, agent_id, pdf_file_name)
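
You can check the blank-line diagnosis without touching the real file. The sketch below assumes Python 3 and uses made-up tab-delimited data with an empty line in the middle:

```python
import csv
import io

# Hypothetical tab-delimited sample with a blank line between the two records.
sample = "a@example.com\t123\tblah1\n\nb@example.com\t456\tblah2\n"

rows = list(csv.reader(io.StringIO(sample), dialect=csv.excel_tab))

# The blank line comes back as an empty list, so any index into it raises IndexError.
blank_row = rows[1]

# An alternative to try/except: keep only rows that have at least two fields.
complete = [row for row in rows if len(row) >= 2]
```

Filtering on len(row) does the same job as the try/except above. Which one reads better is mostly a matter of taste.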

Upvotes: 2
