sharataka

Reputation: 5132

How to correctly read csv and input into list?

I am trying to read a bunch of data from a .csv file into a list of lists in the format: [ [a,b,c,d], [e,f,g,h], ...]

When I run the code below and print an entry that contains a space (' '), the way I'm accessing the elements isn't correct, because each value stops at the first space. For example, if Business, Fast Company, Youtube, fastcompany is the 10th entry, then printing it as shown below gives me the pieces on separate lines instead of: Business,Fast Company,YouTube,FastCompany

Any advice on how to get as the result: [ [a,b,c,d], [Business, Fast Company, Youtube, fastcompany], [e,f,g,h], ...]?

import csv

partners = []
partner_dict = {}
i=9
with open('partners.csv', 'rb') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    for row in spamreader:
        partners.append(row)

    print len(partners)

    for entry in partners[i]:
        print entry

Upvotes: 0

Views: 319

Answers (2)

There are a few issues with your code:

  • for entry in partners[i]: loops over the fields of a single row (row i), so each field is printed on its own line. To print whole rows, iterate over the list itself: for entry in partners:
  • The partner_dict variable in your code is never used. I assume you'll use it later, so I'll ignore it for now.
  • You're opening the file in binary mode ("rb"). Python 2's csv module accepts that, but on Python 3 the file has to be opened in text mode: open(file_name, "r", newline="")
  • Your handling of the processed data still happens inside the context manager (the with ... [as ...]: block). Only the reading needs the open file, so the printing can move out of it.
  • Your input text seems to be delimited by ", ", but you tell the reader to split on " "

If I understood your question correctly, your problem is caused by the last point. The obvious fix would be to change the delimiter argument to ", ", but the csv module only accepts a single character as the delimiter. Since "," is the real delimiter (unlike a space, it should never appear inside unquoted data), use that instead.

Now every value after the first starts with a " ", which is probably not what you want. Strings have a strip() method that by default removes all whitespace from the beginning and end of the string. To apply it to every value, use a list comprehension (it evaluates an expression for each item in a list and returns a new list of the results): build [i.strip() for i in row] and append that to partners instead of row.
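To see the effect of each step, here is a small sketch that parses one line from an in-memory string instead of partners.csv. It is written in Python 3 syntax (the question's code is Python 2), and the sample string and variable names are assumptions made for illustration.

```python
import csv
import io

# One line in the format described in the question
# (assumed: a comma followed by a space between values).
sample = "Business, Fast Company, Youtube, fastcompany\n"

# Splitting on spaces breaks "Fast Company" apart and leaves the commas attached.
by_space = next(csv.reader(io.StringIO(sample), delimiter=' ', quotechar='|'))

# Splitting on commas keeps each value whole, but with a leading space.
by_comma = next(csv.reader(io.StringIO(sample), delimiter=',', quotechar='|'))

# strip() removes leading and trailing whitespace from every value.
cleaned = [value.strip() for value in by_comma]

print(by_space)
print(by_comma)
print(cleaned)
```

The first list has five fragments, the second has four values with leading spaces, and the third has the four clean values.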

In the end your code should hopefully look somewhat like this:

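import csv

partners = []

# Read every row, splitting on commas and trimming the space after each comma
with open('partners.csv', 'r') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in spamreader:
        partners.append([i.strip() for i in row])

# The file is closed at this point; the data lives on in partners
print(len(partners))

# Print each row as one list rather than one field per line
for entry in partners:
    print(entry)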
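As an alternative to calling strip() on every value, the csv module has a skipinitialspace option that ignores whitespace directly after each delimiter. The sketch below assumes Python 3 and uses an in-memory stand-in for partners.csv; the sample contents are made up for illustration.

```python
import csv
import io

# Stand-in for partners.csv (contents are assumed for illustration).
sample = "a, b, c, d\nBusiness, Fast Company, Youtube, fastcompany\n"

# skipinitialspace=True makes the reader ignore whitespace
# immediately following each delimiter.
reader = csv.reader(io.StringIO(sample), delimiter=',', skipinitialspace=True)
partners = [row for row in reader]

print(len(partners))
print(partners[1])
```

Note that skipinitialspace only handles whitespace after the delimiter. If values can also have trailing spaces, strip() is still the safer choice.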

Upvotes: 0

Larry Lustig

Reputation: 50970

The delimiter argument specifies which character to use to split each row of the file into separate values. Since you're passing ' ' (a space), the reader is splitting on spaces.

If this is really a comma-separated file, use ',' as the delimiter (or just leave the delimiter argument out and it will default to ',').

Also, the pipe character is an unusual value for the quote character. Is it really true that your input file contains pipes in place of quotes? The sample data you supplied contains neither pipes nor quotes.
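The quote character only matters when a value contains the delimiter itself. The sketch below (Python 3 syntax) uses a hypothetical line in which one value is wrapped in double quotes and contains a comma. That line is not from the question's data, it is only there to show the difference between the default quotechar and '|'.

```python
import csv
import io

# Hypothetical line: the second value contains a comma and is wrapped in double quotes.
sample = 'Business,"Fast Company, Inc.",Youtube\n'

# Defaults: delimiter=',' and quotechar='"', so the quoted comma stays inside the value.
default_row = next(csv.reader(io.StringIO(sample)))

# With quotechar='|' the double quotes are ordinary characters,
# so the value is split at the comma inside it.
pipe_row = next(csv.reader(io.StringIO(sample), quotechar='|'))

print(default_row)
print(pipe_row)
```

With the defaults the line parses into three values. With the pipe as quote character it parses into four, and the double quotes remain in the data.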

Upvotes: 2
