user2225190

Reputation: 549

What is the advantage of using csv.reader over writing my own parser in Python?

If the two code snippets below give the same result, what is the advantage of using csv.reader?

1)

import csv
f = open('a.csv', 'rb')
spamreader = csv.reader(f)
for a in spamreader:
    print a

2)

f = open('a.csv', 'rb')
for a in f:
    print a.split(',')

Result:

['SNO', ' Name', ' Dept']
['1', ' Def', ' Electronics']
['2', 'Abc', 'Computers']

Upvotes: 2

Views: 1387

Answers (3)

S.B

Reputation: 16564

As already mentioned by others, the csv module is useful when you have quoted elements. But I'm going to show another brilliant feature of the csv module.

What if you receive files from different sources and you don't know which delimiter each one uses to separate its fields? You can't predict it, and you don't want to write logic that handles every possible delimiter when the csv module's Sniffer class already does that for you.

sample:

line1|hey|20|40|50
line2|hey|20|40|50
line3|hey|20|40|50
line4|hey|20|40|50
line5|hey|20|50|60
line6|hey|20|50|60
...

code:

import csv

# A larger sample gives the Sniffer more data, so it can guess the dialect more accurately.
sample_bytes = 200
sniffer = csv.Sniffer()

with open('s.txt', newline='') as f:
    dialect_object = sniffer.sniff(f.read(sample_bytes))

    # to start from the beginning
    f.seek(0)

    reader = csv.reader(f, dialect=dialect_object)
    for line in reader:
        print(line)

output:

['line1', 'hey', '20', '40', '50']
['line2', 'hey', '20', '40', '50']
['line3', 'hey', '20', '40', '50']
['line4', 'hey', '20', '40', '50']
...
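Sniffing can fail on input with no consistent separator, in which case `sniff()` raises `csv.Error`. Below is a minimal sketch of one way to guard against that, assuming Python 3. The helper name `read_rows`, the candidate delimiter list, and the comma fallback are hypothetical choices for illustration, not part of the csv module. It uses `io.StringIO` in place of a real file so it runs on its own.

```python
import csv
import io

# Same pipe-separated sample as above, held in a string instead of a file.
sample = (
    "line1|hey|20|40|50\n"
    "line2|hey|20|40|50\n"
    "line3|hey|20|40|50\n"
    "line4|hey|20|40|50\n"
    "line5|hey|20|50|60\n"
    "line6|hey|20|50|60\n"
)


def read_rows(text, candidates="|,;\t", fallback=","):
    """Parse text with a sniffed dialect, or with `fallback` if sniffing fails.

    `candidates` restricts the guess to delimiters we consider plausible
    (an assumption made for this example).
    """
    try:
        dialect = csv.Sniffer().sniff(text[:1024], delimiters=candidates)
    except csv.Error:
        # The Sniffer could not determine a delimiter.
        return list(csv.reader(io.StringIO(text, newline=""), delimiter=fallback))
    return list(csv.reader(io.StringIO(text, newline=""), dialect=dialect))


rows = read_rows(sample)
for row in rows:
    print(row)
```

Restricting the candidates with the `delimiters` argument keeps the Sniffer from picking an ordinary letter or digit that happens to occur equally often on every line.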

Upvotes: 1

dsh

Reputation: 12234

I clarified your question, since csv.reader() is an iterator. Your question compares the csv module with writing your own parser.

The advantage is that the csv module actually implements the CSV format (including quotes, escaping, embedded newlines, etc.), while the naive parser you wrote does none of that. So using the csv module is more correct, and actually simpler code too!
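To make the difference concrete, here is a small sketch, assuming Python 3. The sample record is made up for illustration: its second field contains a comma, an escaped (doubled) quote, and an embedded newline.

```python
import csv
import io

# Hypothetical data: a header row plus one record with a tricky quoted field.
data = 'id,comment\r\n1,"She said ""hi"", then\nleft"\r\n'

# Naive parser: split into lines, then split each line on commas.
naive_rows = [line.split(',') for line in data.splitlines()]

# csv module: newline='' hands line endings to the parser untranslated.
csv_rows = list(csv.reader(io.StringIO(data, newline='')))

print(len(naive_rows))  # 3: the embedded newline was treated as a row break
print(len(csv_rows))    # 2: header plus one record
print(csv_rows[1])      # ['1', 'She said "hi", then\nleft']
```

The naive version cuts the record in two at the embedded newline and leaves the quote characters in the data, while csv.reader returns the field as it was written.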

Upvotes: 3

Martin Thoma

Reputation: 136715

In your example, I don't see an advantage of using the csv module. However, things change when you have quoted elements:

SNO,Name,Dept
1,Def,Electronics
2,Abc,Computers
3,"here is the delimiter, in quotes",ghi

With the csv module, it is simply

import csv
with open('a.csv', 'rb') as f:
    csv_reader = csv.reader(f, delimiter=',', quotechar='"')
    for row in csv_reader:
        print(row)

but a plain split(',') would ignore the quotes and break the quoted field at the comma inside it.
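A quick side-by-side on the quoted line from the sample, assuming Python 3. Here csv.reader is fed a one-element list rather than a file, which works because it accepts any iterable of strings.

```python
import csv

line = '3,"here is the delimiter, in quotes",ghi'

# Naive split: the quoted field is cut at the comma it contains.
naive_fields = line.split(',')
print(naive_fields)
# ['3', '"here is the delimiter', ' in quotes"', 'ghi']

# csv.reader: the quoted field stays whole and the quotes are stripped.
csv_fields = next(csv.reader([line]))
print(csv_fields)
# ['3', 'here is the delimiter, in quotes', 'ghi']
```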

(Anyway, I recommend using pandas as shown here for reading CSV files. Please also note that you should close files you've opened. By using the with statement, you can do it implicitly.)

Upvotes: 6
