Abruzzo Forte e Gentile
Abruzzo Forte e Gentile

Reputation: 14869

reading csv file without for

I need to read a CSV file in python.

Since for last row I receive a 'NULL byte' error I would like to avoid using for keyword but the while.

Do you know how to do that?

    reader = csv.reader( file )
    for row in reader  # I have an error at this line
          # do whatever with row

I want to substitute the for-loop with a while-loop so that I can check if the row is NULL or not.

What is the function for reading a single row in the CSV module? Thanks

Thanks

p.S. below the traceback

Traceback (most recent call last):
  File "FetchNeuro_TodayTrades.py", line 189, in 
    for row in reader:
_csv.Error: line contains NULL byte

Upvotes: 12

Views: 18841

Answers (8)

terry
terry

Reputation: 43

Process the initial csv file and replace the Nul '\0' with blank, and then you can read it. The actual code looks like this:

data_initial = open(csv_file, "rU")
reader = csv.reader((line.replace('\0','') for line in data_initial))

It works for me.

And the original answer is here:csv-contain null byte

Upvotes: 0

John Fouhy
John Fouhy

Reputation: 42183

You could try cleaning the file as you read it:

def nonull(stream):
    for line in stream:
        yield line.replace('\x00', '')

f = open(filename)
reader = csv.reader(nonull(f))

Assuming, of course, that simply ignoring NULL characters will work for you!

Upvotes: 2

John Machin
John Machin

Reputation: 82924

You need (always) to say EXACTLY what is the error message that you got. Please edit your question.

Probably this:

>>> import csv; csv.reader("\x00").next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_csv.Error: line contains NULL byte
>>>

The csv module is not 8-bit clean; see the docs: """Also, there are currently some issues regarding ASCII NUL characters."""

The error message is itself in error: it should be "NUL", not "NULL" :-(

If the last line in the file is empty, you won't get an exception, you'll merely get row == [].

Assuming the problem is one or more NULs in your file(s), you'll need to (1) speak earnestly to the creator(s) of your file(s) (2) failing that, read the whole file in (mode="rb"), strip out the NUL(s), and feed fixed_text.splitlines() to the csv reader.

Upvotes: 4

telliott99
telliott99

Reputation: 7907

I don't have an answer, but I can confirm the problem, and that most answers posted don't work. You cannot catch this exception. You cannot test for if line. Maybe you could check for the NULL byte directly, but I'm not swift enough to do that... If it is always on the last line, you could of course skip that.

import csv
FH = open('data.csv','wb')
line1 = [97,44,98,44,99,10]
line2 = [100,44,101,44,102,10]
for n in line1 + line2:
    FH.write(chr(n))
FH.write(chr(0))
FH.close()
FH = open('data.csv')
reader = csv.reader(FH)
for line in reader:
    if '\0' in line:  continue
    if not line:  continue
    print line

$ python script.py 
['a', 'b', 'c']
['d', 'e', 'f']
Traceback (most recent call last):
  File "script.py", line 11, in <module>
    for line in reader:
_csv.Error: line contains NULL byte

Upvotes: 0

dalloliogm
dalloliogm

Reputation: 8940

If your problem is specific to the last line being empty, you can use numpy.genfromtxt (or the old matplotlib.mlab.csv2rec)

$: cat >csv_file.txt
foo,bar,baz
yes,no,0
x,y,z



$:
$: ipython
>>> from numpy import genfromtxt
>>> genfromtxt("csv_file.txt", dtype=None, delimiter=',')
array([['foo', 'bar', 'baz'],
       ['yes', 'no', '0'],
       ['x', 'y', 'z']], 
      dtype='|S3')

Upvotes: 1

Pedro Ghilardi
Pedro Ghilardi

Reputation: 1964

Maybe you could catch the exception raised by the CSV reader. Something like this:

filename = "my.csv"
reader = csv.reader(open(filename))
try:
    for row in reader:
        print 'Row read with success!', row
except csv.Error, e:
    sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))

Or you could use next():

while True:
    try: 
        print reader.next()
    except csv.Error:
        print "Error"
    except StopIteration:
        print "Iteration End"
        break

Upvotes: 20

Dave Everitt
Dave Everitt

Reputation: 17866

The Django community has addressed Python CSV import issues, so it might be worth searching for CSV import there, or posting a question. Also, you could edit the offending line directly in the CSV file before trying the import.

Upvotes: 2

ghostdog74
ghostdog74

Reputation: 342313

not really sure what you mean, but you can always check for existence with if

>>> reader = csv.reader("file")
>>> for r  in reader:
...   if r: print r
...

if this is not what you want, you should describe your problem more clearly by showing examples of things that doesn't work for you, including sample file format and desired output you want.

Upvotes: 0

Related Questions