jKraut

Reputation: 2487

Combining CSV files in Python while skipping the header row: error

I've successfully combined all the CSV files in a directory, but I'm struggling to skip the first row (the header) of each file. The error I currently get is "'list' object is not an iterator". I have tried several approaches, including dropping the [open(thefile).read()] wrapper, but I still can't get it working. Here is my code:

 import glob
 files = glob.glob( '*.csv' )
 output="combined.csv"

 with open(output, 'w' ) as result:
     for thefile in files:
         f = [open(thefile).read()]
         next(f)   ## this line is causing the error 'list' object is not an iterator

         for line in f:
             result.write( line )
 message = 'file created'
 print (message)  

Upvotes: 0

Views: 2869

Answers (3)

mhawke

Reputation: 87084

Here's an alternative using the often-forgotten fileinput.input() function:

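import fileinput
from glob import glob

FILE_PATTERN = '*.csv'
output = 'combined.csv'

# Collect the input files before creating the output file, so that
# combined.csv (which also matches *.csv) is not read back in.
files = [name for name in glob(FILE_PATTERN) if name != output]

with open(output, 'w') as result:
    for line in fileinput.input(files):
        # isfirstline() is true for the first line of each input file
        if not fileinput.isfirstline():
            result.write(line)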

It avoids the explicit nested loop over files, because fileinput chains all the files into a single stream of lines.

Note that the code in your question was not far from working. You just need to change

f = [open(thefile).read()]

to

f = open(thefile)

but using a with statement would be better still, because it automatically closes each input file:

with open(output, 'w' ) as result:
    for thefile in files:
        with open(thefile) as f:
            next(f)
            for line in f:
                result.write( line )
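
As a rough sketch, the same loop could be wrapped in a function. The names combine_csv, pattern and output_path are made up for illustration. It assumes two things that the question does not state: that the output file may live in the same directory and match the same pattern (so it is excluded from the inputs), and that some input files may be empty (so next() is given a default rather than being allowed to raise StopIteration).

```python
import glob
import os


def combine_csv(pattern, output_path):
    """Concatenate the files matching pattern into output_path,
    dropping the first line of each input file."""
    # Exclude the output file itself in case it matches the pattern.
    out_abs = os.path.abspath(output_path)
    files = sorted(p for p in glob.glob(pattern)
                   if os.path.abspath(p) != out_abs)

    with open(output_path, 'w') as result:
        for thefile in files:
            with open(thefile) as f:
                # Skip the header; the None default keeps an empty
                # file from raising StopIteration.
                next(f, None)
                for line in f:
                    result.write(line)
    return files
```

Calling combine_csv('*.csv', 'combined.csv') would then do the same job as the loop above, and it should be safe to run a second time after combined.csv already exists.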

Upvotes: 1

Fred Mitchell

Reputation: 2161

>>> a = [1, 2, 3]
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator

I am not sure why you chose to wrap the read() call in brackets, but the example above shows what is happening: the brackets create a list, and next() only accepts an iterator.
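
To make the distinction concrete, here is a small sketch (the variable names are only for illustration). A list is iterable, but it only becomes an iterator once it is passed to iter(). Note also that a list built as [text], where text is the whole content of a file, holds a single string, so skipping its first element would skip the entire file rather than just the header.

```python
a = [1, 2, 3]

# A list is iterable but not an iterator, so next(a) raises TypeError.
try:
    next(a)
except TypeError as exc:
    error_message = str(exc)

# iter() returns an iterator over the list, which next() accepts.
it = iter(a)
first = next(it)   # consumes the first element
rest = list(it)    # whatever is left after the first element

# Wrapping a whole file's text in brackets gives a one-element list,
# not a list of lines.
wrapped = ["header\nrow1\nrow2\n"]
number_of_items = len(wrapped)
```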

There is already a good answer. This is just an example of how you might look at the problem. Also, I would recommend first getting the header skipping to work with just a single file. Once that works, import glob and apply your mini-solution to the bigger problem.
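
A single-file version might look something like the sketch below. The function name copy_without_header and its parameters are hypothetical; the idea is only to have something small that can be checked on one file before glob is involved.

```python
def copy_without_header(src_path, dst_path):
    """Copy src_path to dst_path, leaving out the first line."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        # An open file is its own iterator, so next() works on it.
        # The None default covers the case of an empty file.
        next(src, None)
        for line in src:
            dst.write(line)
```

Once this behaves as expected on one file, the body can be moved inside the loop over glob.glob('*.csv').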

Upvotes: 0

Avinash Raj

Reputation: 174706

Use the readlines() method instead of read(). It returns a list of lines, so you can skip the first one by slicing.

f = open(thefile)
m = f.readlines()
for line in m[1:]:
    # write the line unchanged so that its trailing newline is kept
    result.write(line)
f.close()

OR

with open(thefile) as f:
    m = f.readlines()
    for line in m[1:]:
        # write the line unchanged so that its trailing newline is kept
        result.write(line)

You don't need to close the file object explicitly if the file was opened with a with statement.
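
One thing to keep in mind is that readlines() loads the whole file into a list before slicing. If the files are large, a streaming variant is possible with itertools.islice; the sketch below assumes result is an already open output file, and the function name append_data_rows is made up for the example.

```python
from itertools import islice


def append_data_rows(src_path, result):
    """Write every line of src_path except the first to the open
    file object result."""
    with open(src_path) as f:
        # islice(f, 1, None) skips one line, then yields the rest
        # lazily without building a list of all lines.
        result.writelines(islice(f, 1, None))
```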

Upvotes: 1
