Reputation: 2487
I've managed to combine all the CSV files in a directory, but I'm struggling to skip the first row (the header) of each file. The error I currently get is "'list' object is not an iterator". I have tried several approaches, including dropping the list brackets around open(thefile).read(), but I still can't get it working. Here is my code:
import glob
files = glob.glob('*.csv')
output = "combined.csv"
with open(output, 'w') as result:
    for thefile in files:
        f = [open(thefile).read()]
        next(f)  ## this line is causing the error 'list' object is not an iterator
        for line in f:
            result.write(line)
message = 'file created'
print(message)
Upvotes: 0
Views: 2869
Reputation: 87084
Here's an alternative using the oft-forgotten fileinput.input() function:
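import fileinput
from glob import glob

FILE_PATTERN = '*.csv'
output = 'combined.csv'

# Collect the inputs before the output file is created, and leave the output
# out of the list so it is never read while it is being written.
files = [name for name in glob(FILE_PATTERN) if name != output]

with open(output, 'w') as result:
    for line in fileinput.input(files):
        # isfirstline() is True for the first line (the header) of each file
        if not fileinput.isfirstline():
            result.write(line)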
It's quite a bit cleaner than many other solutions.
Note that the code in your question was not far off working. You just need to change
f = [open(thefile).read()]
to
f = open(thefile)
but I suggest using a with statement, which is better still because it automatically closes the input files:
with open(output, 'w') as result:
    for thefile in files:
        with open(thefile) as f:
            next(f, None)  # skip the header; the default avoids StopIteration on an empty file
            for line in f:
                result.write(line)
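For reference, here is a minimal, self-contained sketch of the same loop wrapped in a function. The name combine_csv and its parameters are my own, chosen for illustration. It assumes every input file starts with exactly one header line, and it leaves the output file out of the inputs in case it matches the pattern on a re-run.

```python
import glob
import os


def combine_csv(pattern, output):
    """Concatenate the files matching pattern into output, dropping each file's first line."""
    # Resolve the inputs before the output is created, and exclude the output
    # itself in case it matches the pattern from a previous run.
    files = sorted(
        name for name in glob.glob(pattern)
        if os.path.abspath(name) != os.path.abspath(output)
    )
    with open(output, 'w') as result:
        for thefile in files:
            with open(thefile) as f:
                next(f, None)  # skip the header line; None avoids StopIteration on an empty file
                for line in f:
                    result.write(line)
    return files
```

Calling combine_csv('*.csv', 'combined.csv') from the directory that holds the files would reproduce the behaviour of the snippet above.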
Upvotes: 1
Reputation: 2161
>>> a = [1, 2, 3]
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list object is not an iterator
I am not sure why you chose to bracket the read, but you should recognize what is happening from the example above.
There is already a good answer. This is just an example of how you might look at the problem. Also, I would recommend getting what you want to work with just a single file. After that is working, import glob and work on using your mini-solution in the bigger problem.
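To make the difference concrete, here is a small sketch comparing a list with a file-like object. The io.StringIO object is only a stand-in for a real file, and the variable names are mine.

```python
import io

rows = ["header\n", "row 1\n", "row 2\n"]

# A list is iterable but not an iterator, so next(rows) raises TypeError.
try:
    next(rows)
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# iter() turns the list into an iterator, which next() accepts.
it = iter(rows)
first = next(it)      # consumes "header\n"
remaining = list(it)  # the lines after the header

# A file object is already its own iterator; StringIO behaves the same way.
f = io.StringIO("".join(rows))
header = next(f)      # consumes the first line
body = list(f)        # the rest of the lines
```

In the question, [open(thefile).read()] builds a one-element list holding the whole file as a single string, which is why next() rejects it.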
Upvotes: 0
Reputation: 174706
Use the readlines() method instead of read(), so that you can easily skip the first line.
f = open(thefile)
m = f.readlines()
for line in m[1:]:
    result.write(line)  # keep the trailing newline; rstrip() would run all rows together
f.close()
OR
with open(thefile) as f:
    m = f.readlines()
    for line in m[1:]:
        result.write(line)  # keep the trailing newline; rstrip() would run all rows together
You don't need to explicitly close the file object if the file was opened with a with statement.
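Since readlines() loads the whole file into memory, a lazy variant may suit large files better. This is only a sketch using itertools.islice; the helper name copy_without_header is hypothetical, and io.StringIO stands in for real file objects.

```python
import io
from itertools import islice


def copy_without_header(src, dst):
    """Write every line of src except the first to dst."""
    # islice(src, 1, None) skips one line and then yields the rest lazily.
    for line in islice(src, 1, None):
        dst.write(line)


# Usage with in-memory stand-ins for the input and output files.
source = io.StringIO("name,value\na,1\nb,2\n")
target = io.StringIO()
copy_without_header(source, target)
combined = target.getvalue()
```

With real files, src would be the handle from open(thefile) and dst the result handle from the question.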
Upvotes: 1