Reputation: 2016
I am a bit stuck and hope you can help.
I am trying to count the total number of lines across all files in a directory (and all of its subdirectories).
We receive data hourly, and it is partitioned into folders like this:
DATE>HOUR>COMPANY
I want a count for all files within a date, so I need to count the lines in every file in every directory under that date.
I can do this for a single file with the code below, but I have been unable to make a multi-file version work.
Can anyone advise? :)
count = len(open('Desktop/travel.csv').readlines( ))
This is what I tried for all files:
In [11]: os.chdir(Desktop)
    ...: names={}
    ...: count= 0
    ...: for fn in glob.glob(‘*.csv’):
    ...:     countfile = len(open(f).readlines( ))
    ...:     count = count + countfile
But I get:
File "<ipython-input-11-2e1a69754276>", line 4
for fn in glob.glob(‘*.csv’):
^
SyntaxError: invalid syntax
Upvotes: 1
Views: 64
Reputation: 2016
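The first answer was right: there was something strange with the formatting. The quotes around *.csv in my original loop were curly (typographic) quotes rather than plain straight quotes.
Thanks!!
This works: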
In [21]: import os
    ...: import glob
    ...:
    ...: count = 0
    ...: for file in glob.glob('*.csv'):
    ...:     countfile = len(open(file).readlines())
    ...:     count = count + countfile
    ...:
In [22]: count
Out[22]: 709343
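Note that glob.glob('*.csv') only matches files in the current directory, so the loop above does not yet descend into the HOUR and COMPANY folders. A minimal sketch of a recursive version follows, assuming every file of interest ends in .csv. The function name count_lines and its parameters are my own for illustration, not something from the posts above. It uses pathlib.Path.rglob to walk all subdirectories and reads each file line by line instead of loading it whole with readlines():

```python
from pathlib import Path


def count_lines(root, pattern="*.csv"):
    """Return the total number of lines in all files under root matching pattern.

    root and pattern are illustrative parameters: root is the top-level
    folder to search and pattern is the file name pattern to match.
    """
    total = 0
    # rglob searches root itself and every subdirectory below it
    for path in Path(root).rglob(pattern):
        if path.is_file():
            # Binary mode avoids decoding errors; iterating yields one line at a time
            with path.open("rb") as fh:
                total += sum(1 for _ in fh)
    return total
```

Calling it on one date folder, for example count_lines('SOME_DATE_FOLDER') where the folder name is a placeholder, should give the total for every hour and company under that date. The with block also closes each file as soon as it has been counted, which matters when there are many files.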
Upvotes: 1