Read one line from multiple files and write into one file

Question

I have about 45,000 files. My goal is to extract one particular line from each file and collect those lines in a single file.

I tried to use glob.glob, but the problem is that with this module the order of the files seems to be mixed up.

filin= diri+ '*.out'
list_of_files = glob.glob(filin)
print list_of_files 
with open("A.txt", "w") as fout:
    for fileName in list_of_files:
        data_list = open( fileName, 'r' ).readlines()
        fout.write(data_list[12])

Above is the code I used. I mostly borrowed it from someone else's code on this forum.

I would like to read all ".out" files in order. The files contain data at one-minute intervals. For example, one file contains data for 2014/1/1 00:00 and the next file has data for 2014/1/1 00:01, so reading the files in order is very important. However, when I use glob.glob and print list_of_files as above, the file order seems quite mixed up. How can I solve this problem?

Also, as shown above, I would like to read the 12th line from the top of each file, but the result repeatedly shows an "out of index" error.

Sorry if the question is not very well organized. Any ideas or help would be really appreciated.

P.S. The file names look like this: Data_201308032343.out, Data_201308032344.out, Data_201308032345.out, ...

Thank you.

scandinavian_ · Accepted Answer

list_of_files = sorted(glob.glob(filin))

data_list[12] reads the 13th line of the file, because Python lists are zero-indexed; the 12th line is data_list[11]. That might be the cause of the "index out of range" exception (an IndexError), for example if some files have only 12 lines.
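
Putting both points together, here is a minimal sketch rather than a definitive implementation. It is written for Python 3, while the question's code uses the Python 2 print statement. The function name collect_line and its parameters are made up for illustration. It assumes that plain lexicographic sorting is good enough, which holds for names like Data_201308032343.out because the timestamp is fixed-width and zero-padded, and it assumes that files with fewer than 12 lines should be skipped and reported instead of raising an error.

```python
import glob
import itertools


def collect_line(pattern, out_path, line_number=12):
    """Write line `line_number` (1-based) of each file matching `pattern` to `out_path`.

    Returns the names of files that were too short and were skipped.
    (Hypothetical helper; the name and signature are not from the question.)
    """
    skipped = []
    with open(out_path, "w") as fout:
        # glob.glob() returns names in arbitrary order; sorted() makes it
        # lexicographic, which is chronological for fixed-width timestamps.
        for file_name in sorted(glob.glob(pattern)):
            with open(file_name, "r") as fin:
                # islice(fin, 11, 12) yields only the 12th line (index 11)
                # and stops reading there; None means the file was too short.
                line = next(itertools.islice(fin, line_number - 1, line_number), None)
            if line is None:
                skipped.append(file_name)
                continue
            # Make sure every extracted line ends with a newline.
            fout.write(line if line.endswith("\n") else line + "\n")
    return skipped
```

With the variables from the question, the call would be something like skipped = collect_line(diri + '*.out', 'A.txt'). Printing skipped afterwards shows which files, if any, have fewer than 12 lines. Reading with islice also avoids loading every file completely with readlines(), which may help with 45,000 files.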
