Matt Leonard

Reputation: 31

Copy select lines from many text files and paste to new file

I'm new to Python and trying to use it for what I think should be a very simple task. I have a folder with many .log files, each of which has many lines of data. I want to copy only the lines that contain a certain keyword, and paste those lines from every file into one master file that I can open in Excel. I've been searching for an answer, and I just can't seem to get anything to work.

Upvotes: 1

Views: 3783

Answers (2)

yemu

Reputation: 28259

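import os

# Copy every line containing KEYWORD from the .log files in the
# current directory into outfile.txt.
with open("outfile.txt", "w") as outfile:
    for cur_file in os.listdir("."):
        if cur_file.endswith(".log"):
            # The with block closes each log file once it has been read.
            with open(cur_file, "r") as infile:
                for line in infile:
                    if "KEYWORD" in line:
                        outfile.write(line)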

Upvotes: 1

piokuc

Reputation: 26184

This should do what you need. Put a file with this code in the directory where you have your .log files, replace KEYWORD with what you are actually looking for, and run it.

import os
theKeyword = 'KEYWORD'
directory = '.'
with open('output.csv', 'w') as out:
    for file in os.listdir(directory):
        if file.endswith(".log"):
            # Join with directory so this also works when directory is not '.'
            with open(os.path.join(directory, file), 'r') as f:
                for line in f:
                    if theKeyword in line:
                        out.write(line)

As suggested, you can use glob instead of os.listdir:

from glob import glob
with open('output.csv', 'w') as out:
    for file in glob('*.log'):
        with open(file, 'r') as f:
            for line in f:
                if 'KEYWORD' in line:
                    out.write(line)

The code can be even a bit simpler if you use the fileinput module:

from glob import glob
import fileinput
with open('output.csv', 'w') as out:
    for line in fileinput.input(glob('*.log')):
        if 'KEYWORD' in line:
            out.write(line)

Another variation of the 'grep in Python' thing:

from glob import glob
import fileinput
with open('output.csv', 'w') as out:
    out.writelines(line for line in fileinput.input(glob('*.log')) if 'KEYWORD' in line)

In the above snippet, if you remove the argument to fileinput.input, it will process the files named in sys.argv[1:] (or standard input if none are given), so you can run your script with file names as parameters.
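A minimal sketch of that command-line variant, assuming Python 3. The function name copy_matching_lines and the hard-coded 'KEYWORD' and 'output.csv' are placeholders for illustration, not part of the snippets above. Here the file names are passed to fileinput.input explicitly, and the script does nothing when no names are given, so it never waits on standard input:

```python
import fileinput
import sys


def copy_matching_lines(file_names, keyword, out_path):
    """Write every line containing keyword from file_names to out_path."""
    with open(out_path, 'w') as out:
        # fileinput.input chains the named files into one stream of lines.
        with fileinput.input(files=file_names) as lines:
            out.writelines(line for line in lines if keyword in line)


if __name__ == '__main__' and sys.argv[1:]:
    # Only run when file names were actually given on the command line.
    copy_matching_lines(sys.argv[1:], 'KEYWORD', 'output.csv')
```

Saved as, say, grep_logs.py (a hypothetical name), it could be run as: python grep_logs.py first.log second.log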

In case you'd like to search for files recursively in subdirectories of a directory, you should have a look at the os.walk function.
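A rough sketch of the recursive version using os.walk. The helper names find_log_files and copy_matching_lines are made up for this example, and the order in which files are visited depends on the file system:

```python
import os


def find_log_files(top):
    """Yield paths of all .log files under top, including subdirectories."""
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            if name.endswith('.log'):
                # dirpath is the directory currently being visited.
                yield os.path.join(dirpath, name)


def copy_matching_lines(top, keyword, out_path):
    """Copy lines containing keyword from every .log file under top."""
    with open(out_path, 'w') as out:
        for path in find_log_files(top):
            with open(path, 'r') as f:
                out.writelines(line for line in f if keyword in line)
```

Calling copy_matching_lines('.', 'KEYWORD', 'output.csv') would then cover the current directory and everything below it.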

If you have a Linux/Unix/Mac box, or if you have Cygwin installed on a Windows box, the same can be achieved a bit more easily using shell tools:

$ cat *.log | grep KEYWORD > output.csv

Upvotes: 6
