STD

Reputation: 27

How to split a log file into several csv files with python

I'm pretty new to Python and coding in general, so sorry in advance for any dumb questions. My program needs to split an existing log file into several *.csv files (run1.csv, run2.csv, ...) based on the keyword 'MYLOG'. Whenever the keyword appears, it should start copying the two desired columns into a new file until the keyword appears again. When finished, there should be as many csv files as there are keyword occurrences.


53.2436     EXP     MYLOG: START RUN specs/run03_block_order.csv
53.2589     EXP     TextStim: autoDraw = None
53.2589     EXP     TextStim: autoDraw = None
55.2257     DATA    Keypress: t
57.2412     DATA    Keypress: t
59.2406     DATA    Keypress: t
61.2400     DATA    Keypress: t
63.2393     DATA    Keypress: t
...
89.2314     EXP     MYLOG: START BLOCK scene [specs/run03_block01.csv]
89.2336     EXP     Imported specs/run03_block01.csv as conditions
89.2339     EXP     Created sequence: sequential, trialTypes=9
...

[EDIT]: The output per file (run*.csv) should look like this:

onset       type
53.2436     EXP     
53.2589     EXP     
53.2589     EXP     
55.2257     DATA    
57.2412     DATA    
59.2406     DATA    
61.2400     DATA    
...

The program creates as many run*.csv files as needed, but I can't store the desired columns in the new files. When finished, all I get are empty csv files. If I change the counter condition to == 1, it creates just one big file with the desired columns.

Thanks again!

import csv

QUERY = 'MYLOG'

with open('localizer.log', 'rt') as log_input:
    i = 0

    for line in log_input:

        if QUERY in line:
            i = i + 1

            with open('run' + str(i) + '.csv', 'w') as output:
                reader = csv.reader(log_input, delimiter = ' ')
                writer = csv.writer(output)
                content_column_A = [0]
                content_column_B = [1]

                for row in reader:
                    content_A = list(row[j] for j in content_column_A)
                    content_B = list(row[k] for k in content_column_B)
                    writer.writerow(content_A)
                    writer.writerow(content_B)

Upvotes: 1

Views: 1589

Answers (2)

Waylon Walker

Reputation: 563

You can use pandas to simplify this problem.

Import pandas and read in the log file.

import pandas as pd

df = pd.read_fwf('localizer2.log', header=None)
df.columns = ['onset', 'type', 'event']
df.set_index('onset', inplace=True)

Set a flag where the third column starts with 'MYLOG'

df['flag'] = 0
df.loc[df.event.str[:5] == 'MYLOG', 'flag'] = 1
df.flag = df['flag'].cumsum()

Save each run as a separate run*.csv file

for i in range(1, df.flag.max()+1):
    df.loc[df.flag == i, 'event'].to_csv('run{0}.csv'.format(i))
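The flag/cumsum step is what does the grouping: each 'MYLOG' line bumps the counter by one, so every row between two markers shares the same flag value. A minimal sketch of that idea on made-up toy data:

```python
import pandas as pd

# Toy event column: two 'MYLOG' markers, each followed by one data row
s = pd.Series(['MYLOG: START RUN', 'Keypress: t',
               'MYLOG: START BLOCK', 'Keypress: t'])

# 1 where a new run starts, 0 elsewhere; cumsum turns that into run ids
flag = (s.str[:5] == 'MYLOG').astype(int).cumsum()
print(flag.tolist())  # [1, 1, 2, 2]
```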

EDIT: It looks like your format is different than I originally assumed, so I changed the code to use pd.read_fwf. My localizer.log file was a copy and paste of your original data; hope this works for you. I assumed from the original post that the file has no headers. If it does have headers, remove header=None and the df.columns = ['onset', 'type', 'event'] line.

Upvotes: 0

Geekfish

Reputation: 2314

Looking at the code, there are a few things that are possibly wrong:

  1. The csv reader should take a file handle, not a single line.
  2. The reader delimiter should not be a single space character, since the actual delimiter in your logs is a variable number of consecutive space characters.
  3. The looping logic seems to be a bit off, confusing files, lines, and rows.
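On point 2, calling str.split() with no argument is what handles the variable-width spacing; a quick comparison on a line shaped like the ones in the question:

```python
line = '53.2436     EXP     MYLOG: START RUN'

# split(' ') treats every single space as a delimiter, producing empty fields
print(line.split(' ')[:4])   # ['53.2436', '', '', '']

# split() with no argument collapses runs of whitespace
print(line.split()[:2])      # ['53.2436', 'EXP']
```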

You may be looking at something like the code below (pending clarification in the question):

import csv
NEW_LOG_DELIMITER = 'MYLOG'

def write_buffer(_index, buffer):
    """
    This function takes an index and a buffer.
    The buffer is just an iterable of iterables (ex a list of lists)
    Each buffer item is a row of values.
    """
    filename = 'run{}.csv'.format(_index)
    with open(filename, 'w') as output:
        writer = csv.writer(output)
        writer.writerow(['onset', 'type'])  # adding the heading
        writer.writerows(buffer)

current_buffer = []
_index = 1

with open('localizer.log', 'rt') as log_input:
    for line in log_input:
        # will deal ok with multi-space as long as
        # you don't care about the last column
        fields = line.split()[:2]
        if NEW_LOG_DELIMITER not in line or not current_buffer:
            # If it's the first line (the current_buffer is empty)
            # or the line does NOT contain "MYLOG" then
            # collect it until it's time to write it to file.
            current_buffer.append(fields)
        else:
            write_buffer(_index, current_buffer)
            _index += 1
            current_buffer = [fields]  # EDIT: fixed bug, new buffer should not be empty
    if current_buffer:
        # We are now out of the loop,
        # if there's an unwritten buffer then write it to file.
        write_buffer(_index, current_buffer)
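To see the buffering logic in isolation, here is the same splitting rule run on an in-memory sample (no file I/O; the sample lines are shortened from the question's log):

```python
sample_lines = [
    '53.2436     EXP     MYLOG: START RUN specs/run03_block_order.csv',
    '55.2257     DATA    Keypress: t',
    '89.2314     EXP     MYLOG: START BLOCK scene',
    '89.2336     EXP     Imported specs/run03_block01.csv as conditions',
]

buffers = []
current_buffer = []
for line in sample_lines:
    fields = line.split()[:2]
    if 'MYLOG' not in line or not current_buffer:
        # First line, or an ordinary line: keep collecting
        current_buffer.append(fields)
    else:
        # A new 'MYLOG' marker: flush the finished run, start the next one
        buffers.append(current_buffer)
        current_buffer = [fields]
if current_buffer:
    buffers.append(current_buffer)

print(buffers[0])  # [['53.2436', 'EXP'], ['55.2257', 'DATA']]
print(buffers[1])  # [['89.2314', 'EXP'], ['89.2336', 'EXP']]
```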

Upvotes: 1
