Sachith Muhandiram
Sachith Muhandiram

Reputation: 2972

string replace while writing from log to a csv

I am trying to convert a log file into csv using python (mainly this example)

import csv

with open('/home/user/Downloads/my.log') as file:
    lines = file.read().splitlines()
    lines = [lines[x:x+2] for x in range(0, len(lines), 3)]

    with open('test.csv', 'w+') as csvfile:
        w = csv.writer(csvfile)
        w.writerows(lines)
        print "done"

My original log file looks like this :

2017-08-09 -> 11:30:01
Temp=29.0*  Humidity=30.0%

2017-08-09 -> 11:40:01
Temp=29.0*  Humidity=33.0%

From above example code, I can convert it into this format

2017-08-08 -> 08:50:01,Temp=33.0*  Humidity=38.0%
2017-08-08 -> 09:00:01,Temp=37.0*  Humidity=40.0%

But I need my final csv to look like this

2017-08-08,08:50:01,33.0*,38.0%, 
2017-08-08,09:00:01,37.0*,40.0%

I have used lines = lines.replace("->",",") and I got

AttributeError: 'list' object has no attribute 'replace'

As I understood, python can not replace text file its being treated in memory. So what I can do? what method I can use to purify the final text?

My python knowledge is not advance, still learning. So if there a mistake or missed step, please correct me.

Thanks in advance

Upvotes: 1

Views: 136

Answers (5)

AndrewR
AndrewR

Reputation: 168

This creates a list of strings:

lines = file.read().splitlines()

And then this creates a list of lists of strings:

lines = [lines[x:x+2] for x in range(0, len(lines), 3)]

Replace works on a string, not a list. There are many ways to solve this:

# 1. do replace on original string, before splitting.
lines = file.read().replace("->", ",") 

# 2. or do replace on elements of list, before creating list of lists
lines = file.read().splitlines()
lines = [i.replace("->",",") for i in lines]

# 3. or replace on each element in list of lists 
# (not implemented)

Upvotes: 2

Anton vBR
Anton vBR

Reputation: 18916

Update: Full modified code

with open('/home/user/Downloads/my.log') as file, open("output.csv", "w") as outfile:
    lines = file.readlines()
    lines = [','.join([i.strip() for i in lines[x:x+2]]) for x in range(0, len(lines), 3)]
    lines = [i.replace("->",",").replace(" ","").replace("Temp=","").replace("Humidity=",",") for i in lines]

    outfile.write("Date,Time,Temp,Humidity")

    for line in lines:
        outfile.write(line)

Outputs

Date,Time,Temp,Humidity
2017-08-08,08:50:01,33.0*,38.0%
2017-08-08,09:00:01,37.0*,40.0%

Wokring Example

datafile="""2017-08-09 -> 11:30:01
Temp=29.0*  Humidity=30.0%

2017-08-09 -> 11:40:01
Temp=29.0*  Humidity=33.0%"""

lines = io.StringIO(datafile).readlines()
lines = [','.join([i.strip() for i in lines[x:x+2]]) for x in range(0, len(lines), 3)]
lines = [i.replace("->",",").replace(" ","").replace("Temp=","").replace("Humidity=",",") for i in lines]
lines

Returns:

['2017-08-09,11:30:01,29.0*,30.0%', '2017-08-09,11:40:01,29.0*,33.0%']

Upvotes: 2

RagingRoosevelt
RagingRoosevelt

Reputation: 2164

Are you open to using regular expressions? If so, extract groups 1, 2, 3, and 4 from each string with the following pattern:

(\d{4}-\d{2}-\d{2}) -> (\d{2}:\d{2}:\d{2}),Temp=([0-9.]+)\*\s+Humidity=([0-9.]+)

(see it working here)

You'd use the re library.

import csv
import re

with open('/home/user/Downloads/my.log') as file:
    lines = file.read().splitlines()
    lines = [lines[x:x+2] for x in range(0, len(lines), 3)]

    pattern = '(\d{4}-\d{2}-\d{2}) -> (\d{2}:\d{2}:\d{2}),Temp=([0-9.]+)\*\s+Humidity=([0-9.]+)'

    with open("output.csv", "w") as f:
        f.write("Date,Time,Temp,Humidity\n")
        print("Date,Time,Temp,Humidity")
        for line in lines:
            m = re.search(pattern, line)
            f.write("{}, {}, {}, {}\n".format(m.group(1), m.group(2), m.group(3), m.group(4)))
            print("{}, {}, {}, {}".format(m.group(1), m.group(2), m.group(3), m.group(4)))

Upvotes: 1

Netwave
Netwave

Reputation: 42746

You are using replace as a list method since lines is a list containing the strings, if you want to transform it use map or a list comprehension:

lines = [i.replace("->",",") for i in lines]

or

lines = map(lambda x: x.replace("->",","), lines)

You can chain the transformations to get the final result:

lines = map(lambda x: x.replace("->",",").replace("Temp=", "").replace("  Humidity=", ",").replace(" ", ""), 
            lines)

Here you have a live example

Upvotes: 1

trivelt
trivelt

Reputation: 1965

replace() method can be invoked on string, not on list, so lines.replace("->",",") is incorrect. You can operate on a whole text from file, and next split it into lines:

with open('/home/user/Downloads/my.log') as file:
    text = file.read().replace("->", ",")
    lines = text.splitlines()
    lines = [lines[x:x+2] for x in range(0, len(lines), 3)]

Upvotes: 1

Related Questions