Reputation: 2972
I am trying to convert a log file into csv using python (mainly this example)
import csv
with open('/home/user/Downloads/my.log') as file:
lines = file.read().splitlines()
lines = [lines[x:x+2] for x in range(0, len(lines), 3)]
with open('test.csv', 'w+') as csvfile:
w = csv.writer(csvfile)
w.writerows(lines)
print "done"
My original log file looks like this :
2017-08-09 -> 11:30:01
Temp=29.0* Humidity=30.0%
2017-08-09 -> 11:40:01
Temp=29.0* Humidity=33.0%
From above example code, I can convert it into this format
2017-08-08 -> 08:50:01,Temp=33.0* Humidity=38.0%
2017-08-08 -> 09:00:01,Temp=37.0* Humidity=40.0%
But I need my final csv to look like this
2017-08-08,08:50:01,33.0*,38.0%,
2017-08-08,09:00:01,37.0*,40.0%
I have used lines = lines.replace("->",",")
and I got
AttributeError: 'list' object has no attribute 'replace'
As I understood, python can not replace text file its being treated in memory. So what I can do? what method I can use to purify the final text?
My python knowledge is not advance, still learning. So if there a mistake or missed step, please correct me.
Thanks in advance
Upvotes: 1
Views: 136
Reputation: 168
This creates a list of strings:
lines = file.read().splitlines()
And then this creates a list of lists of strings:
lines = [lines[x:x+2] for x in range(0, len(lines), 3)]
Replace works on a string, not a list. There are many ways to solve this:
# 1. do replace on original string, before splitting.
lines = file.read().replace("->", ",")
# 2. or do replace on elements of list, before creating list of lists
lines = file.read().splitlines()
lines = [i.replace("->",",") for i in lines]
# 3. or replace on each element in list of lists
# (not implemented)
Upvotes: 2
Reputation: 18916
Update: Full modified code
with open('/home/user/Downloads/my.log') as file, open("output.csv", "w") as outfile:
lines = file.readlines()
lines = [','.join([i.strip() for i in lines[x:x+2]]) for x in range(0, len(lines), 3)]
lines = [i.replace("->",",").replace(" ","").replace("Temp=","").replace("Humidity=",",") for i in lines]
outfile.write("Date,Time,Temp,Humidity")
for line in lines:
outfile.write(line)
Outputs
Date,Time,Temp,Humidity
2017-08-08,08:50:01,33.0*,38.0%
2017-08-08,09:00:01,37.0*,40.0%
Wokring Example
datafile="""2017-08-09 -> 11:30:01
Temp=29.0* Humidity=30.0%
2017-08-09 -> 11:40:01
Temp=29.0* Humidity=33.0%"""
lines = io.StringIO(datafile).readlines()
lines = [','.join([i.strip() for i in lines[x:x+2]]) for x in range(0, len(lines), 3)]
lines = [i.replace("->",",").replace(" ","").replace("Temp=","").replace("Humidity=",",") for i in lines]
lines
Returns:
['2017-08-09,11:30:01,29.0*,30.0%', '2017-08-09,11:40:01,29.0*,33.0%']
Upvotes: 2
Reputation: 2164
Are you open to using regular expressions? If so, extract groups 1, 2, 3, and 4 from each string with the following pattern:
(\d{4}-\d{2}-\d{2}) -> (\d{2}:\d{2}:\d{2}),Temp=([0-9.]+)\*\s+Humidity=([0-9.]+)
You'd use the re
library.
import csv
import re
with open('/home/user/Downloads/my.log') as file:
lines = file.read().splitlines()
lines = [lines[x:x+2] for x in range(0, len(lines), 3)]
pattern = '(\d{4}-\d{2}-\d{2}) -> (\d{2}:\d{2}:\d{2}),Temp=([0-9.]+)\*\s+Humidity=([0-9.]+)'
with open("output.csv", "w") as f:
f.write("Date,Time,Temp,Humidity\n")
print("Date,Time,Temp,Humidity")
for line in lines:
m = re.search(pattern, line)
f.write("{}, {}, {}, {}\n".format(m.group(1), m.group(2), m.group(3), m.group(4)))
print("{}, {}, {}, {}".format(m.group(1), m.group(2), m.group(3), m.group(4)))
Upvotes: 1
Reputation: 42746
You are using replace
as a list method since lines
is a list containing the strings, if you want to transform it use map
or a list comprehension
:
lines = [i.replace("->",",") for i in lines]
or
lines = map(lambda x: x.replace("->",","), lines)
You can chain the transformations to get the final result:
lines = map(lambda x: x.replace("->",",").replace("Temp=", "").replace(" Humidity=", ",").replace(" ", ""),
lines)
Here you have a live example
Upvotes: 1
Reputation: 1965
replace()
method can be invoked on string, not on list, so lines.replace("->",",")
is incorrect. You can operate on a whole text from file, and next split it into lines:
with open('/home/user/Downloads/my.log') as file:
text = file.read().replace("->", ",")
lines = text.splitlines()
lines = [lines[x:x+2] for x in range(0, len(lines), 3)]
Upvotes: 1