Reputation: 15
I have a data output file in the format below from the script I run.
1. xxx %percentage1
2. yyy %percentage1
.
.
.
I want to take only the percentages and append them, line by line, to a file with the same format (the new file is created once, on the first run):
1. xxx %percentage1 %percentage2
2. yyy %percentage1 %percentage2
The main idea is that every time I run the code with a source data file, it should append that file's percentages to the new file, line by line:
1. xxx %percentage1 %percentage2 %percentage3 ...
2. yyy %percentage1 %percentage2 %percentage3 ...
This is what I could come up with:
import os
os.chdir("directory")
f = open("data1", "r")
n = 3
a = f.readlines()
b = []
for i in range(n):
    b.append(a[i].split(" ")[2])
file_lines = []
with open("data1", 'r') as f:
    for t in range(n):
        for x in f.readlines():
            file_lines.append(''.join([x.strip(), b[t], '\n']))
            print(b[t])
with open("data2", 'w') as f:
    f.writelines(file_lines)
With this code I get the new file, but the appended percentages all come from the first line instead of being different for each line. I can also add only one set of percentages: each run overwrites the file rather than appending further columns.
I hope I explained this properly. Any help would be appreciated.
Upvotes: 0
Views: 49
Reputation: 3775
You can use a dict as the structure to load and write your data. The dict can then be pickled to keep the accumulated data between runs.
EDIT: added missing return statement
EDIT2: Fix return list of get_data
import pickle
import os

output = 'output'
dump = 'dump'

# Reload the previously accumulated data, if any
output_dict = {}
if os.path.exists(dump):
    with open(dump, 'rb') as f:
        output_dict = pickle.load(f)

def read_data(lines):
    """ Builds a dict from a list of lines where the keys are
    a tuple(w1, w2) and the values are w3 where w1, w2 and w3
    are the 3 words composing each line.
    """
    d = {}
    for line in lines:
        elts = line.split()
        assert len(elts) == 3
        d[tuple(elts[:2])] = elts[2]
    return d

def get_data(data):
    """ Recover data from a dict as a list of strings.
    The formatting for each element of the list is the following:
    k[0] k[1] v
    where k and v are the key/values of the data dict.
    """
    lines = []
    for k, v in data.items():
        # No trailing '\n' here: the caller joins the lines with '\n',
        # so adding one would leave a blank line between entries.
        lines.append(' '.join([*k, v]))
    return lines

def update_data(output_d, new_d):
    """ Update a data dict with new data
    The values are appended if the key already exists.
    Otherwise a new key/value pair is created.
    """
    for k, v in new_d.items():
        if k in output_d:
            output_d[k] = ' '.join([output_d[k], v])
        else:
            output_d[k] = v

for data_file in ('data1', 'data2', 'data3'):
    with open(data_file) as f:
        d1 = read_data(f.readlines())
    update_data(output_dict, d1)

print("Dumping data", output_dict)
with open(dump, 'wb') as f:
    pickle.dump(output_dict, f)

print("Writing data")
with open(output, 'w') as f:
    f.write('\n'.join(get_data(output_dict)))
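To check the merging idea without touching any files, here is a minimal sketch that works on in-memory lists of lines. The helper name `merge_runs` and the sample percentages are made up for illustration; it assumes every data line has exactly three whitespace-separated fields, as in the question.

```python
def merge_runs(runs):
    """Merge several runs of 'index name percentage' lines.

    Hypothetical helper (not part of the code above): each run is a list
    of lines such as '1. xxx 10%'. Returns one line per (index, name) key
    with all of its percentages appended in run order.
    """
    merged = {}  # regular dicts keep insertion order in Python 3.7+
    for lines in runs:
        for line in lines:
            parts = line.split()
            if len(parts) != 3:
                continue  # skip blank or malformed lines
            key = (parts[0], parts[1])
            merged.setdefault(key, []).append(parts[2])
    return [' '.join([*key, *values]) for key, values in merged.items()]


# Sample data, assumed for illustration only
run1 = ['1. xxx 10%', '2. yyy 20%']
run2 = ['1. xxx 30%', '2. yyy 40%']
result = merge_runs([run1, run2])
print('\n'.join(result))
```

As for the code in the question: `f.readlines()` consumes the whole file on the first pass of the outer loop, so later passes get an empty list and only `b[0]` is ever appended. Reading the lines once and pairing each line with its own percentage avoids that.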
Upvotes: 1