Reputation: 137
I have two large text files of data from one experiment and I want to merge them into one in a particular way.
Small sample of data:
file1:
plotA 10
plotB 9
plotC 9
file2:
98%
7/10
21
98%
5/10
20
98%
10/10
21
And I would like a result like this:
plotA 10 98% 7/10 21
plotB 9 98% 5/10 20
plotC 9 98% 10/10 21
I have no idea how to solve this in Python. I tried to reorder file2 with:
lines = file2.readlines()
aaa = lines[0] + lines[3] + lines[6]
bbb = lines[1] + lines[4] + lines[7]
ccc = lines[2] + lines[5] + lines[8]
and then use zip, but I failed (and this method is time-consuming for large text files).
Any help?
Upvotes: 1
Views: 112
Reputation: 107347
You can use itertools.izip_longest to group the lines of file2 into triples, and then use it again to zip those triples with the lines of the first file:
from itertools import izip_longest

with open('file1.txt') as f1, open('file2.txt') as f2:
    # the same iterator repeated three times yields consecutive triples of lines
    args = [iter(f2)] * 3
    z = izip_longest(f1, izip_longest(*args), fillvalue='-')
    for line, tup in z:
        print '{:11}'.format(line.strip()), '{:5}{:5}{:>5}'.format(*map(str.strip, tup))
If you want to write this result to a new file, open a file for writing and write each line to it instead of printing.
Result:
plotA 10 98% 7/10 21
plotB 9 98% 5/10 20
plotC 9 98% 10/10 21
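For anyone on Python 3, where izip_longest is named zip_longest and print is a function, the same idea could look roughly like the sketch below. The helper names merge_lines and merge_files, the single-space separator, and the handling of files with mismatched lengths are my own assumptions for illustration, not part of the code above.

```python
from itertools import zip_longest


def merge_lines(lines1, lines2, group=3, fill="-"):
    """Yield one row per line of lines1, followed by `group` lines of lines2."""
    it2 = iter(lines2)
    # The same iterator repeated `group` times: zip_longest pulls from it in
    # turn, so every tuple holds `group` consecutive lines of lines2.
    chunks = zip_longest(*[it2] * group, fillvalue=fill)
    for line, chunk in zip_longest(lines1, chunks):
        # Assumption: pad with `fill` when one input runs out before the other.
        left = fill if line is None else line.strip()
        right = (fill,) * group if chunk is None else chunk
        yield " ".join([left] + [item.strip() for item in right])


def merge_files(path1, path2, out_path, group=3):
    """Write the merged rows to out_path, reading both inputs lazily."""
    with open(path1) as f1, open(path2) as f2, open(out_path, "w") as out:
        for row in merge_lines(f1, f2, group):
            out.write(row + "\n")
```

Calling merge_files('file1.txt', 'file2.txt', 'merged.txt') would then write one merged row per line of file1.txt without loading either file fully into memory (merged.txt is a placeholder name).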
Upvotes: 5
Reputation: 3420
Here is an example; you'll have to improve it with error handling and so on :^)
file1 = open('file1')
file2 = open('file2')
# take one line in file1
for line in file1:
    # print result with tabulation to separate fields
    print '\t'.join(
        # the line from file1
        [line.strip()] +
        # and three lines from file2
        [file2.readline().strip() for _ in '123']
    )
Note that I'm using the string '123' because it is shorter than range(3) (and it does not require a function call); it just has to be an iterable of any sort that yields three items.
Reading and processing only the required data avoids the need to load the whole files into memory (as you said, your files are large).
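A variant of the same streaming idea, sketched for Python 3 with itertools.islice taking the three lines at a time. The function name join_streaming and its parameters are made up for this illustration. One difference from the readline() loop: if file2 runs out early, this version produces a shorter row rather than empty fields.

```python
from itertools import islice


def join_streaming(lines1, lines2, group=3, sep="\t"):
    """Yield each line of lines1 joined with the next `group` lines of lines2."""
    it2 = iter(lines2)
    for line in lines1:
        # islice consumes at most `group` items from the shared iterator,
        # so only a handful of lines are held in memory at any time.
        extra = [item.strip() for item in islice(it2, group)]
        yield sep.join([line.strip()] + extra)


if __name__ == "__main__":
    # Assumption: the inputs are named 'file1' and 'file2', as in the code above.
    with open("file1") as file1, open("file2") as file2:
        for row in join_streaming(file1, file2):
            print(row)
```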
Cheers.
Upvotes: 1