Reputation: 359
This is my first question here!
I have a file that contains hundreds of lines such as:
<car> <part_of> <machine>
<motor> <part_of> <car>
Each line represents a subject, a relation, and an object.
I want to read the lines two at a time, process them, and then output two or three lines based on the input file. I did something like this:
opener = open('input.txt') # to read even lines
opener2 = open('input.txt') # to read odd lines
num = 2
for eachline in opener:
    if num % 2 == 0:
        line1 = opener.readline().split()
        sub_line1, rel_line1, obj_line1 = line1[0],line1[1],line1[2]
        sub_line1 = line1[0].lstrip("<").rstrip(">")
        rel_line1 = line1[1].lstrip("<").rstrip(">")
        obj_line1 = line1[2].lstrip("\"").rstrip("\"")
    else:
        line2 = opener2.readline().split()
        sub_line2, rel_line2, obj_line2 = line1[0],line2[1],line2[2]
        sub_line2 = line2[0].lstrip("<").rstrip(">")
        rel_line2 = line2[1].lstrip("<").rstrip(">")
        obj_line2 = line2[2].lstrip("\"").rstrip("\"")
    num += 1
And I did this for the output:
output1 = " ".join([sub_line1,rel_line1,obj_line1])
writer.write(output1+"\n")
output2 = " ".join([sub_line2,rel_line2,obj_line2])
writer.write(output2+'\n')
output3 = " ".join([sub_line1,relation,sub_line2])
writer.write(output3+'\n')
Note: output3 does not exist in the input; I am creating it by combining the previous two lines. But every time, the odd lines are overwritten by the even ones. How can I separate them?
sample output:
<car> <part_of> <machine>
<motor> <part_of> <car>
<car> <part_of> <motor>
Every third line is composed from the previous two lines.
PART 2:
If a line in the input file starts with "_", how can I output it as is, without counting it as one of the two lines I am processing? Where can I put this condition?
Thanks in advance!!
Upvotes: 3
Views: 1588
Reputation: 16740
You cannot get a file object (what open returns) that reads only the odd or only the even lines.
It has to read the whole content of the file [1].
However, you do not need two file objects: you can do the job with only one.
You can iterate over enumerate(file) instead of file.
Instead of giving you just the lines, it gives you (index, line) pairs.
You can unpack these with for id, line in enumerate(file), and then check the remainder of id divided by 2 to determine whether the index is odd or even.
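with open('input.txt') as file:  # the file is closed automatically
    for id, line in enumerate(file):
        if id % 2 == 0:
            # Even index: the 1st, 3rd, 5th, ... line
            pass
        else:
            # Odd index: the 2nd, 4th, 6th, ... line
            pass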
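Applied to the question, one way to use this is to remember the even-indexed line and emit the combined line once the odd-indexed one arrives. This is only a sketch: the function name combine_pairs, the default relation "<part_of>", and the assumption that every line holds three whitespace-separated fields are mine, inferred from the sample output rather than stated in the question.

```python
def combine_pairs(lines, relation="<part_of>"):
    """Yield each pair of input lines, followed by a third, derived line.

    `relation` is a hypothetical parameter; the question does not say
    where the relation of the third line comes from.
    """
    first = None
    for index, line in enumerate(lines):
        fields = line.split()
        if index % 2 == 0:
            # Even index: remember this line until its partner arrives.
            first = fields
        else:
            # Odd index: we now have both lines of the pair.
            second = fields
            yield " ".join(first)
            yield " ".join(second)
            yield " ".join([first[0], relation, second[0]])


# A list of strings stands in for the open file here.
sample = ["<car> <part_of> <machine>\n", "<motor> <part_of> <car>\n"]
result = list(combine_pairs(sample))
for out in result:
    print(out)
```

With a real file, you would pass the open file object instead of the list. A trailing unpaired line is silently dropped in this sketch.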
[1] To be fair, you could make a file object yield only odd or only even lines, in that you could just skip every other line. But then, why bother creating two file objects when a single one already does the job?
Upvotes: 2
Reputation: 532418
It's not clear why you need two separate iterators; just read two lines at a time:
with open('input.txt') as fh:
    while True:
        line1 = fh.readline()
        line2 = fh.readline()
        if not line1:
            break
        ...
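You can change the break condition to suit your needs. For example, if the file has an odd number of lines, the last iteration gets a non-empty line1 and an empty line2, so you may prefer to break as soon as line2 is empty rather than only when line1 is.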
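For the second part of the question (lines starting with "_"), one possible place for the condition is a small helper that fetches the next "real" line and copies any "_" lines straight to the output on the way. The following is a sketch under several assumptions that are not in the question: the helper names next_data_line and process are made up, the relation of the third line defaults to "<part_of>", there are no blank lines, and an unpaired last line is written unchanged. io.StringIO objects stand in for the input and output files.

```python
import io


def next_data_line(src, dst):
    """Return the next line that does not start with "_".

    Lines starting with "_" are copied to dst unchanged.
    Returns "" at end of file, like readline() does.
    """
    while True:
        line = src.readline()
        if not line.startswith("_"):
            return line
        dst.write(line)


def process(src, dst, relation="<part_of>"):
    while True:
        line1 = next_data_line(src, dst)
        if not line1:
            break  # end of file
        dst.write(line1.rstrip("\n") + "\n")
        line2 = next_data_line(src, dst)
        if not line2:
            break  # odd number of data lines: nothing to combine
        dst.write(line2.rstrip("\n") + "\n")
        # Third line: subject of line 1, relation, subject of line 2.
        sub1, sub2 = line1.split()[0], line2.split()[0]
        dst.write(" ".join([sub1, relation, sub2]) + "\n")


src = io.StringIO("<car> <part_of> <machine>\n_comment\n<motor> <part_of> <car>\n")
dst = io.StringIO()
process(src, dst)
output = dst.getvalue()
print(output, end="")
```

Writing line1 before looking for line2 keeps any "_" lines in their original position relative to the data lines.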
However, if you do need the separate iterators for some reason, make each iterator take care of skipping every other line. Use the itertools module to make this easy:
from itertools import tee, islice

with open('input.txt') as fh:
    # Get two copies of the iterator. IMPORTANT: don't use fh
    # anymore; only itr1 and itr2
    itr1, itr2 = tee(fh)
    itr1 = islice(itr1, 0, None, 2)  # 0, 2, 4, ...
    itr2 = islice(itr2, 1, None, 2)  # 1, 3, 5, ...
    # zip is lazy in Python 3; on Python 2, use itertools.izip instead
    for line1, line2 in zip(itr1, itr2):
        ...
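To see what the two sliced iterators produce, here is a small illustration on an in-memory list instead of a file (the list and the variable names are made up for the example; it assumes Python 3). It also shows that zip stops at the shorter iterator, so an unpaired last line is dropped, whereas itertools.zip_longest keeps it and pads the missing partner with None.

```python
from itertools import islice, tee, zip_longest

lines = ["a\n", "b\n", "c\n"]  # stand-in for an open file with 3 lines

# Pair up lines 0/1, 2/3, ...; zip drops the unpaired "c\n".
itr1, itr2 = tee(iter(lines))
pairs = list(zip(islice(itr1, 0, None, 2), islice(itr2, 1, None, 2)))

# Same slicing, but zip_longest keeps the unpaired line.
itr1, itr2 = tee(iter(lines))
padded = list(zip_longest(islice(itr1, 0, None, 2), islice(itr2, 1, None, 2)))

print(pairs)
print(padded)
```

Which behaviour you want for a trailing unpaired line depends on your data; the question does not say.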
Upvotes: 1