Reputation: 447
For example, I have a text file containing these lines:
chicken
chicken
duck
duck
duck
parrot
parrot
chicken
chicken
chicken
How can I read it line by line and write the first chicken run (2 lines) to 1.txt, the duck run (3 lines) to 2.txt, the parrot run (2 lines) to 3.txt, and the last chicken run (3 lines) to 4.txt?
This is as far as I've got:
count = 0
with open("test.txt") as rl:
    for num, line in enumerate(rl, 1):
        s = list(line)
        if "chicken" in line:
            count += 1
            finaljoin = "".join(s)
            print(count)
            with open("chicken.txt", 'a+') as f:
                f.write(finaljoin)
But my solution above only grabs all the chicken lines (5 in total) into one file. The actual plan was to write the first two chicken lines to one text file and the last three chicken lines to another, because they are separated by other animals.
Upvotes: 0
Views: 60
Reputation: 18906
You can do it like this:
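from itertools import groupby

with open('test.txt') as f:
    # splitlines() avoids the trailing empty string that split('\n') leaves
    # when the file ends with a newline (which would create an empty extra file)
    data = f.read().splitlines()

# one (key, group) pair per run of consecutive identical lines
for ind, (_, g) in enumerate(groupby(data), 1):
    with open('{}.txt'.format(ind), 'w') as out:
        out.write('\n'.join(g))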
Explanation:
You can read about Itertools groupby here: https://docs.python.org/2/library/itertools.html#itertools.groupby.
groupby yields pairs of two elements: the key and the group.
So if we want to loop through a groupby we would do something like this: for key, group in groupby(object): or, more compactly, for k, g in groupby(object):
In this case the keys will be chicken, duck, parrot, chicken and the groups will contain ['chicken', 'chicken'], ['duck', 'duck', ...] and so on (each group is an iterator over those items). groupby only groups consecutive equal items, which is why chicken shows up twice.
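To make the keys and groups concrete, here is a small sketch that runs groupby on an in-memory list rather than a file. The names lines and runs are only for illustration and are not part of the answer above.

```python
from itertools import groupby

# Same sequence as in the question, held in a list instead of a file
lines = ["chicken", "chicken", "duck", "duck", "duck",
         "parrot", "parrot", "chicken", "chicken", "chicken"]

# Each group is a lazy iterator, so list() is used to materialise it.
# groupby only merges *consecutive* equal items, so "chicken" appears twice.
runs = [(key, list(group)) for key, group in groupby(lines)]

for key, group in runs:
    print(key, len(group))
```

This should print one line per run: chicken 2, duck 3, parrot 2, chicken 3.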
However (now comes the part where I explain ind, (_, g)), to obtain an index as we loop we can use Python's enumerate function, which yields an index together with each item. Typically it looks like this: for index, item in enumerate(list): or for ind, i in enumerate(list):
Now let's say we want to combine enumerate and groupby. Then we could do it like this: for index, (key, group) in enumerate(groupby(object)): or, more compactly: for ind, (_, g) ... I use _ in this case (and this is Pythonic) to signal that I am not interested in that variable (the key in this case). Passing 1 as the second argument to enumerate, as in the code above, starts the numbering at 1 so that the first file is 1.txt.
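If you want to reuse this, the enumerate plus groupby loop could be wrapped in a function along these lines. This is only a sketch: split_runs, src and out_dir are hypothetical names, and it assumes the input file is small enough to read into memory at once.

```python
from itertools import groupby
from pathlib import Path


def split_runs(src, out_dir):
    """Write each run of identical consecutive lines in src to 1.txt, 2.txt, ..."""
    lines = Path(src).read_text().splitlines()
    written = []
    # start at 1 so the first run lands in 1.txt rather than 0.txt
    for ind, (_, group) in enumerate(groupby(lines), 1):
        target = Path(out_dir) / "{}.txt".format(ind)
        # "_" is the key (the animal name), which is not needed here
        target.write_text("\n".join(group) + "\n")
        written.append(target)
    return written
```

Calling split_runs("test.txt", ".") on the file from the question should produce 1.txt to 4.txt in the current directory and return their paths.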
Upvotes: 1
Reputation: 909
You can try:
count = 0
with open("test.txt") as readFile:
    previous_line = ""
    archive_name = ""
    for line in readFile:
        if line != previous_line:
            previous_line = line
            count += 1
            archive_name = str(count)+".txt"
        with open(archive_name, 'a+') as f:
            f.write(line)
That will save "chicken chicken" in 1.txt, "duck duck duck" in 2.txt, "parrot parrot" in 3.txt and "chicken chicken chicken" in 4.txt
Upvotes: 1
Reputation: 77837
Actually, you haven't figured it out. You have no splitting provision; all you've done is to search for "chicken", wherever it appears, and dump those reconstituted lines into a "chicken.txt" file. You've made no provision for any other animal, and there's no attempt at logic to find those breaks. Also, there's a lot of superfluous code in this, such as repeatedly opening your output file, and generating num, which is never used.
Draw out your basic logic on paper, if needed. The critical step that you're missing is to check the previous animal against the current one, something like this:
previous = None
with open("test.txt") as zoo:
    for animal in zoo:
        if animal == previous:
            pass  # Process same animal
        else:
            pass  # Process new animal
        previous = animal  # remember animal for next iteration
Can you take it from there?
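If you get stuck, one possible way to fill in that skeleton is sketched below. It is not the only way to do it; split_by_previous, src and out_dir are made-up names, and the choice to keep a single output file open per run (instead of reopening it for every line) is an assumption about what you want.

```python
import os


def split_by_previous(src, out_dir="."):
    """Start a new numbered file whenever a line differs from the one before it."""
    previous = None
    count = 0
    out = None
    try:
        with open(src) as zoo:
            for animal in zoo:
                # compare without the line ending, so a final line that
                # lacks "\n" still matches the lines before it
                name = animal.rstrip("\n")
                if name != previous:
                    # new animal: close the old file and open the next one
                    if out is not None:
                        out.close()
                    count += 1
                    out = open(os.path.join(out_dir, "{}.txt".format(count)), "w")
                out.write(name + "\n")
                previous = name  # remember animal for next iteration
    finally:
        if out is not None:
            out.close()
    return count  # number of files written
```

With the file from the question, split_by_previous("test.txt") should return 4 and leave 1.txt to 4.txt in the current directory.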
Upvotes: 0