yoshiserry

Reputation: 21365

python read each line in a text file and put anything between START and END in a new file

I have a text file I've imported that has no blank lines and looks like this. Each item is on a separate line.

--START--some data
one line
two line
three line
--END--
four
five
--START-- some data
six 
seven
eight
--END--
nine 
ten
eleven
--START-- some data

What I want

I have already written code to open the file, loop through each line, and find the ones that contain --START--.

import codecs
file = codecs.open('data.txt', encoding='utf-8')
for line in file:
    if '--START--' in line:
        # found the start line (keep all lines until you find END)
        pass

What I don't know how to do is write the logic in Python so that each line that either begins with --START-- or follows it (up to, but not including, the --END-- line) goes into a new text file.

So I would end up with NewFile.txt, which would contain only:

--START--some data
one line
two line
three line
--START-- some data
six 
seven
eight
--START-- some data

Upvotes: 0

Views: 111

Answers (3)

Joran Beasley

Reputation: 114038

Do you mean something like this?

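with open('data.txt', encoding='utf-8') as src:  # text mode, so the str separators work on Python 3
    file_contents = src.read()
with open("newfile.txt", "w", encoding='utf-8') as f:
    # Keep each block from --START-- up to (not including) the next --END--
    f.write("--START--".join(p.split("--END--")[0] for p in file_contents.split("--START--")))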
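To see what the split/join approach above does, here is a minimal sketch on an in-memory string rather than a file. The function name `extract_blocks` and the shortened sample text are made up for illustration. Note one deliberate difference: this variant drops any text that appears before the first --START--, which the one-liner above would keep.

```python
def extract_blocks(text, start="--START--", end="--END--"):
    """Return every start..end block, keeping the start marker and dropping the end marker."""
    # The [1:] slice discards whatever precedes the first start marker.
    pieces = text.split(start)[1:]
    # Within each piece, keep only the part before the first end marker.
    return "".join(start + piece.split(end)[0] for piece in pieces)


# Shortened version of the sample data from the question (hypothetical).
sample = (
    "--START--some data\none line\n--END--\nfour\n"
    "--START-- some data\nsix\n--END--\nnine\n"
    "--START-- some data\n"
)
result = extract_blocks(sample)
print(result)
```

This reads the whole file into memory, so it is probably best suited to inputs that comfortably fit in RAM.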

Upvotes: 1

Padraic Cunningham

Reputation: 180482

from itertools import takewhile
with open("in.txt") as f:
    final = []
    for line in f:
        if line.startswith("--START--"):
            final += [line] + list(takewhile(lambda x: not x.startswith("--END--"),f))
print(final)
['--START--some data\n', 'one line\n', 'two line\n', 'three line\n', '--START-- some data\n', 'six \n', 'seven\n', 'eight\n', '--START-- some data']

To write the new data:

from itertools import takewhile
with open("in.txt") as f, open("out.txt", "w") as f1:
    for line in f:
        if line.startswith("--START--"):
            # Write the start line, then every following line up to (not including) --END--
            f1.write(line)
            f1.writelines(takewhile(lambda x: not x.startswith("--END--"), f))
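The takewhile version works because the inner call pulls from the same iterator as the outer loop, so the --END-- line is consumed and discarded before the outer loop resumes. A small sketch with a plain list iterator standing in for the file (the sample lines are made up for illustration):

```python
from itertools import takewhile

# A list iterator stands in for the open file object.
lines = iter([
    "--START--a\n", "one\n", "--END--\n",
    "four\n",
    "--START--b\n", "six\n",
])

kept = []
for line in lines:
    if line.startswith("--START--"):
        kept.append(line)
        # takewhile advances the shared iterator; the failing --END-- line
        # is consumed here, so the outer loop continues with the line after it.
        kept.extend(takewhile(lambda x: not x.startswith("--END--"), lines))

print(kept)
```

If `lines` were a list instead of an iterator, the outer loop would revisit the lines already taken, so the shared iterator is what makes this pattern work.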

Upvotes: 0

Jose Varez

Reputation: 2077

What about this?

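import codecs

startblock = False
with codecs.open('data.txt', encoding='utf-8') as src, \
        codecs.open('NewFile.txt', 'w', encoding='utf-8') as dst:
    for line in src:  # iterate over lines; calling .read() first would yield single characters
        if '--END--' in line:
            startblock = False
        elif '--START--' in line or startblock:
            # Write the start line and every line after it until --END--
            dst.write(line)
            startblock = True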
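For an end-to-end check of the flag-based approach, the sketch below writes a shortened sample to a temporary directory, runs the loop, and reads the result back. The function name `copy_blocks`, the temporary paths, and the sample text are assumptions made for illustration; it uses the built-in `open` rather than `codecs.open`.

```python
import os
import tempfile


def copy_blocks(src_path, dst_path, start="--START--", end="--END--"):
    """Copy each start line and the lines after it, up to but not including the end line."""
    in_block = False
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if end in line:
                in_block = False      # leave the block; the end line is not written
            elif start in line or in_block:
                dst.write(line)       # start line, or a line inside a block
                in_block = True


# Shortened version of the sample data from the question (hypothetical).
sample = (
    "--START--some data\none line\n--END--\nfour\n"
    "--START-- some data\nsix\n--END--\nnine\n"
    "--START-- some data\n"
)

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "data.txt")
    dst_path = os.path.join(tmp, "NewFile.txt")
    with open(src_path, "w", encoding="utf-8") as f:
        f.write(sample)
    copy_blocks(src_path, dst_path)
    with open(dst_path, encoding="utf-8") as f:
        output = f.read()

print(output)
```

Because it processes one line at a time, this version should also cope with files too large to read into memory at once.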

Upvotes: 0
