Reputation: 31739
Is there any way to remove everything between two lines that contain two specific strings?
I mean: I want to remove everything from 'heaven' to 'hell' in a text file containing this text:
I'm in heaven
foobar
I'm in hell
After running the script/function I'm asking for, the text file should be empty.
Upvotes: 0
Views: 3463
Reputation: 223062
Use a flag to indicate whether you're writing or not.
from __future__ import with_statement  # only needed on Python 2.5
import os

writing = True
with open('myfile.txt') as f:
    with open('output.txt', 'w') as out:
        for line in f:
            if writing:
                if "heaven" in line:
                    writing = False  # start skipping at the opening line
                else:
                    out.write(line)
            elif "hell" in line:
                writing = True  # resume after the closing line
os.remove('myfile.txt')
os.rename('output.txt', 'myfile.txt')
EDIT
As extraneon pointed out in the comments, the requirement is to remove the lines between two concrete strings. That means that if the second (closing) string is never found, nothing should be removed. That can be achieved by keeping a buffer of lines. The buffer is discarded if the closing string "I'm in hell" is found, but if the end of the file is reached without finding it, the whole contents must be written to the file.
Example:
I'm in heaven
foo
bar
This should keep the whole contents, since there's no closing tag and the question says between two lines.
Here's an example that does that, for completeness:
from __future__ import with_statement  # only needed on Python 2.5
import os

writing = True
with open('myfile.txt') as f:
    with open('output.txt', 'w') as out:
        for line in f:
            if writing:
                if "heaven" in line:
                    writing = False
                    buffer = [line]  # hold lines back until "hell" shows up
                else:
                    out.write(line)
            elif "hell" in line:
                writing = True  # closing line found: the buffer is dropped
            else:
                buffer.append(line)
        else:
            if not writing:
                # There wasn't a closing "I'm in hell", so write buffer contents
                out.writelines(buffer)
os.remove('myfile.txt')
os.rename('output.txt', 'myfile.txt')
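The buffering idea can also be tried without touching any files. The following is only a sketch: the function name `drop_blocks` and its parameters are made up for illustration and are not part of the answer above. It takes any iterable of lines and yields the ones that should be kept.

```python
def drop_blocks(lines, opening="heaven", closing="hell"):
    """Yield lines, skipping each block from an opening line through its
    closing line. A block that is never closed is kept in full."""
    buffered = None  # None while outside a block, else the lines held back
    for line in lines:
        if buffered is None:
            if opening in line:
                buffered = [line]
            else:
                yield line
        elif closing in line:
            buffered = None  # block is complete: discard it
        else:
            buffered.append(line)
    if buffered is not None:
        # Reached the end without a closing line: give the lines back.
        for line in buffered:
            yield line
```

With a file, `out.writelines(drop_blocks(f))` would play the role of the loop above.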
Upvotes: 3
Reputation: 31739
See below. I don't know if it's right, but it seems to be working.
import os
import re

# `path` is the root directory to process; it must be set before this runs.
pattern = re.compile('Im in heaven.*?Im in hell', re.I | re.S)
for dirpath, dirs, files in os.walk(path):
    for filename in files:
        fullpath = os.path.join(dirpath, filename)
        f = open(fullpath, 'r')
        data = f.read()
        f.close()
        # re.S lets "." span newlines, so the match can cover several lines
        data = pattern.sub("", data)
        f = open(fullpath, 'w')
        f.write(data)
        f.close()
Anyway, when I execute it, it leaves a blank line. I mean, if I have this function:
public function preFetchAll(Doctrine_Event $event){
    //Im in heaven
    $a = sfContext::getInstance()->getUser()->getAttribute("passw.formulario");
    var_dump($a);
    //Im in hell
    foreach ($this->_listeners as $listener) {
        $listener->preFetchAll($event);
    }
}
and I execute my script, I get this:
public function preFetchAll(Doctrine_Event $event){

    foreach ($this->_listeners as $listener) {
        $listener->preFetchAll($event);
    }
}
As you can see there is an empty line between "public..." and "foreach...".
Why?
Javi
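A likely explanation: the pattern ends right after "Im in hell", so the newline that finished that line is not part of the match and stays in the file, as does whatever came before "Im in heaven" on its own line. Below is a minimal sketch of one way around it, assuming the markers always appear in pairs and that the whole marker lines should go. The names `BLOCK` and `strip_block` are made up for this example.

```python
import re

# [^\n]* on both sides swallows the rest of each marker line (indentation,
# the leading "//"), and the final \n? also takes the trailing newline.
BLOCK = re.compile(r"[^\n]*Im in heaven.*?Im in hell[^\n]*\n?", re.I | re.S)


def strip_block(text):
    """Return text with every marked block, marker lines included, removed."""
    return BLOCK.sub("", text)
```

In the script above this would replace the `sub` call, e.g. `data = strip_block(data)`.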
Upvotes: -1
Reputation: 882231
Looks like by "remove" you mean "rewrite the input file in-place" (or make it look like you're so doing;-), in which case fileinput.input helps:
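import fileinput
import sys

writing = True
for line in fileinput.input(['thefile.txt'], inplace=True):
    # with inplace=True, standard output is redirected into the file
    if writing:
        if 'heaven' in line:
            writing = False
        else:
            sys.stdout.write(line)  # same effect as Python 2's "print line,"
    else:
        if 'hell' in line:
            writing = True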
Upvotes: 1
Reputation: 5072
You could do something like the following with regular expressions. There are probably more efficient ways to do it, since I'm still learning a lot of Python, but this should work.
import re

f = open('hh_remove.txt')
lines = f.readlines()
f.close()

pattern1 = re.compile("heaven", re.I)
pattern2 = re.compile("hell", re.I)
start = None  # index of the pending opening line, if any
i = 0
while i < len(lines):
    if start is None and pattern1.search(lines[i]) is not None:
        start = i
    if start is not None and pattern2.search(lines[i]) is not None:
        # Delete the opening line through the closing line, then resume
        # scanning at the first line that followed the block.
        del lines[start:i + 1]
        i = start
        start = None
    else:
        i += 1

out = open('hh_remove.txt', 'w')
out.write("".join(lines))
out.close()
Upvotes: 0
Reputation: 11167
I apologize but this sounds like a homework problem. We have a policy on these: https://meta.stackexchange.com/questions/10811/homework-on-stackoverflow
However, what I can say is that the feature @nosklo wrote about is available in any Python 2.5.x (or newer), but you need to learn enough Python to enable it. :-)
My solution would involve creating a new string with the undesired stuff stripped out, using str.find() or str.index() (or some relative of those two).
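For readers who want to see the shape of that approach, here is a rough sketch built on str.find() and str.rfind(). The function name `remove_between` is hypothetical, and it only handles the first opening/closing pair, so treat it as a starting point.

```python
def remove_between(text, start_marker, end_marker):
    """Drop the lines from the one containing start_marker through the one
    containing end_marker. Return text unchanged if either is missing."""
    start = text.find(start_marker)
    if start == -1:
        return text
    end = text.find(end_marker, start + len(start_marker))
    if end == -1:
        return text
    # Widen the cut to whole lines. rfind returns -1 when the opening
    # marker is on the first line, which makes line_start 0.
    line_start = text.rfind("\n", 0, start) + 1
    newline = text.find("\n", end)
    line_end = len(text) if newline == -1 else newline + 1
    return text[:line_start] + text[line_end:]
```

Reading the file into one string, calling this, and writing the result back covers the example in the question.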
Best of luck!
Upvotes: -1