Reputation: 479
I have a file called list.txt which looks like so:
input1
input2
input3
I am certain there is no blank line after the last line (input3). I then have a Python script which reads this file line by line and inserts each line into a block of text to create 3 files (one for each line):
import os
os.chdir("/Users/user/Desktop/Folder")
with open('list.txt','r') as f:
    lines = f.read().split('\n')
for l in lines:
    header = "#!/bin/bash \n#BSUB -J %s.sh \n#BSUB -o /scratch/DBC/user/%s.sh.out \n#BSUB -e /scratch/DBC/user/%s.sh.err \n#BSUB -n 1 \n#BSUB -q normal \n#BSUB -P DBCDOBZAK \n#BSUB -W 168:00\n"%(l,l,l)
    script = "cd /scratch/DBC/user\n"
    script2 = 'grep "input" %s > result.%s.txt\n'%(l,l)
    all= "\n".join([header,script,script2])
    with open('script_{}.sh'.format(l), 'w') as output:
        output.write(all)
My problem is that this creates 4 files, not 3: script_input1.sh, script_input2.sh, script_input3.sh and script_.sh. This last file has no text where the others would have input1 or input2 or input3.
It seems that Python reads my list.txt line by line, but when it reaches "input3", it somehow continues? How can I tell Python to read my file line by line, separated by "\n" but stop after the last text?
Upvotes: 1
Views: 2345
Reputation: 77902
First, don't read the whole file into memory when you don't have to. Files are iterable, so the proper way to read a file line by line is:
with open("/path/to/file.ext") as f:
    for line in f:
        do_something_with(line)
Now in your for loop, you just have to strip the line and, if it's empty, ignore it:
with open("/path/to/file.ext") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        do_something_with(line)
Slightly unrelated but Python has multiline strings, so you don't need concatenation either:
# not sure I got it right actually ;)
# the backslash suppresses the leading newline so "#!/bin/bash" is the first line of the file
script_tpl = """\
#!/bin/bash
#BSUB -J {line}.sh
#BSUB -o /scratch/DBC/user/{line}.sh.out
#BSUB -e /scratch/DBC/user/{line}.sh.err
#BSUB -n 1
#BSUB -q normal
#BSUB -P DBCDOBZAK
#BSUB -W 168:00
cd /scratch/DBC/user
grep "input" {line} > result.{line}.txt
"""
with open("/path/to/file.ext") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        script = script_tpl.format(line=line)
        with open('script_{}.sh'.format(line), 'w') as output:
            output.write(script)
As a last note: avoid changing directories in your script; use os.path.join() to build absolute paths instead.
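A minimal sketch of that last note, assuming a hypothetical helper write_scripts() and a base_dir parameter (neither appears in the question). Every path is built with os.path.join(), so the working directory never changes. The script body written here is only a placeholder.

```python
import os
import tempfile


def write_scripts(base_dir, list_name="list.txt"):
    """Write one script_<name>.sh per non-empty line of list_name.

    base_dir and list_name are illustrative parameters, not from the question.
    """
    written = []
    with open(os.path.join(base_dir, list_name)) as f:
        for line in f:
            name = line.strip()
            if not name:
                continue  # skip blank lines, including a trailing one
            out_path = os.path.join(base_dir, "script_{}.sh".format(name))
            with open(out_path, "w") as output:
                output.write("#!/bin/bash\n")  # placeholder body
            written.append(out_path)
    return written


# Demo in a throwaway directory so nothing real is touched.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "list.txt"), "w") as f:
    f.write("input1\ninput2\ninput3\n")

created = write_scripts(demo_dir)
print([os.path.basename(p) for p in created])
```

Passing the directory in as an argument also makes the function easy to try against a temporary folder, as the demo does.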
Upvotes: 3
Reputation: 121
Have you considered using readlines() instead of read()? That lets Python handle the question of whether the last line ends with a \n.
Bear in mind that if the input file does have a \n on the final line, then using read() and splitting by '\n' will create an extra value. For example:
my_string = 'one\ntwo\nthree\n'
my_list = my_string.split('\n')
print(my_list)
# >> ['one', 'two', 'three', '']
Potential solution:
lines = f.readlines()
# remove newlines
lines = [line.strip() for line in lines]
# remove any empty values, just in case (list() is needed on Python 3, where filter returns an iterator)
lines = list(filter(bool, lines))
For a simple example, see here: How do I read a file line-by-line into a list?
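As a rough check of the readlines() approach, the sketch below runs it on input with and without a final newline. The helper name clean_lines is made up for this example, and io.StringIO stands in for an open text file.

```python
import io


def clean_lines(f):
    """Return the non-empty lines of a file-like object, newlines stripped."""
    stripped = [line.strip() for line in f.readlines()]
    return [line for line in stripped if line]


# io.StringIO stands in for an open text file (an assumption for the demo).
with_newline = clean_lines(io.StringIO("one\ntwo\nthree\n"))
without_newline = clean_lines(io.StringIO("one\ntwo\nthree"))

print(with_newline)     # ['one', 'two', 'three']
print(without_newline)  # ['one', 'two', 'three']
```

Both inputs give the same three items, so the trailing newline no longer matters.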
Upvotes: 1
Reputation: 530930
f.read() returns a string that ends with a newline, which split dutifully treats as separating the last line from an empty string. It's not clear why you are reading the entire file into memory explicitly; just iterate over the file object and let it deal with line-splitting.
with open('list.txt','r') as f:
    for l in f:
        # ...
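To make the difference concrete, here is a small sketch comparing the two approaches on the same content. io.StringIO stands in for list.txt, and the content assumes the file ends with a single newline, as most editors save it.

```python
import io

content = "input1\ninput2\ninput3\n"  # assumed file content, ending in a newline

# read() + split('\n'): the final newline produces a trailing empty string.
via_split = io.StringIO(content).read().split("\n")

# Iterating the file object: one item per line, newline still attached,
# so it is removed with rstrip before use.
via_iteration = [l.rstrip("\n") for l in io.StringIO(content)]

print(len(via_split), via_split)          # 4 ['input1', 'input2', 'input3', '']
print(len(via_iteration), via_iteration)  # 3 ['input1', 'input2', 'input3']
```

Note that each line from the iteration still carries its '\n', so it needs stripping before it goes into a filename.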
Upvotes: 1
Reputation: 1227
I think you are using split wrong.
If you have the following:
text = 'xxx yyy'
text.split(' ') # or simply text.split()
The result will be
['xxx', 'yyy']
Now if you have:
text = 'xxx yyy ' # extra space at the end
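text.split(' ')
The result will be
['xxx', 'yyy', '']
because split(' ') returns what is before and after each ' ' (space), and in this case there is an empty string after the last space. Note that text.split() with no argument behaves differently: it splits on runs of whitespace and discards empty strings, so it would return ['xxx', 'yyy'] here.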
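The separator argument matters here, so a short side-by-side may help. This sketch only uses str.split and str.strip on the same example string.

```python
text = "xxx yyy "  # extra space at the end

with_sep = text.split(" ")          # explicit separator keeps the empty string
no_sep = text.split()               # no argument: whitespace runs, empties dropped
stripped = text.strip().split(" ")  # strip first, then split on the separator

print(with_sep)  # ['xxx', 'yyy', '']
print(no_sep)    # ['xxx', 'yyy']
print(stripped)  # ['xxx', 'yyy']
```

The same applies to '\n': split('\n') keeps the trailing empty string, which is where the extra script_.sh file comes from.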
Some functions you might use:
strip([chars]) # removes any of the given chars from the beginning and end of a string (whitespace by default)
Example:
text = '___text_about_something___'
text.strip('_')
The result will be:
'text_about_something'
For your particular question, you can simply do:
lines = f.readlines() # read all lines of the file; each one keeps its trailing '\n'
for l in lines:
    l = l.strip() # remove the '\n' and any extra whitespace at the start or end of the line
Upvotes: 0
Reputation: 6969
Using your current approach, you'll want to:
- check whether the last element of lines is empty (lines[-1] == '')
- if so, remove it (lines = lines[:-1]).
with open('list.txt','r') as f:
    lines = f.read().split('\n')
if lines[-1] == '':
    lines = lines[:-1]
for line in lines:
    print(line)
Don't forget that it's legal for a file not to end in a newline. This check handles that scenario too, because it only removes the last element when it is empty.
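A quick sketch of that check applied to both kinds of input, wrapped in a hypothetical helper named split_lines so the two cases can be compared directly:

```python
def split_lines(text):
    """Split on a newline and drop the one empty string a final newline leaves."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines = lines[:-1]
    return lines


ending_newline = split_lines("input1\ninput2\ninput3\n")
no_ending_newline = split_lines("input1\ninput2\ninput3")

print(ending_newline)     # ['input1', 'input2', 'input3']
print(no_ending_newline)  # ['input1', 'input2', 'input3']
```

Only the single trailing empty string is removed; blank lines in the middle of the file are kept as empty strings.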
Alternatively, as @setsquare pointed out, you might want to try using readlines():
with open('list.txt', 'r') as f:
    lines = [ line.rstrip('\n') for line in f.readlines() ]
for line in lines:
    print(line)
Upvotes: 1