Reputation: 21
I am new to Python and have a little experience programming in C++. I saw this question but it doesn't address my problem.
Python 2.7.9, 64-bit AMD, Windows 7 Ultimate, NTFS, administrator privileges & no "read only" attribute on file to be read.
I want to create a list of strings that fulfill a certain criterion; the strings are lines of the file (see notepad.cc/diniko93). So I wrote the following function:
def makeLineList( filePtr, ptr ):
    lines = []
    while True:
        s = filePtr.readline()
        if not s=="":
            s = s[3:]
            s = s.split()
            if s[0].isdigit():
                print("O")
                lines.append(s)
            elif s[0] in {"+", "-"}:
                print("U")
                lines.append(s)
            else:
                print("none")
                break
    filePtr.seek(ptr, 0) #I did this to restore file pointer, so other functions accessing this file later don't misbehave
    return lines
The two main()-like bodies (pardon my ignorance of Python) that I am using are:
with open("./testStage1.txt", 'r') as osrc:
    osrc.seek(291, 0)
    L = makeLineList( osrc, osrc.tell())
    print "".join(L)
and the other one:
osrc = open("./testStage1.txt", 'r')
osrc.seek(291, 0)
L = makeLineList( osrc, osrc.tell())
print "".join(L)
osrc.close()
Both times, the output on the terminal is a disappointing none
Please note that the code above is the minimum required to reproduce the problem, not the entire code.
EDIT:
Based on @avenet's suggestion, I googled and tried to use the iterator protocol (obj.next() in Python 2, obj.__next__() in Python 3, or the built-in next(obj) in either) in my code, but the problem persists: I am unable to read the next line even if I call next(osrc) from inside the function. Check out these 2 snippets.
EDIT 2: I tried a scope check inside my functions with if not osrc in locals(): followed, on the next line with proper indentation, by print("osrc not reachable"). The output is osrc not reachable. I also tried using from tLib import transform_line from a temporary tLib.py, but with identical results. Why is osrc not available in either case?
EDIT 3: The problem appears to be one of scope. So, to avoid passing the file variable around, I made a function whose sole purpose is to read lines. The decision whether to get the next line depends on the value returned by a function like isLineUseful():
def isLineUseful( text, lookFor ):
    if text.find(lookFor)!=-1:
        return 1
    else:
        return 0
def makeList( pos, lookFor ):
    lines = []
    with open("./testStage1.txt", 'r') as src:
        src.seek(pos)
        print(src.read(1))
        while True:
            line = next(src)
            again = isLineUseful(line, lookFor)
            if again==0:
                src.seek(pos)
                break
            else:
                lines.append(line)
    return lines
t = makeList(84, "+")
print "\n".join(t)
I tried it, and it works perfectly on this (notepad.cc/diniko93) sample testStage1.txt.
So my programming issue is solved (thanks to responders :D) and I am marking this as answered, but I am posting a new question about the anomalous behavior of readline() and __next__.
P.S. I am still learning the ways of Python, so I would be very happy if you could suggest a more Pythonic and idiomatic version of my code above.
Upvotes: 0
Views: 4679
Reputation: 3043
If you want to modify the lines your way:
def transform_line(line):
    if line != "":
        if line[0].isdigit():
            print("O")
        elif line[0] in {"+", "-"}:
            print("U")
        else:
            print("None")
    return line
with open("./testStage1.txt", 'r') as osrc:
    osrc.seek(291)
    lines = [transform_line(line) for line in osrc]
    #Do whatever you need with your line list
If you don't want to transform lines just do this:
with open("./testStage1.txt", 'r') as osrc:
    osrc.seek(291)
    lines = list(osrc)
    #Do whatever you need with your line list
Or just implement a line iterator if you need to stop on a certain condition:
def line_iterator(file):
    for line in file:
        #Keep yielding until a line starts with a digit, "+" or "-"
        if not line[0].isdigit() and line[0] not in ("+", "-"):
            yield line
        else:
            break
with open("./testStage1.txt", 'r') as osrc:
    osrc.seek(291)
    lines = list(line_iterator(osrc))
    #To skip lines from the list containing 'blah'
    lines = [x for x in lines if 'blah' not in x]
    #Do whatever you need with your line list
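A stop-on-condition iterator like the one above can also be sketched with itertools.takewhile from the standard library. This is only an illustration: the names is_list_line and leading_list_lines and the in-memory sample text are made up here, and the condition (keep lines that start with a digit, "+" or "-") is an assumption based on the question, not part of the answer's code.

```python
import io
from itertools import takewhile


def is_list_line(line):
    """Return True when the first non-blank character is a digit, '+' or '-'."""
    stripped = line.strip()
    return bool(stripped) and (stripped[0].isdigit() or stripped[0] in "+-")


def leading_list_lines(lines):
    """Collect lines from an iterable until the first one that is not a list line."""
    return list(takewhile(is_list_line, lines))


# Hypothetical sample data standing in for the real file.
sample = io.StringIO(u"1. abc\n+ cba\n+ xyz\nplain text\n+ ignored\n")
result = leading_list_lines(sample)
print(result)
```

takewhile stops at the first line that fails the test, so the "+ ignored" line after "plain text" is never collected.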
Upvotes: 1
Reputation: 15837
First of all, you are not using Python as it is meant to be used. One reason to use a language like Python is that it takes fewer lines of code to achieve the same result than languages such as C++ or Java.
It's not necessary to pass a file pointer as a function parameter to read the file; you can pass the filename to the function and open the file directly inside it.
Then you can call this function with the file name and store the list in a variable that you will eventually manipulate. If you are not familiar with exception handling, you could for example use a function from the os module to check whether the file exists: os.path.exists(filename).
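Here is a minimal sketch of the os.path.exists check. The helper name read_lines_if_present and the temporary file are made up for the demonstration; they are not taken from the question.

```python
import os
import tempfile


def read_lines_if_present(filename):
    """Return the file's lines, or an empty list when the file does not exist."""
    if not os.path.exists(filename):
        return []
    with open(filename) as handle:
        return list(handle)


# Demonstration with a temporary file (the path is generated, not from the question).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("first\nsecond\n")

present = read_lines_if_present(path)   # file exists at this point
os.remove(path)
missing = read_lines_if_present(path)   # file is gone now
```

Note that the file could still disappear between the check and the open call, so catching the exception is the more robust choice once you are comfortable with it.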
If you want to search for a pattern in the line you are currently using, you can simply use an if statement (there are a lot of ways of doing that, this is just an example):
if line not in list_of_strings_you_want_not_to_include:
    lines.append(line)
If you want to check whether the pattern is at the beginning, you can use the startswith string method on the line:
if not line.startswith("+"):
    lines.append(line)
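If there are several prefixes to test, str.startswith also accepts a tuple, so one call covers them all. The following is just a sketch; the function name keep_lines and the sample list are invented for illustration.

```python
def keep_lines(lines, prefixes=("+", "-")):
    """Return the lines that do not start with any of the given prefixes."""
    # str.startswith accepts a tuple, so several prefixes are checked in one call.
    return [line for line in lines if not line.startswith(prefixes)]


kept = keep_lines(["+ cba", "1. abc", "- xyz", "plain"])
print(kept)
```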
If you want to skip a certain number of characters, you can use the seek method (as you are already doing). The following takes a few more lines of code, but it is still very simple:
def read_file(filename, _from):
    lines = []
    try:
        with open(filename) as file:
            file.seek(_from)
            for line in file:
                lines.append(line)
    except IOError:
        # FileNotFoundError only exists in Python 3; IOError also works on Python 2.7
        print('file not found')
    return lines
filename = "file.txt"
lines = read_file(filename, 10)
Even simpler, instead of iterating explicitly over the lines, you can do this inside the function:
with open(filename) as file:
    file.seek(_from)
    return list(file)
Or use your favourite function, readlines:
with open(filename) as file:
    file.seek(_from)
    return file.readlines()
The advantage of iterating explicitly over the lines is that you can check or transform each line at the moment you read it, so I would certainly adopt the first option I suggested above.
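To illustrate checking each line while reading, here is a small sketch that takes the test as a parameter. The name read_matching, the keep callback, and the in-memory sample are assumptions made for the example; with a real file you would pass the object returned by open().

```python
import io


def read_matching(stream, offset, keep):
    """Seek to offset, then return the lines for which keep(line) is true."""
    stream.seek(offset)
    lines = []
    for line in stream:
        if keep(line):                      # any per-line check can go here
            lines.append(line.rstrip("\n"))
    return lines


# Hypothetical sample: the first 8 characters ("skip me" plus newline) are skipped.
stream = io.StringIO(u"skip me\n+ cba\nnoise\n+ xyz\n")
plus_lines = read_matching(stream, 8, lambda line: line.startswith("+"))
print(plus_lines)
```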
Upvotes: 2
Reputation: 328594
You are trying to process this input:
<P> unnecessart line </P>
<P> Following is an example of list </P>
<P> 1. abc </P>
<P> + cba </P>
<P> + cba </P>
<P> + xyz </P>
In your head, you see only the important bits, but Python sees everything. For Python (and any other programming language), each line starts with <. That's why the if conditions never match.
If you strip the <P>, be sure to strip the spaces as well, because in
1. abc
 + cba
the second line starts with a space, so s[0] isn't +. To strip spaces, use s.strip().
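Putting the two steps together, a rough sketch could look like this. The function name classify is invented here, the "O" and "U" labels are borrowed from the question's print calls, and the sample lines are copied from the input shown above.

```python
def classify(line):
    """Return 'O' for numbered items, 'U' for +/- items, otherwise None."""
    text = line.strip()
    # Remove the surrounding <P> ... </P> markers if both are present.
    if text.startswith("<P>") and text.endswith("</P>"):
        text = text[len("<P>"):-len("</P>")]
    text = text.strip()                     # drop the spaces left around the content
    if not text:
        return None
    if text[0].isdigit():
        return "O"
    if text[0] in "+-":
        return "U"
    return None


sample = [
    "<P> Following is an example of list </P>",
    "<P> 1. abc </P>",
    "<P> + cba </P>",
]
labels = [classify(line) for line in sample]
print(labels)
```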
Upvotes: 0