Reputation: 867
I have a one-line file that I want to read word by word, i.e., with spaces separating the words. Is there a way to do this without loading all the data into memory and calling split? The file is too large for that.
Upvotes: 3
Views: 399
Reputation: 123
Try this little function:
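def readword(file):
    # Read one character at a time until a space, a newline, or end of file.
    word = ''
    while True:
        c = file.read(1)
        # read(1) returns '' at end of file; without this check the loop never ends
        if not c or c == ' ' or c == '\n':
            return word
        word += c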
Then to use it, you can do something like:
f = open('file.ext', 'r')
print(readword(f))
This will read the first word in the file, so if your file is like this:
12 22 word x yy
another word
...
then the output should be 12.
Next time you call this function, it will read the next word, and so on...
Upvotes: 0
Reputation: 1212
You can read the file character by character and yield a word each time you hit a space. Below is a simple solution for a file whose words are separated by single spaces; you should refine it for more complex cases (tabs, multiple spaces, etc.).
def read_words(filename):
    with open(filename) as f:
        out = ''
        while True:
            c = f.read(1)
            if not c:
                # End of file
                break
            elif c == ' ':
                yield out
                out = ''
            else:
                out += c
        # Yield the last word if the file does not end with a space
        if out:
            yield out
Example:
for i in read_words("test"):
    print(i)
It uses a generator to avoid having to allocate a big chunk of memory.
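For the more complex cases (tabs, newlines, runs of several spaces), one possible refinement is to read the file in fixed-size chunks and split each chunk on any whitespace, carrying over a word that is cut off at a chunk boundary. The sketch below is only an illustration: the name iter_words and the chunk_size parameter are made up for this example, and it assumes a text-mode file object.

```python
import io


def iter_words(stream, chunk_size=64 * 1024):
    """Yield whitespace-separated words from a text stream, one chunk at a time."""
    tail = ''  # possibly incomplete word carried over from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of file
        buf = tail + chunk
        parts = buf.split()  # splits on any run of whitespace
        if not parts:
            tail = ''
            continue
        if buf[-1].isspace():
            tail = ''
        else:
            # The last word may continue in the next chunk, so hold it back
            tail = parts.pop()
        yield from parts
    if tail:
        yield tail


# Small in-memory demo; a tiny chunk_size forces words to span chunk boundaries
demo = io.StringIO("12 22  word\tx yy\nanother word")
words = list(iter_words(demo, chunk_size=4))
print(words)
```

With a real file you would pass the object returned by open(...) instead of the StringIO. Memory use stays at roughly one chunk plus the longest single word, rather than the whole file.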
Upvotes: 2