Roy

Reputation: 867

Reading a file by word without using split in Python

I have a one-line file that I want to read word by word, with spaces separating the words. Is there a way to do this without loading the whole file into memory and calling split? The file is too large for that.

Upvotes: 3

Views: 399

Answers (2)

Tooniis

Reputation: 123

Try this little function:

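def readword(file):
    """Return the next word from an open text file ('' at end of file)."""
    word = ''
    while True:
        c = file.read(1)
        # Stop at a space, a newline, or end of file (read() returns '').
        if c in ('', ' ', '\n'):
            return word
        word += c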

Then to use it, you can do something like:

f = open('file.ext', 'r')
print(readword(f))

This will read the first word in the file, so if your file is like this:

12 22 word x yy
another word
...

then the output should be 12.

Next time you call this function, it will read the next word, and so on...

Upvotes: 0

dlavila

Reputation: 1212

You can read the file character by character and yield a word each time you reach a space. Below is a simple solution for a file whose words are separated by single spaces; you should refine it for more complex cases (tabs, multiple spaces, etc.).

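def read_words(filename):
    with open(filename) as f:
        out = ''
        while True:
            c = f.read(1)  # read one character at a time
            if not c:
                # End of file: emit the final word, which has no trailing space.
                if out:
                    yield out
                break
            elif c == ' ':
                yield out
                out = ''
            else:
                out += c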

Example:

for word in read_words("test"):
    print(word)

It uses a generator, so only the current word is held in memory instead of the whole file.
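
If the character-by-character loop turns out to be too slow, or the file may contain tabs, newlines, or runs of spaces, one possible refinement is to read fixed-size chunks and split each chunk on whitespace, carrying any partial word over to the next chunk. This is only a sketch: the names iter_words and chunk_size are made up for illustration, and it does call split, but only on one small chunk at a time, so memory use stays bounded by the chunk size plus the longest word.

```python
import io

def iter_words(stream, chunk_size=64 * 1024):
    """Yield whitespace-separated words from a text stream, chunk by chunk."""
    tail = ''  # partial word carried over from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chunk = tail + chunk
        words = chunk.split()  # splits on spaces, tabs, newlines, and runs of them
        # If the chunk does not end in whitespace, its last word may be
        # incomplete, so hold it back until the next chunk arrives.
        if words and not chunk[-1].isspace():
            tail = words.pop()
        else:
            tail = ''
        yield from words
    if tail:
        yield tail  # the final word, which has no trailing whitespace

# Demo on an in-memory stream; a tiny chunk_size forces words to straddle chunks.
sample = io.StringIO("12 22\tword  x yy\nanother word")
for word in iter_words(sample, chunk_size=4):
    print(word)
```

With a real file you would pass the open file object instead, for example: with open("test") as f, then loop over iter_words(f).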

Upvotes: 2
