rafasalo

Reputation: 255

How to use next() starting from any line in Python?

I'm trying to start reading a file from line 3, but I can't get it to work.

I've tried using readlines() with the index of the line, as shown below:

x = 2
f = open('urls.txt', "r+").readlines( )[x]
line = next(f)
print(line)

but I get this result:

Traceback (most recent call last):
  File "test.py", line 441, in <module>
    line = next(f)
TypeError: 'str' object is not an iterator

I would like to be able to set the starting line as a variable, and from there, every time I call next() it should move on to the next line.

IMPORTANT: as this is a new feature and all my code already uses next(f), the solution needs to be able to work with it.

Upvotes: 1

Views: 682

Answers (5)

ron rothman

Reputation: 18127

Just call next(f) as many times as you need to. (There's no need to overcomplicate this with itertools, nor to slurp the entire file with readlines.)

lines_to_skip = 3

with open('urls.txt') as f:
    for _ in range(lines_to_skip):
        next(f)

    for line in f:
        print(line.strip())

Output:

% cat urls.txt
url1
url2
url3
url4
url5

% python3 test.py
url4
url5

Upvotes: 0

Mad Physicist

Reputation: 114230

The expression you assigned to f evaluates to a string, not a file object:

open('urls.txt', "r+").readlines()[x]

open returns a file object. Its readlines method returns a list of strings. Indexing with [x] (here x = 2) returns the third line in the file as a single string, and a string is not an iterator, so next fails.

The first problem is that you open the file without closing it. The second is that you index a single line instead of slicing from that line to the end of the file. Here's an incremental improvement:

with open('urls.txt', 'r+') as f:
    lines = f.readlines()[x:]

Now lines is a list of all the lines you want. But you first read the whole file into memory, then discarded the first two lines. Also, a list is an iterable, not an iterator, so to use next on it effectively, you'd need to take an extra step:

lines = iter(lines)

If you want to harness the fact that the file is already a rather efficient iterator, apply next to it as many times as you need to discard unwanted lines:

with open('urls.txt', 'r+') as f:
    for _ in range(x):
        next(f)
    # now use the file
    print(next(f))

After the for loop, any read operation you do on the file will start from the third line, whether it be next(f), f.readline(), etc.
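
A minimal sketch of the skipping step as a reusable helper. The name skip_lines is hypothetical (it is not from the question), and io.StringIO stands in for the real file so the snippet runs on its own. Passing a default to next() avoids a StopIteration when the file has fewer lines than you ask to skip.

```python
import io


def skip_lines(f, count):
    """Advance f past up to `count` lines; return how many were actually skipped."""
    skipped = 0
    for _ in range(count):
        # next() with a default returns None at end of file instead of raising.
        if next(f, None) is None:
            break
        skipped += 1
    return skipped


# Stand-in for open('urls.txt'); any text file object behaves the same way here.
f = io.StringIO("url1\nurl2\nurl3\nurl4\n")
skip_lines(f, 2)

first = next(f)         # continues from the third line
second = f.readline()   # readline() picks up where next() left off
```

Existing code that calls next(f) keeps working unchanged after the helper has run, since the file object itself is still the iterator.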

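There are a few other ways to skip the first lines. Wherever next(f) is used to discard a line, including the example above, f.readline() works as well; the difference is that readline() returns an empty string at the end of the file instead of raising StopIteration.

for n, _ in enumerate(f):
    # n is zero-based, so x lines have been consumed once n == x - 1 (assumes x >= 1)
    if n == x - 1:
        break

or

# range(x) goes first so that zip stops before pulling an extra line from f
for _ in zip(range(x), f): pass

After you run either of these loops, next(f) will return the line at index x (the third line when x = 2).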

Upvotes: 0

Aditya Satyavada

Reputation: 1058

The following code will allow you to use an iterator to print the first line:

In [1]: path = '<path to text file>'                                                           

In [2]: f = open(path, "r+")                                                    

In [3]: line = next(f)

In [4]: print(line)

This code will allow you to print the lines starting from the line at index x (the third line when x = 2):

In [1]: path = '<path to text file>'

In [2]: x = 2

In [3]: f = iter(open(path, "r+").readlines()[x:])

In [4]: line = next(f)

In [5]: print(line)

Edit: Edited the solution based on @Tomothy32's observation.

Upvotes: 0

iz_

Reputation: 16573

Try this (uses itertools.islice):

from itertools import islice

f = open('urls.txt', 'r+')
start_at = 3
file_iterator = islice(f, start_at - 1, None)

# to demonstrate
while True:
    try:
        print(next(file_iterator), end='')
    except StopIteration:
        print('End of file!')
        break

f.close()

urls.txt:

1
2
3
4
5

Output:

3
4
5
End of file!

This solution is better than readlines because it doesn't load the entire file into memory and only reads lines as they are needed. islice still has to read past the skipped lines, but it does so internally rather than in a Python-level loop like the one in @MadPhysicist's answer, which may make it somewhat faster.

Also, consider using the with syntax to guarantee the file gets closed:

with open('urls.txt', 'r+') as f:
    ...  # do whatever
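
Putting the two ideas together, here is a minimal sketch that uses islice inside a with block. The temporary directory and sample contents are only there so the snippet runs on its own; in real code you would open your existing urls.txt instead.

```python
import os
import tempfile
from itertools import islice

start_at = 3  # 1-based number of the first line to read

with tempfile.TemporaryDirectory() as tmp:
    # Create a small sample file (assumption: stands in for the real urls.txt).
    path = os.path.join(tmp, "urls.txt")
    with open(path, "w") as out:
        out.write("1\n2\n3\n4\n5\n")

    with open(path) as f:
        # islice skips the first start_at - 1 lines lazily, then yields the rest.
        file_iterator = islice(f, start_at - 1, None)
        first = next(file_iterator)
        rest = [line.strip() for line in file_iterator]
    # The file is closed here, even if an exception was raised inside the block.
```

Note that file_iterator is only usable while the with block is open, because it reads from f on demand.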

Upvotes: 3

Draconis

Reputation: 3461

The readlines method returns a list of strings for the lines. So when you take readlines()[2] you're getting the third line, as a string. Calling next on that string then makes no sense, so you get an error.

The easiest way to do this is to slice the list: readlines()[x:] gives a list of everything from line x onwards. Then you can use that list however you like.

If you have your heart set on an iterator, you can turn a list (or pretty much anything) into an iterator with the iter builtin function. Then you can next it to your heart's content.
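
A minimal sketch of both steps, with io.StringIO standing in for the opened file so it runs without urls.txt on disk (the sample contents are an assumption):

```python
import io

x = 2  # zero-based index of the first line to keep

# Stand-in for open('urls.txt'); readlines() behaves the same on a real file.
f = io.StringIO("url1\nurl2\nurl3\nurl4\n")

lines = f.readlines()[x:]   # list of every line from index x onwards
line_iter = iter(lines)     # a list is iterable, but next() needs an iterator

first = next(line_iter)
second = next(line_iter)
after_end = next(line_iter, None)  # the default avoids StopIteration when exhausted
```

Keep in mind that readlines() reads the whole file into memory before the slice is taken, which is fine for small files.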

Upvotes: 0
