How to read one line with urllib.request

Question

I am trying to read one line of a web page with the urllib.request module.

I have tried readline(), readlines() and read() but I cannot make it read just one line.

How can I do this?

I am just trying to read the 581st line from python.org.

My script at the moment is:

import urllib.request

get_page = urllib.request.urlopen('https://www.python.org')
x = int('581')
get_ver = get_page.readline(x)

print("Current Versions Are: ", get_ver)

And the result of this is:

Current Versions Are:  b'\n'

The result is always the same even if I change the number.

So how do I just read the 581st line?

ShmulikA · Accepted Answer

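The argument to readline() is a size limit, not a line number: readline(x) reads at most x bytes of the next line and does not skip ahead to line x. That is why the result stays the same when you change the number.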
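To see the size-limit behaviour without touching the network, here is a minimal sketch. It assumes io.BytesIO as a stand-in for the HTTP response (both are file-like objects with the same readline(size) signature), and the three sample lines are made up for illustration.

```python
import io

# io.BytesIO stands in for the HTTP response here; the content is made up.
fake_page = io.BytesIO(b'first line\nsecond line\nthird line\n')

head = fake_page.readline(5)   # at most 5 bytes of the current line
rest = fake_page.readline()    # the remainder of that same line
second = fake_page.readline()  # the next full line

print(head, rest, second)  # b'first' b' line\n' b'second line\n'
```

Each call continues from where the previous one stopped, so a larger limit never moves the read position to a later line.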

The function below returns the n-th line while minimizing the amount of data read from the server (look into HTTP range requests if you need better performance):

import urllib.request

get_page = urllib.request.urlopen('https://www.python.org')

def get_nth_line(resp, n):
    # Read and discard the first n - 1 lines, then return line n (1-based).
    i = 1
    while i < n:
        resp.readline()
        i += 1
    return resp.readline()

print(get_nth_line(get_page, 574))

outputs:

b'Latest: Python 3.6.2 - Python 2.7.13\n'
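The same skip-then-read idea can also be written with itertools.islice, since file-like objects such as the urlopen response can be iterated line by line. This is only a sketch: nth_line is a hypothetical helper name, and io.BytesIO with made-up content stands in for the response so the snippet runs offline.

```python
import io
from itertools import islice

def nth_line(resp, n):
    """Return line n (1-based) of a file-like object, or None if it is shorter."""
    # islice skips the first n - 1 lines and stops after line n.
    return next(islice(resp, n - 1, n), None)

# Made-up content standing in for the HTTP response.
fake_page = io.BytesIO(b'one\ntwo\nthree\n')
print(nth_line(fake_page, 2))  # b'two\n'
```

Unlike the loop version, this returns None rather than an empty bytes object when the page has fewer than n lines.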

Suggestions

Use requests for HTTP requests instead of urllib:

requests.get('http://www.python.org').text

Use a regex or bs4 to parse the page and extract the Python version.

Requests & Regex Example

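import re
import requests

resp = requests.get('http://www.python.org')
# regex might need adjustments; this pattern matches text such as "Python 2.7.13"
ver_regex = re.compile(r'(Python 2\.\d+\.\d+)')
match = ver_regex.search(resp.text)
py2_ver = match.group(1) if match else None
print(py2_ver)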

outputs:

Python 2.7.13
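If you would rather parse the HTML than match against raw markup, the idea looks roughly like the sketch below. It uses the standard library's html.parser as a stand-in for bs4 so that it runs without extra installs. The TextExtractor class, the find_versions helper and the sample markup are all hypothetical; the real page structure will differ.

```python
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects the text content of a page and discards the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def find_versions(html):
    """Return every 'Python X.Y.Z' string found in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = ''.join(parser.chunks)
    return re.findall(r'Python \d+\.\d+\.\d+', text)

# Hypothetical markup, not copied from python.org.
sample = '<p>Latest: <a href="#">Python 3.6.2</a> - <a href="#">Python 2.7.13</a></p>'
print(find_versions(sample))  # ['Python 3.6.2', 'Python 2.7.13']
```

Searching the extracted text instead of a fixed line number keeps the script working when the page layout shifts by a few lines.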
