Reputation: 2687
from urllib.request import urlopen
from bs4 import BeautifulSoup
import datetime
import random
import re
random.seed(datetime.datetime.now())
def getLinks(articleUrl):
    html = urlopen("http://en.wikipedia.org" + articleUrl)
    bsObj = BeautifulSoup(html)
    return bsObj.find("div", {"id": "bodyContent"}).findAll("a", href=re.compile("^(/wiki/)((?!:).)*$"))
getLinks('http://en.wikipedia.org')
OS is Linux. The above script dies with a "urllib.error.URLError". I've looked through a number of attempted solutions that I found on Google, but none of them fixed my problem (attempts include changing the env variable and adding nameserver 8.8.8.8 to my resolv.conf file).
Upvotes: 2
Views: 9429
Reputation: 52203
You should call getLinks() with a valid url:

>>> getLinks('/wiki/Main_Page')
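The reason the original call fails: getLinks() prepends the domain itself, so passing a full URL produces a malformed address. A minimal illustration with plain string concatenation (no network access needed):

```python
base = "http://en.wikipedia.org"

# What getLinks('http://en.wikipedia.org') actually tried to fetch:
bad = base + "http://en.wikipedia.org"
print(bad)   # http://en.wikipedia.orghttp://en.wikipedia.org

# What getLinks('/wiki/Main_Page') fetches:
good = base + "/wiki/Main_Page"
print(good)  # http://en.wikipedia.org/wiki/Main_Page
```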
Besides, in your function, you should also call .read() to get the response content before passing it to BeautifulSoup:
>>> html = urlopen("http://en.wikipedia.org" + articleUrl).read()
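As a side note, the regex passed to findAll only keeps relative, internal article links: it requires the href to start with /wiki/ and contain no colon, which screens out namespace pages such as Category: or File:. A quick self-contained check of what it matches (the sample hrefs below are made up for illustration):

```python
import re

# The pattern from getLinks(): relative /wiki/ paths with no colon.
pattern = re.compile(r"^(/wiki/)((?!:).)*$")

hrefs = [
    "/wiki/Main_Page",             # plain article link
    "/wiki/Category:Physics",      # namespace page (contains a colon)
    "http://en.wikipedia.org",     # absolute URL, not a /wiki/ path
]

matched = [h for h in hrefs if pattern.match(h)]
print(matched)  # ['/wiki/Main_Page']
```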
Upvotes: 2