Reputation: 1449
I'm trying to get this web-scraping sample (using BeautifulSoup) to work:
import urllib2
wiki = "https://en.wikipedia.org/wiki/List_of_state_and_union_territory_capitals_in_India"
page = urllib2.urlopen(wiki)
from bs4 import BeautifulSoup
soup = BeautifulSoup(page)
I get this error:
URLError: <urlopen error [Errno 10061] No connection could be made because the target machine actively refused it>
I suspect a firewall or some other security restriction is blocking the connection. What should I do to fix it?
Upvotes: 1
Views: 1232
Reputation: 17074
You can try something like this with requests:
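import requests
from bs4 import BeautifulSoup
wiki = "https://en.wikipedia.org/wiki/List_of_state_and_union_territory_capitals_in_India"
# Download the raw HTML of the page
page = requests.get(wiki).content
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(page, "html.parser")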
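If the refusal does come from a firewall or a mandatory proxy, as the question suspects, it may help to catch the failure and pass proxy settings explicitly. This is only a minimal sketch: the helper name fetch_html is made up for illustration, and proxy.example.com:8080 is a placeholder, not a real server. Use whatever address your network administrator gives you.

```python
import requests


def fetch_html(url, proxies=None, timeout=10):
    """Return the page body as bytes, or None if the request fails."""
    try:
        response = requests.get(url, proxies=proxies, timeout=timeout)
        response.raise_for_status()  # treat 4xx/5xx responses as failures
        return response.content
    except requests.exceptions.RequestException as exc:
        # Covers refused connections, proxy errors, timeouts and HTTP errors
        print("Request failed:", exc)
        return None


# Example usage (placeholder proxy address, an assumption for illustration):
# proxies = {
#     "http": "http://proxy.example.com:8080",
#     "https": "http://proxy.example.com:8080",
# }
# page = fetch_html(wiki, proxies=proxies)
```

If fetch_html returns None, the printed message shows whether the connection was refused, timed out, or was rejected by the proxy, which narrows down where the block is.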
If you are trying to get the table, you can use pandas like this:
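import pandas as pd
wiki = "https://en.wikipedia.org/wiki/List_of_state_and_union_territory_capitals_in_India"
# read_html returns a list of DataFrames, one per table found; take the second
df = pd.read_html(wiki)[1]
df2 = df.copy()
# The first row holds the real headers, so promote it to column names
df2.columns = df.iloc[0]
# Drop the header row that is now duplicated as data
df2.drop(0, inplace=True)
# Drop the serial-number column
df2.drop('No.', axis=1, inplace=True)
df2.head()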
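To see what the header clean-up does without hitting the network, here is a small sketch on an in-memory DataFrame. The sample rows and the helper name promote_header are assumptions for illustration; they only imitate the shape of a table whose header row was parsed as data.

```python
import pandas as pd

# Made-up stand-in for a parsed table: integer column labels,
# with the real headers sitting in row 0.
raw = pd.DataFrame([
    ["No.", "State", "Capital"],
    ["1", "Kerala", "Thiruvananthapuram"],
    ["2", "Goa", "Panaji"],
])


def promote_header(df, drop=("No.",)):
    """Use row 0 as column names, then drop that row and unwanted columns."""
    out = df.copy()
    out.columns = list(out.iloc[0])                  # first row becomes the header
    out = out.drop(index=0).reset_index(drop=True)   # remove the old header row
    return out.drop(columns=[c for c in drop if c in out.columns])


table = promote_header(raw)
print(table)
```

The result keeps only the State and Capital columns, with the rows renumbered from 0.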
Upvotes: 1