yonks

Reputation: 11

Extracting data from a Wikipedia table using BeautifulSoup

I'm pretty new to this.


What I am trying to accomplish is a table of districts and their various neighborhoods, but my current code just lists all the neighborhoods in one flat list without assigning them to a specific district.

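from urllib.request import urlopen

import pandas as pd
from bs4 import BeautifulSoup

url = "https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Toronto"
html = urlopen(url)
soup = BeautifulSoup(html, 'lxml')
type(soup)
print(soup.prettify())

# First table on the page whose class is "wikitable sortable"
Toronto_table = soup.find('table', {'class': 'wikitable sortable'})

# Collects every link in the table, regardless of which row (district) it sits in
links = Toronto_table.find_all('a')
neighborhoods = []
for link in links:
    neighborhoods.append(link.get('title'))
print(neighborhoods)

df_neighborhoods = pd.DataFrame(neighborhoods)
df_neighborhoods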

Upvotes: 1

Views: 57

Answers (1)

KunduK

Reputation: 33384

You can simply use pandas' read_html, which parses every table on the page into a list of DataFrames, and print the one you need.

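import pandas as pd

# read_html returns a list of DataFrames, one per table found on the page
f_states = pd.read_html('https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Toronto')
# Index 6 is the district table; the index may shift if the page is edited
print(f_states[6])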

Output:

 District Number                            Neighbourhoods Included
0              C01  Downtown, Harbourfront, Little Italy, Little P...
1              C02  The Annex, Yorkville, South Hill, Summerhill, ...
2              C03  Forest Hill South, Oakwood–Vaughan, Humewood–C...
3              C04  Bedford Park, Lawrence Manor, North Toronto, F...
4              C06           North York, Clanton Park, Bathurst Manor
5              C07  Willowdale, Newtonbrook West, Westminster–Bran...
6              C08  Cabbagetown, St. Lawrence Market, Toronto wate...
7              C09                               Moore Park, Rosedale
8              C10  Davisville Village, Midtown Toronto, Lawrence ...
9              C11         Leaside, Thorncliffe Park, Flemingdon Park
10             C13     Don Mills, Parkwoods–Donalda, Victoria Village
11             C14                  Newtonbrook East, Willowdale East
12             C15  Hillcrest Village, Bayview Woods – Steeles, Ba...
13             E01       Riverdale, Danforth (Greektown), Leslieville
14             E02                     The Beaches, Woodbine Corridor
15             E03  Danforth (Greektown), East York, Playter Estat...
16             E04  The Golden Mile, Dorset Park, Wexford, Maryval...
17             E05      Steeles, L'Amoreaux, Tam O'Shanter – Sullivan
18             E06        Birch Cliff, Oakridge, Hunt Club, Cliffside
19             E08  Scarborough Village, Cliffcrest, Guildwood, Eg...
20             E09  Scarborough City Centre, Woburn, Morningside, ...
21             E10  Rouge (South), Port Union (Centennial Scarboro...
22             E11                              Rouge (West), Malvern
23             W01  High Park, South Parkdale, Swansea, Roncesvall...
24             W02  Bloor West Village, Baby Point, The Junction (...
25             W03  Keelesdale, Eglinton West, Rockcliffe–Smythe, ...
26             W04  York, Glen Park, Amesbury (Brookhaven), Pelmo ...
27             W05  Downsview, Humber Summit, Humbermede (Emery), ...
28             W06        New Toronto, Long Branch, Mimico, Alderwood
29             W07              Sunnylea (The Queensway – Humber Bay)
30             W08  The Kingsway, Central Etobicoke, Eringate – Ce...
31             W09  Kingsview Village-The Westway, Richview (Willo...
32             W10  Rexdale, Clairville, Thistletown - Beaumond He...
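
If you want one row per neighbourhood rather than a comma-separated string per district, one possible follow-up is to split that column and explode it. This is only a sketch: the small `districts` frame below is built by hand from two complete rows of the output above and stands in for `f_states[6]`, and the names `districts`, `tidy` and `Neighbourhood` are made up for the example. It also assumes no neighbourhood name contains a comma itself.

```python
import pandas as pd

# Stand-in for f_states[6]: two complete rows copied from the output above.
districts = pd.DataFrame({
    "District Number": ["C09", "E11"],
    "Neighbourhoods Included": ["Moore Park, Rosedale", "Rouge (West), Malvern"],
})

# Split each comma-separated string into a list, then give every list
# element its own row; the district number is repeated for each element.
tidy = (
    districts
    .assign(Neighbourhood=districts["Neighbourhoods Included"].str.split(","))
    .explode("Neighbourhood")
    .drop(columns="Neighbourhoods Included")
    .reset_index(drop=True)
)

# Remove the space left after each comma.
tidy["Neighbourhood"] = tidy["Neighbourhood"].str.strip()

print(tidy)
```

Each neighbourhood then sits next to its own district number, which is the district-to-neighbourhood mapping the question asks for.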

Upvotes: 1
