Adesua Martins

Reputation: 65

How can I scrape a row from this table

I have been trying to scrape the electoral votes of every president who has won a US presidential election from the large table on this page.

Here is the code I have been trying to use:

from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time
import pandas

# using selenium and chromedriver to load the JavaScript-rendered Wikipedia page

scrape_options = Options()
scrape_options.add_argument('--headless')
driver = webdriver.Chrome(r'web scraping master/chromedriver', options=scrape_options)
page_info = driver.get('https://en.wikipedia.org/wiki/United_States_presidential_election')

# waiting for the javascript to load

try:
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, ".wikitable.sortable.jquery-tablesorter")))
finally:
    page = driver.page_source
    soup = BeautifulSoup(page, 'lxml')
    table = soup.find('table', {'class': 'wikitable sortable jquery-tablesorter'})
    #print(table)

rows = table.find_all('tr')

The code works fine up to this point. Here is the part of the code that is supposed to get the information I need.

for row in rows:
    need=row.find_all('td')

    for n in need:
        try:
            if len(n.find('b')==0):
                continue
            else:
                if nek.find('b').find('sup'):
                    continue
                electoral_votes=n.find('span',{'style':"position: relative margin: 0 0.3em;"}).get_text()
                print(electoral_votes)
        except:continue

After running this part of the code, it does not return anything I need.

Can someone help me out?

I'd be so grateful.

Upvotes: 0

Views: 157

Answers (2)

QHarr

Reputation: 84465

trying to scrape the electoral votes of all the presidents that have won the US presidential election

As you want all the presidential candidates who became president, I chose a method that loops over the table rows. (I have included Joe Biden, although he is president-elect at the time of writing, 28/11/2020; you can easily remove him.)

The table rows are deliberately restricted by a CSS selector, both to compensate for the table being irregular and to pick up only the bold winners in the presidential candidate column. I select at the row level so that I can then pick out the child elements needed to populate my output, which has the format {year: [winner, votes], ...}.

I use an attribute selector with the contains (*=) operator to target the year, matching on a title attribute that contains the string 'United States presidential election'. A further CSS selector gets the winner (who has bold highlighting), and a regex pulls the votes out of the text of the tr element.


Py

from bs4 import BeautifulSoup as bs
import re
import requests

soup = bs(requests.get('https://en.wikipedia.org/wiki/United_States_presidential_election').text, 'lxml')
presidential_wins_by_year = {
      int(i.select_one('[title*="United States presidential election"]').text):  # year
      [i.select_one('td[rowspan] ~ td:nth-of-type(3) b a').text.strip(),  # winning candidate
       re.search(r'(\d+\s?/\s?\d+)', i.text).group(1)  # votes, e.g. "69 / 138"
      ]
  # only rows whose candidate cell holds a bold link, i.e. the winners
  for i in soup.select('.sortable tr:has(td[rowspan] ~ td:nth-of-type(3) b a)')
}
print(presidential_wins_by_year)
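
To see what the regex step does in isolation, here is a minimal sketch that runs the same pattern against a single line of text. The sample string and the helper name extract_votes are assumptions made up for illustration; the real row text on the page will contain more than this.

```python
import re

# Hypothetical row text, shaped like one row of the table (not fetched from the page)
sample_row_text = "1796 Federalist John Adams None 35726 53.4 71 / 276"

# Same idea as the pattern in the answer: digits, optional space, slash, optional space, digits
votes_pattern = re.compile(r"(\d+\s?/\s?\d+)")


def extract_votes(row_text):
    """Return the first 'won / total' string in row_text, or None if there is none."""
    match = votes_pattern.search(row_text)
    return match.group(1) if match else None


votes = extract_votes(sample_row_text)

# int() tolerates the surrounding spaces left over after splitting on "/"
won, total = (int(part) for part in votes.split("/"))
print(votes, won, total)
```

Returning None when nothing matches lets you skip irregular rows instead of raising an AttributeError on a failed search.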


Upvotes: 1

chitown88

Reputation: 28565

You can just use pandas to read in the HTML. pd.read_html returns every table on the page as a list of DataFrames, so it's just a matter of pulling out the table you're interested in:

Code:

import pandas as pd

url = 'https://en.wikipedia.org/wiki/United_States_presidential_election'

dfs = pd.read_html(url)

Output:

print(dfs[2].head(20).to_string())

    Year                  Party    Presidential candidate Vice presidential candidate Popular vote      % Electoral votes Notes
0   1788            Independent         George Washington                None[note 3]        43782  100.0        69 / 138   NaN
1   1788             Federalist        John Adams[note 4]                None[note 3]          NaN    NaN        34 / 138   NaN
2   1788             Federalist                  John Jay                None[note 3]          NaN    NaN         9 / 138   NaN
3   1788             Federalist        Robert H. Harrison                None[note 3]          NaN    NaN         6 / 138   NaN
4   1788             Federalist             John Rutledge                None[note 3]          NaN    NaN         6 / 138   NaN
5   1788             Federalist              John Hancock                None[note 3]          NaN    NaN         4 / 138   NaN
6   1788    Anti-Administration            George Clinton                None[note 3]          NaN    NaN         3 / 138   NaN
7   1788             Federalist         Samuel Huntington                None[note 3]          NaN    NaN         2 / 138   NaN
8   1788             Federalist               John Milton                None[note 3]          NaN    NaN         2 / 138   NaN
9   1788             Federalist           James Armstrong                None[note 3]          NaN    NaN         1 / 138   NaN
10  1788             Federalist          Benjamin Lincoln                None[note 3]          NaN    NaN         1 / 138   NaN
11  1788    Anti-Administration            Edward Telfair                None[note 3]          NaN    NaN         1 / 138   NaN
12  1792            Independent         George Washington                None[note 3]        28579  100.0       132 / 264   NaN
13  1792             Federalist        John Adams[note 4]                None[note 3]          NaN    NaN        77 / 264   NaN
14  1792  Democratic-Republican            George Clinton                None[note 3]          NaN    NaN        50 / 264   NaN
15  1792  Democratic-Republican          Thomas Jefferson                None[note 3]          NaN    NaN         4 / 264   NaN
16  1792  Democratic-Republican                Aaron Burr                None[note 3]          NaN    NaN         1 / 264   NaN
17  1796             Federalist                John Adams                None[note 3]        35726   53.4        71 / 276   NaN
18  1796  Democratic-Republican  Thomas Jefferson[note 5]                None[note 3]        31115   46.6        68 / 276   NaN
19  1796             Federalist           Thomas Pinckney                None[note 3]          NaN    NaN        59 / 276   NaN
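
As a possible next step, the "Electoral votes" strings can be split into numbers and the top row per year picked out. This is only a sketch: the small DataFrame below is a hand-copied stand-in for dfs[2], and the column names "Votes won" and "Votes available" are made up here. Note also that the candidate with the most electoral votes is not always the one who became president (1824 is the well-known exception), so treat the result as a starting point rather than a list of winners.

```python
import pandas as pd

# Hypothetical stand-in for dfs[2]: a few rows copied from the output above
table = pd.DataFrame({
    "Year": [1788, 1788, 1792, 1792],
    "Presidential candidate": [
        "George Washington", "John Adams[note 4]",
        "George Washington", "John Adams[note 4]",
    ],
    "Electoral votes": ["69 / 138", "34 / 138", "132 / 264", "77 / 264"],
})

# Split strings such as "69 / 138" into two integer columns
split_votes = table["Electoral votes"].str.extract(r"(\d+)\s*/\s*(\d+)").astype(int)
table["Votes won"] = split_votes[0]
table["Votes available"] = split_votes[1]

# Keep the row with the most electoral votes in each year
top_per_year = table.loc[table.groupby("Year")["Votes won"].idxmax()]
print(top_per_year[["Year", "Presidential candidate", "Votes won"]].to_string(index=False))
```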

Upvotes: 1
