Renu sharma

Reputation: 87

visible text on a webpage not coming through Python

This code seems to work for almost all the webpages I want to scrape, but for this page, https://www.usana.com/ux/dotcom/#!/enu-US/contact, it returns just one line of text, even though many addresses are visible on the page:

options = webdriver.ChromeOptions()    # for cookies
options.add_argument(r"C:\Users\XXXXX\Selenium")  # this is the directory for the cookies
driver = webdriver.Chrome(r'C:\Users\XXXXX\XXXXXX\Documents\chromedriver.exe', options=options)
driver.set_page_load_timeout(100)
driver.get("https://www.usana.com/ux/dotcom/#!/enu-US/contact")

time.sleep(30)
try:
    click_alert=driver.switch_to.alert()      # to click on the pop up window
    click_alert.accept()
except:
    pass
res = requests.get(driver.current_url,headers = headers)
soup = BeautifulSoup(res.content, 'lxml')
txt = soup.text
print(txt)

I have tried to handle the cookie consent message and the pop-up window that appear on the page, but the code still produces just the one line of output below:

USANA Health Sciences

I am seeking to have all the addresses on this page as text.

What needs to be added or changed in the above code? Any help is highly appreciated.

Upvotes: 1

Views: 151

Answers (2)

Renu sharma

Reputation: 87

I have also found another way to solve this:

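    import time
    from selenium import webdriver
    from selenium.common.exceptions import NoAlertPresentException, TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    url = "https://www.usana.com/ux/dotcom/#!/enu-US/contact"
    browser = webdriver.Chrome(r'C:\Users\XXXXXX\XXXX\Documents\chromedriver.exe')
    browser.set_page_load_timeout(100)  # set the timeout before loading the page
    browser.get(url)
    time.sleep(3)

    try:
        browser.switch_to.alert.accept()  # switch_to.alert is a property, not a method
    except NoAlertPresentException:
        pass
    try:
        # click the first clickable element whose text contains "agree" (case-insensitive)
        WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.XPATH, "//*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') , 'agree')]"))).click()
    except TimeoutException:
        pass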

Upvotes: 0

Dan-Dev

Reputation: 9430

You need to pass driver.page_source to BeautifulSoup instead of fetching the URL again with requests. The URI fragment (everything after the # in the URL) is never sent to the server, so the server cannot return different HTML for it; the content for the fragment is presumably rendered in the browser by JavaScript. requests neither sends the fragment nor executes JavaScript, so the HTML it receives does not contain the addresses.

Clients are not supposed to send URI fragments to servers; see https://en.wikipedia.org/wiki/URI_fragment
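To see which part of the URL actually reaches the server, the address can be split with the standard library. This is only an illustrative sketch; the variable name `request_target` is made up for the example and is not part of any library.

```python
from urllib.parse import urlsplit

url = "https://www.usana.com/ux/dotcom/#!/enu-US/contact"
parts = urlsplit(url)

# Only the path (plus the query string, if there is one) goes into the HTTP request.
request_target = parts.path + ("?" + parts.query if parts.query else "")

print(request_target)   # what the server is asked for
print(parts.fragment)   # stays in the client; page scripts read it to pick the view
```

Because the fragment never leaves the client, a plain HTTP fetch of this address asks the server for the base path only.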

import time
from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.common.exceptions import NoSuchElementException, ElementNotInteractableException
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_argument(r"C:\Users\XXXXX\Selenium") 
# Selenium 3 style constructor; newer Selenium 4 releases take the driver path via a Service object instead
driver = webdriver.Chrome(r'C:\Users\XXXXX\XXXXXX\Documents\chromedriver.exe', options=options)
driver.set_page_load_timeout(100)
driver.get("https://www.usana.com/ux/dotcom/#!/enu-US/contact")

time.sleep(3)  # give the JavaScript time to render the page
try:
    # dismiss the pop-up by clicking its OK button
    driver.find_element(By.XPATH, "//*[contains(text(), 'OK')]").click()
except NoSuchElementException:
    print("Alert already accepted")
try:
    # accept the cookie banner
    driver.find_element(By.CLASS_NAME, "optanon-allow-all").click()
except ElementNotInteractableException:
    print("Cookies already accepted")

soup = BeautifulSoup(driver.page_source, 'lxml')
headers = [x.get_text(separator=" ").strip() for x in soup.find_all('div', {'class': 'card-header'})]
bodies = [x.get_text(separator=" ").strip() for x in soup.find_all('div', {'class': 'card-body'})]
print(list(zip(headers, bodies)))
driver.quit()

Outputs:

[('USANA Hong Kong', '5/F, Sino Plaza  255-257 Gloucester Road  Causeway Bay, Hong Kong    Customer Services Hotline: (852) 2162 1888  Order Express: (852) 2162 1800     Office Hours   Customer service  Monday–Friday 12:00 p.m.–9:00 p.m. (HKT)  Saturday 11:00 a.m.–4:00 p.m.  The office is closed on Sundays and public holidays. [email protected]'), ('USANA Japan', 'USANA Health Sciences Japan LLC. Ichigaya MS Bldg 2F 4-1-9 Kudankita, Chiyoda-ku Tokyo, Japan 102-0073 Contact Information Phone: 03-5215-3050 (Out of Japan: +81-3-5215-3050) Fax: 03-5215-3052 (Out of Japan: +81-3-5215-3052) Office Hours Customer Service: Monday–Friday 10:00 a.m. to 1:00 p.m., 2:00 p.m.to 6:00 p.m. Office Counter: Monday–Friday 13:00 - 19:00 [email protected]'), ('USANA Health Sciences Korea, Ltd.', '5F SI Tower 203, Teheran-ro, Gangnam-gu Seoul, Korea 06141 Tel: +82-2-2192-7300 Fax:+82-2-2192-7399 Customer Phone Service Line Business Hour Mon - Fri / 9:00 AM - 6:00 PM Sat / 9:00 AM - 1:00 PM (Closed Sun & Holiday) Customer Will-Call Service Line Business Hour Mon - Fri 9:00AM - 6:00PM (Closed Sat-Sun & Holiday) [email protected]'), ('USANA Taiwan', '7F, No. 99, Fu-Hsin N. Road,  Taipei 105, Taiwan,  Republic of China    Tel:+886-2-7724-8000  Fax: +886-2-7724-1000  Customer Service line: 0809-085-588  Customer Service Fax: 0809-085-500     Office Hours  Monday–Friday 9:00 a.m. to 6:00 p.m. (CST)  Closed weekends and national holidays.     Customer Service and Will Call  Monday–Friday 12:00 noon to 9:00 p.m.  Closed weekends and national holidays. [email protected]'), ('USANA Australia', '3 Hudson Avenue Castle Hill NSW 2154, Australia Customer Service Phone: +612 9842 4600 Toll-free: 1800 687 872 Sydney Business Center Opening Hours Mon 8:30 a.m. to 5:00 p.m. Tue 8:30 a.m. to 5:00 p.m. Wed 8:30 a.m. to 5:00 p.m. Thu 9:00 a.m. to 5:00 p.m. Fri 8:30 a.m. to 5:00 p.m. Sat 9:30 a.m. to 3:00 p.m. 
Sundays and public holidays:Closed [email protected]'), ('USANA Malaysia', 'USANA Malaysia UHS Essential Health (Malaysia) Sdn. Bhd. Unit M2-2 & M2-5, Level M2, The Vertical Podium Avenue 3, Bangsar South No. 8 Jalan Kerinchi 59200 Kuala Lumpur   Telephone: 603-2246 0800 Facsimile: 603-2246 0901 Office Hours Monday–Friday 11:30 a.m. to 7:30 p.m. (MYT) Saturday 10:30 a.m. to 1:30 p.m. Sundays and public holidays: Closed [email protected]'), ('USANA Philippines', 'UHS Essential Health Philippines, Inc. 24th Floor, Tower 1, The Enterprise Center, 6766 Ayala Avenue corner Paseo de Roxas, Makati City, Philippines 1200 Customer Service: (632) 858-4500 Phone Order Line (632) 858-4599 Fax Order Line Office Hours Business Center Monday-Friday 11:00am to 8:00pm Saturday 9:00 am to 1:00pm Customer Service Monday-Friday 11:00am to 8:00pm Saturday 9:00 am to 1:00pm [email protected]'), ('USANA New Zealand', 'P.O. Box 17409, Greenlane 1546, AUCKLAND Level 1, 93 Ascot Avenue, Greenlane, Auckland 1051 Customer Service Phone: +64 9 415 2750 Toll-free: 0800 872 626 Auckland Business Center Opening Hours Mon 9:00 a.m. to 7:00 p.m. Tue 9:00 a.m. to 5:00 p.m. Wed 9:00 a.m. to 5:00 p.m. Thu 9:00 a.m. to 5:00 p.m. Fri 9:00 a.m. to 5:00 p.m. Sat 10:00 a.m. to 3:00 p.m. Sundays and public holidays: Closed [email protected]'), ('USANA Health Sciences Singapore Pte Ltd', '391B Orchard Road, Ngee Ann City Tower B, #19-01/02 Singapore 238874 Customer Service: (65) 6820-8828 Fax: (65) 6820-7007 Business Hours: Mon to Fri - 1230hr to 2030hr Saturday - 1030hr to 1400hr Sunday / Public Holiday - Closed [email protected]'), ('USANA Health Sciences (Thailand) Ltd.', 'Unit 01-04  Chamchuri Square Building  319 Phyathai Road  Pathumwan, Bangkok 10330    Distributor Services: 02-761-4300     Customer Service and Will Call  Monday–Friday: 11.00 a.m.-8.00 p.m.  Saturday: 1.00 p.m.-5.00 p.m. 
Closed Sunday and national holidays [email protected]'), ('USANA Health Sciences Indonesia', 'Menara Jamsostek South Tower 14th Floor Jalan Gatot Subroto Kav 38 Jakarta 12710 Indonesia Contact Information Reception Phone: +62 21 278 38 600 Customer Service Call Center: 1500847 Customer Service Fax: +62 21 278 38 688 Office Hours Business Centre Monday to Friday: 11:00 a.m.–11:00 p.m. Saturday: 9:00 a.m.–6:00 p.m. Customer Service Monday to Friday: 11:00 a.m.–8:00 p.m. Saturday: 9:00 a.m.–1:00p.m Facebook https://www.facebook.com/officialusanaindonesia [email protected]'), ('USANA Netherlands', 'USANA Health Sciences  92, avenue des Ternes  Paris, France 75017    Distributor Services: 0800-022-7288  Fax: 001-801-954-7240     Customer Service Hours  Monday through Friday  Reception, meeting rooms and will call: Tuesday, Thursday and Friday 12:30 p.m.–8:00 p.m.  Wednesday 12:30 pm – 4:00 pm.  Saturday 10:00 a.m.–5:00 p.m.  Call center: 9:00 a.m.– midnight (GMT+1). [email protected]'), ('USANA United Kingdom', 'USANA United Kingdom  Customer service representatives located in Salt Lake City, Utah, support the Associates and Preferred Customers in the United Kingdom.    Customer Service: 08 08 234 4478    Fax: 08 08 234 2472     Opening Hours:  Monday – Friday (excluding some holidays) 1:30 p.m. to 2:00 a.m. (London time).     Customer Service Hours  Monday through Friday 6:30 AM to 9:00 PM MST. [email protected]'), ('USANA France/Belgium, Paris Office', 'USANA Europe (Paris Office) \n121 Av. 
Des Champs Élysées \n75008 Paris, France \nDoor Code: B152 \nOrder pick-up:  \nThursday 12:30 - 10:00pm \nFriday 12:30 - 8:00pm \nSaturday 12:30 - 7:00pm \nMeeting rooms upon reservation: \[email protected]  \n \nCustomer Service:  \nFrance: +33 1 42 99 76 50  \nRomania: +40 312 295 242  \nGermany: 0800 1825899 \nBelgium: 0 800 14 432 \nSpain: 900 941 696  \nItaly: 800 790 241  \nUK: 08 08 234 4478 (calls to SLC office) [email protected]'), ('USANA United States', '3838 West Parkway Boulevard  Salt Lake City, UT 84120     Receptionist and Investor Relations  Phone: 801-954-7100  Fax: 801-954-7300  [email protected]  Hours: 9:00 am – 5:00 pm MST     Customer Service  Phone: 1-888-950-9595  Fax: 1-800-289-8081  Languages available: English, Spanish, French, Mandarin, Cantonese, & Korean  Hours: 6:30am – 9:00pm MST [email protected]'), ('USANA Puerto Rico', '3838 West Parkway Boulevard  Salt Lake City, UT 84120     Receptionist and Investor Relations  Phone: 801-954-7100  Fax: 801-954-7300  [email protected]  Hours: 9:00 am – 5:00 pm MST     Customer Service  Phone: 1-888-950-9595  Fax: 1-800-289-8081  Languages available: English, Spanish, French, Mandarin, Cantonese, & Korean  Hours: 6:30am – 9:00pm MST [email protected]'), ('Caribbean', 'Customer Service (toll-free): 1-888-950-9595  Fax: 1-801-954-7300     Trinidad and Tobago    Customer Service (toll-free): 1-888-667-3574  Fax: 1-801-954-7300 The Dominican Republic    Customer Service (toll-free): 1-888-751-2425  Fax: 1-801-954-7300     Office Hours  Monday–Friday (excluding some holidays) 6:30 a.m.–9:00 p.m. (MST/MDT). The Caribbean is two hours ahead of Mountain Time. [email protected]'), ('USANA Canada, Ontario Office', '80 Innovation Dr.  
Woodbridge, ON  L4H0T2  CANADA     Customer Service  Phone: 1-888-950-9595  Fax: 1-800-289-8081  Languages available: English, Spanish, French, Mandarin, Cantonese, & Korean  Hours: 6:30am – 9:00pm MST     Investor Relations  Phone: 801-954-7100  Fax: 801-954-7300  [email protected]  Hours: 8:00 am – 6:00 pm MST   Office Hours Monday: 9:00 a.m. to 5:00 p.m.*  Tuesday: 9:00 a.m. to 5:00 p.m.*  Wednesday: 9:00 a.m. to 5:00 p.m. Thursday: 9:00 a.m. to 5:00 p.m. Friday: 9:00 a.m. to 5:00 p.m.* Office hours are in EDT/EST *Associates are asked to contact the office if they require assistance prior to 9 a.m. or after 5 p.m., as we will be pleased to make arrangements to accommodate them. [email protected]'), ('USANA Canada, Vancouver Office', 'Suite 2118, 13353 Commerce Parkway  Richmond, British Columbia  CANADA V6V 3A1     Customer Service  Phone: 1-888-950-9595  Fax: 1-800-289-8081  Languages available: English, Spanish, French, Mandarin, Cantonese, & Korean  Hours: 6:30am – 9:00pm MST     Investor Relations  Phone: 801-954-7100  Fax: 801-954-7300  [email protected]  Hours: 8:00 am – 6:00 pm MST Office Hours \nMonday to Friday, 11:00 a.m. to 7:00 p.m. (PDT/PST) \n [email protected]'), ('USANA Mexico S.A. de C.V.', 'Av. paseo de las Palmas 525, piso no. 8  Col. Lomas de Chapultepec, Del. Miguel Hidalgo  México D.F. C.P. 11000    Reception: (55) 5093-9650  Distributor Services / Order Express: 01 800 08 USANA (87262)  Fax: 01 800 08 USANA (87262)     Office Hours  Monday–Friday 9:00 a.m. to 6:00 p.m. (CST/CDT)  Saturday 9:00 a.m. to 2:00 p.m. (CST/CDT)—Will Call only [email protected]'), ('USANA Health Sciences Colombia, S.A.S.', 'Calle 100 No. 13 - 76 Piso 4to  Torre Mansarovar  Bogotá D.C. Colombia     Main Office  Phone: (57) 1-546-3939  Fax: (57) 1-546-3951     Distributor Services  Phone: (57) 1-546-3939  Toll Free: 01 8000 963750  Fax: (57) 1-546-3950     Office Hours  Monday – Friday (Not holidays) / 9:00 a.m. – 6:00 p.m. COT  Saturday / 9:00 a.m. – 12:30 p.m. 
COT     Customer Service Hours  Monday - Friday 9:00 a.m. – 6:00 p.m. COT  Closed Sundays and Holidays [email protected]'), ('', ''), ('', ''), ('', ''), ('', '')]
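The output above ends with several empty ('', '') pairs, and some bodies contain runs of spaces and newlines. One possible way to tidy the pairs is sketched below. The helper name `clean_cards` is hypothetical, and the sample data is a shortened copy of the first entry.

```python
def clean_cards(headers, bodies):
    """Pair headers with bodies, collapse whitespace, and drop empty pairs."""
    cleaned = []
    for header, body in zip(headers, bodies):
        # str.split() with no arguments splits on any whitespace run, including newlines
        header = " ".join(header.split())
        body = " ".join(body.split())
        if header or body:
            cleaned.append((header, body))
    return cleaned


sample_headers = ["USANA Hong Kong", "", ""]
sample_bodies = ["5/F, Sino Plaza  255-257 Gloucester Road  Causeway Bay, Hong Kong", "", ""]
print(clean_cards(sample_headers, sample_bodies))
```

In the script above, the same helper could be applied to the headers and bodies lists before printing them.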

Updated in response to comment

If you want the complete text of the page, replace the last four lines above with:

print(soup.get_text(separator=" ").strip())
driver.quit()

Upvotes: 1
