Missing information in scraped web data, Google translate, Using Python

I want to scrape the Google Translate website and get the translated text from it using Python 3.

Here is my code:

from bs4 import BeautifulSoup as soup
from urllib.request import Request as uReq
from urllib.request import urlopen  # not aliased to `open`, which would shadow the built-in


my_url = "https://translate.google.com/#en/es/I%20am%20Animikh%20Aich"

# Send a browser-like User-Agent header with the request
req = uReq(my_url, headers={'User-Agent': 'Mozilla/5.0'})
uClient = urlopen(req)
page_html = uClient.read()
uClient.close()
# Parse the raw response body and dump it
html = soup(page_html, 'html5lib')
print(html)

Unfortunately, I cannot find the required information in the parsed web page. Chrome's "Inspect" tool shows that the translated text is inside:

 <span id="result_box" class="short_text" lang="es"><span class="">Yo soy Animikh Aich</span></span>

However, when I search the parsed HTML for this element, this is all I find:

<span class="short_text" id="result_box"></span>

I have tried parsing with each of html5lib, lxml, and html.parser, and none of them gives me the translated text. Please help me with this issue.

Upvotes: 2

Answers (3)

SIM

Reputation: 22440

The translation is filled in by JavaScript, so load the page in a real browser with Selenium and parse the rendered source:

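from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
try:
    driver.get("https://translate.google.com/#en/es/I%20am%20Animikh%20Aich")
    # Wait until JavaScript has filled in the translation (10 s timeout is an arbitrary choice)
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "#result_box span")))
    soup = BeautifulSoup(driver.page_source, 'html5lib')
    item = soup.select_one("#result_box span").text
    print(item)
finally:
    driver.quit()  # always close the browser, even if something above fails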

Output:

Yo soy Animikh Aich
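
The selector in the snippet above only matches once a browser has rendered the page. As a rough illustration, the sketch below runs over the two pieces of markup quoted in the question and pulls out whatever text sits inside #result_box. It uses the standard library's html.parser rather than BeautifulSoup so that it runs without extra installs; ResultBoxText and result_box_text are hypothetical names made up for this example, not part of any library.

```python
from html.parser import HTMLParser


class ResultBoxText(HTMLParser):
    """Collect the text found inside the element with id="result_box".

    Hypothetical helper for illustration only. It assumes every opened tag
    is closed (no void tags such as <br> inside the box).
    """

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting depth inside #result_box (0 means outside)
        self.chunks = []  # text fragments seen while inside the box

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == "result_box":
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def result_box_text(markup):
    """Return the stripped text inside #result_box, or '' if there is none."""
    parser = ResultBoxText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks).strip()


# Markup copied from the question: what urllib received vs. what Chrome showed.
static_html = '<span class="short_text" id="result_box"></span>'
rendered_html = ('<span id="result_box" class="short_text" lang="es">'
                 '<span class="">Yo soy Animikh Aich</span></span>')

print(repr(result_box_text(static_html)))    # ''
print(repr(result_box_text(rendered_html)))  # 'Yo soy Animikh Aich'
```

The raw response yields an empty string and the rendered markup yields the translation, which is why the page has to be loaded in a browser before it is parsed.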

Upvotes: 1

Keyur Potdar

Reputation: 7238

JavaScript modifies the HTML after the page loads. urllib cannot execute JavaScript, so you will have to use Selenium to get the data you want.

For installation and a demo, refer to this link.
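
One way to see that the server's response cannot contain the translation: everything after the # in the question's URL is a fragment, which stays on the client and is not part of the HTTP request. The sketch below uses only the standard library and the URL from the question; it makes no network call.

```python
from urllib.parse import unquote, urlsplit
from urllib.request import Request

# URL taken from the question
my_url = "https://translate.google.com/#en/es/I%20am%20Animikh%20Aich"

parts = urlsplit(my_url)
print(parts.path)               # /
print(unquote(parts.fragment))  # en/es/I am Animikh Aich

# Building the Request does not send anything yet; `selector` is the
# path that urllib would put on the request line.
req = Request(my_url, headers={"User-Agent": "Mozilla/5.0"})
print(req.selector)             # /
```

The languages and the text to translate live entirely in the fragment, so the request only asks for "/". It is the page's JavaScript that later reads the fragment and fills in the result.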

Upvotes: 1

Lupanoide

Reputation: 3212

You could use a dedicated Python API instead:

import goslate
gs = goslate.Goslate()
print(gs.translate('I am Animikh Aich', 'es'))

Output:

Yo soy Animikh Aich

Upvotes: 2
