shashank

Reputation: 410

Scrape a page after translating it using bs4

I am trying to scrape a page that is in French and get its content in English.

Here is my code, using the BeautifulSoup and requests packages in Python:

import requests
from bs4 import BeautifulSoup
url = '<url>'
headers = {"Accept-Language": "en,en-gb;q=0.5"}
r = requests.get(url, headers=headers)
c = r.content
soup = BeautifulSoup(c)

But this still returns the text in French.

Can anyone suggest changes or alternative code?

Upvotes: 0

Views: 1482

Answers (1)

Ian-Fogelman

Reputation: 1605

You can use TextBlob to translate strings between languages. Here is an example that translates the text of the span elements on the French eBay site:

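import requests
from bs4 import BeautifulSoup
from textblob import TextBlob

url = 'https://www.ebay.fr/'
french = []
r = requests.get(url)
c = r.content
# Name the parser explicitly to avoid bs4's "no parser specified" warning
soup = BeautifulSoup(c, 'html.parser')
# Collect the text of every <span> element on the page
for span in soup.find_all('span'):
    french.append(span.text)

# Join with spaces so words from adjacent spans do not run together
french_str = ' '.join(french)
blob = TextBlob(french_str)
print(french_str)
# translate() relies on an online translation service and is not
# available in recent TextBlob releases
english_str = blob.translate(to="en")
print('------------------------------------------------')
print(english_str)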
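
If you want to keep each translated string paired with its original element instead of translating one joined string, the extraction and translation steps can be separated. The following is only a rough sketch under a few assumptions: it uses the standard library's html.parser in place of bs4 so that it runs without network access or extra packages, and the names SpanTextCollector, extract_span_texts, translate_texts and fake_translate are hypothetical. fake_translate is a dictionary lookup standing in for whichever real translation backend you choose.

```python
from html.parser import HTMLParser


class SpanTextCollector(HTMLParser):
    """Collects the text found inside <span> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._depth = 0   # number of <span> tags currently open
        self.texts = []   # one entry per non-empty text chunk inside a span

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._depth > 0 and text:
            self.texts.append(text)


def extract_span_texts(html):
    """Return the text chunks that appear inside <span> elements."""
    collector = SpanTextCollector()
    collector.feed(html)
    collector.close()
    return collector.texts


def translate_texts(texts, translate):
    """Apply a caller-supplied translate(str) -> str function to each text."""
    return [translate(text) for text in texts]


# Assumption: a tiny lookup table standing in for a real translation service.
FAKE_DICTIONARY = {"Bonjour": "Hello", "Acheter": "Buy"}


def fake_translate(text):
    # Unknown strings are returned unchanged.
    return FAKE_DICTIONARY.get(text, text)


sample_html = "<div><span>Bonjour</span><p>ignored</p><span>Acheter</span></div>"
french = extract_span_texts(sample_html)
english = translate_texts(french, fake_translate)
print(french)
print(english)
```

Because the translator is passed in as a plain function, swapping the stand-in for a real service only means replacing fake_translate; the extraction code stays the same.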

Upvotes: 1
