Reputation: 11
I'm learning Python and following this online course:
https://openclassrooms.com/fr/courses/7168871-apprenez-les-bases-du-langage-python/exercises/4173
At the end of the lesson, we're learning the ETL procedure.
Question 3: I have to load an HTML file and use BeautifulSoup in a Python script.
The problem is this: the only data mining I've done so far was with a website. I create a variable that contains the URL of the website, and then I create a variable soup:
import requests
from bs4 import BeautifulSoup
url = 'https://www.gov.uk/search/news-and-communications'
reponse = requests.get(url)
page = reponse.content
soup = BeautifulSoup(page, 'html.parser')
This is easy because the HTML code is at a URL, but how can I do the same with a file on my machine (TestOC.html)?
from bs4 import BeautifulSoup
soup = BeautifulSoup('TestOC.html', 'html.parser')
But the file is not read. How can I load it?
Upvotes: 1
Views: 491
Reputation: 311823
BeautifulSoup takes the content, not the file name. You could open it yourself and read() it though:
with open('TestOC.html') as f:
    content = f.read()
soup = BeautifulSoup(content, 'html.parser')
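As a variation on the same idea, BeautifulSoup also accepts an open file object, so the separate read() step can be skipped. The sketch below is only an illustration: load_soup is a hypothetical helper name, the UTF-8 encoding is an assumption about how the file was saved, and the temporary file with made-up contents stands in for TestOC.html so the example runs on its own.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup


def load_soup(path, encoding="utf-8"):
    """Parse a local HTML file (hypothetical helper, not part of bs4)."""
    # The encoding is an assumption; adjust it if the file was saved differently.
    with open(path, encoding=encoding) as f:
        # BeautifulSoup reads the whole file object while it is still open.
        return BeautifulSoup(f, "html.parser")


# Throwaway file with invented contents, standing in for TestOC.html.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "TestOC.html"
    sample.write_text(
        "<html><head><title>Test OC</title></head>"
        "<body><p>Hello</p></body></html>",
        encoding="utf-8",
    )
    soup = load_soup(sample)

print(soup.title.string)  # Test OC
print(soup.p.get_text())  # Hello
```

With the real file, the call would presumably just be load_soup('TestOC.html'), run from the directory that contains it.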
Upvotes: 2