How to remove HTML tags from output text?

Question

Apologies if this has been asked before, but none of the solutions I have tried seem to work.

I have created a program where the user enters a word, and the program pulls an example of that word from the Dictionary.com website.

I want to remove the HTML tags that always surround the keyword. How would I go about doing this?

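import requests
from bs4 import BeautifulSoup

word = input("Enter a word: ")

# Fetch the Dictionary.com page for the word
webContent = requests.get('https://www.dictionary.com/browse/' + word)
soup = BeautifulSoup(webContent.text, 'html.parser')

# Select the <p> elements that hold the example sentences
results = soup.find_all('p', attrs={'class': 'one-click-content css-it69we e15kc6du7'})

# The first three child nodes still include the tag wrapped around the keyword
firstResult = results[0]
print(firstResult.contents[0:3])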


nandu kk · Accepted Answer

import re

import requests
from bs4 import BeautifulSoup

word = input("Enter a word: ")

# Fetch the Dictionary.com page for the word
webContent = requests.get('https://www.dictionary.com/browse/' + word)
soup = BeautifulSoup(webContent.text, 'html.parser')

# Select the <p> elements that hold the example sentences
results = soup.find_all('p', attrs={'class': 'one-click-content css-it69we e15kc6du7'})

firstResult = results[0]
# Convert each child node to a string and strip anything that looks like an HTML tag
firstResult.contents = [re.sub(r'<[^<]+?>', '', str(x)) for x in firstResult.contents]
print(firstResult.contents[0:3])
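
The regex above is fine for simple markup, but it can misfire on unusual input (for example, an attribute value that contains ">"). As a regex-free alternative, here is a minimal sketch using only the standard library's html.parser. The names TagStripper and strip_tags and the sample HTML are made up for illustration and are not from the original post. Since BeautifulSoup is already in use, calling firstResult.get_text() on the tag should give a similar result with less code.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects only the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are skipped.
        self.parts.append(data)


def strip_tags(html):
    """Return the text content of an HTML string with all tags removed."""
    stripper = TagStripper()
    stripper.feed(html)
    stripper.close()  # flush any text still buffered by the parser
    return "".join(stripper.parts)


# Made-up sample resembling an example sentence with a highlighted keyword.
sample = '<p class="example">The <span class="keyword">quick</span> fox ran.</p>'
print(strip_tags(sample))
```

To apply this to the scraped result, pass str(firstResult) to strip_tags instead of the sample string.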
