Andy K

Reputation: 5044

How to replace a specific line of text in an HTML page with Beautiful Soup in Python

I'm a total newbie with Beautiful Soup in Python.

I'm trying to replace the line below

Assurez-vous de bien recevoir tous nos messages en ajoutant [email protected] a votre carnet d'adresses.

With

yaya toure

I've done this piece of code (see below)

from BeautifulSoup import BeautifulSoup   
import re

url = r"/cygdrive/d/ope_mdl/bsoup/test_toto.html"
page = open(url)
soup = BeautifulSoup(page.read())

soup.replace('Assurez-vous de bien recevoir tous nos messages en ajoutant [email protected] a votre carnet d\'adresses.', 'Yaya Toure')

As you can see, "votre carnet d'adresses." already contains a ', so I escaped it with a \.

However, it does not seem to replace the text.

What am I doing wrong?

Edit: Lines 1 to 5 work fine. You have to create an HTML file on your local drive. Only line 6 is causing issues for me.

Upvotes: 1

Views: 608

Answers (1)

randomusername

Reputation: 8097

I can't find BeautifulSoup.replace in pydoc, so I don't think it is a real method and you shouldn't be calling it in your code. Instead, use

search_text = 'Assurez-vous de bien recevoir tous nos messages en ajoutant [email protected] a votre carnet d\'adresses.'
soup.find(text=lambda x: x.startswith(search_text)).replaceWith('Yaya Toure')

Edit: Note that we have to pass a function as the text argument because, in your particular HTML file, the search string is followed by more text with a <br /> in the middle. This causes the text attribute to be the concatenation of your intended string and the garbage data, so an exact match on the search string alone would not find it.
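
For anyone on the newer bs4 package, here is a minimal sketch of the same idea. It is not the asker's actual file: the html string and the shortened search_text below are made-up stand-ins, and I'm assuming Beautiful Soup 4 (4.4 or later), where the calls are spelled find(string=...) and replace_with(...) instead of find(text=...) and replaceWith(...).

```python
from bs4 import BeautifulSoup  # assumes Beautiful Soup 4 is installed

# Hypothetical stand-in for the asker's HTML file: the target sentence is
# followed by extra text in the same text node, then a <br/> and more text.
html = (
    "<html><body><p>"
    "Assurez-vous de bien recevoir tous nos messages, plus some extra text"
    "<br/>more text</p></body></html>"
)

# Shortened, hypothetical search string (only the start of the sentence).
search_text = "Assurez-vous de bien recevoir tous nos messages"

soup = BeautifulSoup(html, "html.parser")

# Pass a function so that a text node which merely *starts with* the
# search string matches, even though it carries extra text after it.
node = soup.find(string=lambda s: bool(s) and s.startswith(search_text))

# find() returns None when nothing matches, so guard before replacing.
if node is not None:
    node.replace_with("Yaya Toure")

result = str(soup)
print(result)
```

Note that the whole matching text node is replaced, including the extra text after the sentence; the text after the <br/> is a separate node and is left alone.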

Upvotes: 1
