Reputation: 362
This is what the HTML looks like:
<div class="full-news none">
Demo: <a href="https://www.lolinez.com/?https://www.makemytrip.com"
rel="external noopener noreferrer" target="_blank">https://www.makemytrip.com</a>
<br/>
How can I remove this part from the href: https://www.lolinez.com/? so that the final output becomes like this:
<div class="full-news none">
Demo: <a href="https://www.makemytrip.com"
rel="external noopener noreferrer" target="_blank">https://www.makemytrip.com</a>
<br/>
I have tried using the decompose() method of BeautifulSoup, but it removes the entire tag. How can this be fixed?
Upvotes: 0
Views: 212
Reputation: 25196
Note: Without additional context, I would narrow it down to the following approaches:
Replace the substring in the string that you pass to the BeautifulSoup constructor:
soup = BeautifulSoup(YOUR_STRING.replace('https://www.lolinez.com/?',''), 'lxml')
Replace the substring in your soup: you can select all the <a> elements whose href contains www.lolinez.com and replace the value of each href:
for x in soup.select('a[href*="www.lolinez.com"]'):
    x['href'] = x['href'].replace('https://www.lolinez.com/?', '')
from bs4 import BeautifulSoup
html='''
<a href="https://www.lolinez.com/?https://www.makemytrip.com" rel="external noopener noreferrer" target="_blank">https://www.makemytrip.com</a>
<a href="https://www.makemytrip.com" rel="external noopener noreferrer" target="_blank">https://www.makemytrip.com</a>
<a href="https://www.lolinez.com/?https://www.makemytrip.com" rel="external noopener noreferrer" target="_blank">https://www.makemytrip.com</a>
'''
soup = BeautifulSoup(html, 'lxml')
# Rewrite only the links that go through the redirect host
for x in soup.select('a[href*="www.lolinez.com"]'):
    x['href'] = x['href'].replace('https://www.lolinez.com/?', '')
soup
<html><body><a href="https://www.makemytrip.com" rel="external noopener noreferrer" target="_blank">https://www.makemytrip.com</a><a href="https://www.makemytrip.com" rel="external noopener noreferrer" target="_blank">https://www.makemytrip.com</a><a href="https://www.makemytrip.com" rel="external noopener noreferrer" target="_blank">https://www.makemytrip.com</a></body></html>
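One caveat with a plain replace() is that it removes the substring wherever it occurs in the href, not only at the start. If that matters for your data, a stricter variant could look like the sketch below. The helper name strip_redirect and the constant REDIRECT_PREFIX are hypothetical and only for illustration; the sketch assumes the wrapped target is always an absolute http(s) URL placed directly after the "?", as in the question's markup.

```python
from urllib.parse import urlsplit

# Assumed redirect prefix, copied from the markup in the question.
REDIRECT_PREFIX = "https://www.lolinez.com/?"


def strip_redirect(href: str, prefix: str = REDIRECT_PREFIX) -> str:
    """Return the wrapped target URL if href starts with prefix, else href unchanged."""
    if not href.startswith(prefix):
        return href
    target = href[len(prefix):]
    # Accept the remainder only if it looks like an absolute http(s) URL;
    # otherwise leave the original href untouched.
    parts = urlsplit(target)
    if parts.scheme in ("http", "https") and parts.netloc:
        return target
    return href
```

Inside the loop from the example, this could be applied as x['href'] = strip_redirect(x['href']) instead of calling replace() directly.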
Upvotes: 2