Tareq

Reputation: 1

How to extract only the links that start with "https" using BeautifulSoup?

import requests
from bs4 import BeautifulSoup
page = requests.get("https://evaly.com.bd/")


soup = BeautifulSoup(page.content, 'html.parser')

for link in soup.find_all('a', href=True):
    print (link['href'])

Result of the code:

[image: screenshot of the output]

I need only the https links, not the ones marked with the rectangle in the image.

Upvotes: 0

Views: 76

Answers (2)

bigbounty

Reputation: 17368

Another way to achieve this is with a regular expression:

import requests, re
from bs4 import BeautifulSoup

res = requests.get("https://evaly.com.bd/")
soup = BeautifulSoup(res.content, 'html.parser')

# Keep only anchors whose href starts with "https://"
for a in soup.find_all("a", href=re.compile(r"^https://")):
    print(a["href"])

Output:

https://merchant.evaly.com.bd/
https://www.facebook.com/groups/EvalyHelpDesk/
https://play.google.com/store/apps/details?id=bd.com.evaly.ebazar
https://evaly.com.bd/
https://evaly.com.bd/hot-deal
https://evaly.com.bd/premium-deal
https://evaly.com.bd/hot-deal
https://evaly.com.bd/premium-deal
https://evaly.com.bd/hot-deal
https://evaly.com.bd/campaign/shop/samsung-note-20-for-hot-deal/samsung-note20-for-hot-deal-058bbc
https://evaly.com.bd/premium-deal
https://evaly.com.bd/campaign/shop/rancon-motors-for-mega-deal-pod/rancon-motors-for-mega-deal-pod-be211b
https://evaly.com.bd/premium-deal
https://play.google.com/store/apps/details?id=bd.com.evaly.ebazar
https://evaly.com.bd/
https://play.google.com/store/apps/details?id=bd.com.evaly.evalyshop
https://apps.apple.com/app/id1504042677
https://www.facebook.com/evaly.com.bd/
https://www.instagram.com/evaly.com.bd/
https://www.youtube.com/channel/UCYxO44JS4_6CLXFKVmZJ7Vg
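
To see the filter in isolation, here is a minimal sketch that runs against a small inline HTML string instead of the live page. The markup and the example.com URLs are made up for illustration and are not taken from evaly.com.bd.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, used here instead of a live request.
html = """
<a href="https://example.com/a">absolute https</a>
<a href="http://example.com/b">absolute http</a>
<a href="/relative/path">relative</a>
<a href="#top">fragment</a>
<a>no href</a>
"""

soup = BeautifulSoup(html, "html.parser")

# A compiled pattern passed as an attribute filter is applied with search(),
# so the ^ anchor is what pins the match to the start of the href value.
https_only = re.compile(r"^https://")
links = [a["href"] for a in soup.find_all("a", href=https_only)]

print(links)  # ['https://example.com/a']
```

Anchors with no href attribute, relative paths, and plain http links are all skipped, because none of them match the anchored pattern.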

Upvotes: 1

Andrej Kesely

Reputation: 195428

You can use the .select() method with a CSS attribute selector:

import requests
from bs4 import BeautifulSoup


page = requests.get("https://evaly.com.bd/")
soup = BeautifulSoup(page.content, 'html.parser')

for link in soup.select('a[href^="https://"]'):
    print (link['href'])

Prints:

https://merchant.evaly.com.bd/
https://www.facebook.com/groups/EvalyHelpDesk/
https://play.google.com/store/apps/details?id=bd.com.evaly.ebazar
https://evaly.com.bd/
https://evaly.com.bd/hot-deal
https://evaly.com.bd/premium-deal
https://evaly.com.bd/hot-deal
https://evaly.com.bd/premium-deal
https://evaly.com.bd/hot-deal
https://evaly.com.bd/campaign/shop/samsung-note-20-for-hot-deal/samsung-note20-for-hot-deal-058bbc
https://evaly.com.bd/premium-deal
https://evaly.com.bd/campaign/shop/rancon-motors-for-mega-deal-pod/rancon-motors-for-mega-deal-pod-be211b
https://evaly.com.bd/premium-deal
https://play.google.com/store/apps/details?id=bd.com.evaly.ebazar
https://evaly.com.bd/
https://play.google.com/store/apps/details?id=bd.com.evaly.evalyshop
https://apps.apple.com/app/id1504042677
https://www.facebook.com/evaly.com.bd/
https://www.instagram.com/evaly.com.bd/
https://www.youtube.com/channel/UCYxO44JS4_6CLXFKVmZJ7Vg
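
The output above contains several repeated links. If only distinct URLs are wanted, one possible approach is to collect the hrefs and drop duplicates while keeping their first-seen order. This is a sketch on made-up inline HTML (the example.com URLs are placeholders), not on the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with one repeated https link.
html = """
<a href="https://example.com/a">first</a>
<a href="/relative/path">relative</a>
<a href="https://example.com/a">duplicate</a>
<a href="https://example.com/b">second</a>
<a href="http://example.com/c">plain http</a>
"""

soup = BeautifulSoup(html, "html.parser")

# [href^="https://"] selects anchors whose href value begins with https://
hrefs = [a["href"] for a in soup.select('a[href^="https://"]')]

# dict.fromkeys keeps the first occurrence of each key in insertion order.
unique_hrefs = list(dict.fromkeys(hrefs))

print(unique_hrefs)  # ['https://example.com/a', 'https://example.com/b']
```

A set would also remove duplicates, but it would not preserve the order in which the links appear on the page.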

Upvotes: 2
