sayth

Reputation: 7048

Trying to select element based on link with class list object - beautifulsoup

I am using BeautifulSoup 4.4 and Python 3.6.6. I have extracted all the links from the page, but I cannot work out how to print only the links whose attributes contain

'class': ['_self']

These are the attributes of one of the links that I want to capture from the list of links:

{'href': 'https://www.racingnsw.com.au/news/latest-racing-news/highway-sixtysix-on-right-route/', 'class': ['_self'], 'target': '_self'}

I cannot get the syntax right, even though my attempt looks like the attribute examples in the bs4 docs.

import requests as req
import json
from bs4 import BeautifulSoup

url = req.get(
    'https://www.racingnsw.com.au/media-news-premierships/latest-news/')

data = url.content

soup = BeautifulSoup(data, "html.parser")

links = soup.find_all('a')

for item in links:
    print(item['class']='self')

Upvotes: 0

Views: 304

Answers (1)

Pruthvi Kumar

Reputation: 898

BeautifulSoup supports CSS selectors through select(), which lets you pick elements based on the content of particular attributes. This includes the *= operator, which matches when the attribute value contains the given substring.

import requests as req
from bs4 import BeautifulSoup

url = req.get(
    'https://www.racingnsw.com.au/media-news-premierships/latest-news/')

data = url.content

soup = BeautifulSoup(data, "html.parser")

# *= is a substring match: select every <a> whose class attribute contains "_self"
for items in soup.select('a[class*="_self"]'):
    print(items)
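
Note that *= is a substring match, so it would also pick up a class such as "not_self_link". If you only want links whose class list contains exactly "_self", a class selector or find_all with class_ may be a better fit. The following is a minimal sketch of the difference, not a definitive answer: the HTML snippet is made up for illustration (it is not the real page), and the variable names are my own.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the live page.
html = """
<a href="/a" class="_self" target="_self">exact class</a>
<a href="/b" class="not_self_link">substring only</a>
<a href="/c" target="_self">no class attribute</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Substring match on the class attribute value (the approach shown above).
substring_hits = [a["href"] for a in soup.select('a[class*="_self"]')]

# Class selector: matches only when "_self" is one of the element's classes.
token_hits = [a["href"] for a in soup.select("a._self")]

# The same exact-class match using find_all and the class_ keyword.
find_all_hits = [a["href"] for a in soup.find_all("a", class_="_self")]

# Manual loop in the spirit of the question: bs4 exposes class as a list,
# so test membership; .get() avoids a KeyError on links without a class.
loop_hits = [a["href"] for a in soup.find_all("a")
             if "_self" in a.get("class", [])]

print(substring_hits)
print(token_hits)
print(find_all_hits)
print(loop_hits)
```

In the question's loop, item['class'] raises a KeyError for any link without a class attribute, which is why the sketch uses item.get('class', []) and a membership test instead.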

Upvotes: 3
