How to get the link from an onclick value in BeautifulSoup?

Question

I need help scraping a link to an image that is stored in the onclick= value. I tried the following, but I'm stuck on how to strip everything from onclick except the link.

links = soup.find('div', class_='workshopItemPreviewImageMain')
links = links.findChild('a', attrs={'onclick': re.compile("^https://")})

But nothing is output.

links = soup.find('div', class_='workshopItemPreviewImageMain')
links = links.findChild('a')
links = links.get("onclick")

The entire value of onclick is displayed:

ShowEnlargedImagePreview( 'https://steamuserimages-a.akamaihd.net/ugc/794261971268711656/69C39CF2A2BBCDDC7C04C17DF1E88A6ED875DBE7/' )

But I only need the link.

user5386938 · Accepted Answer

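You just need to change your regular expression. The pattern ^https:// is anchored to the start of the attribute value, but the value starts with the JavaScript function name, so it never matches. Drop the anchor and capture the quoted URL instead.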

from bs4 import BeautifulSoup
import re

# Capture a quoted http(s) URL; the closing quote must match the opening one.
pattern = re.compile(r'''(?P<quote>['"])(?P<href>https?://.+?)(?P=quote)''')

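# Minimal sample markup, reconstructed from the question.
data = '''
<div class="workshopItemPreviewImageMain">
    <a onclick="ShowEnlargedImagePreview( 'https://steamuserimages-a.akamaihd.net/ugc/794261971268711656/69C39CF2A2BBCDDC7C04C17DF1E88A6ED875DBE7/' )"></a>
</div>
'''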

soup = BeautifulSoup(data, 'html.parser')

div = soup.find('div', class_='workshopItemPreviewImageMain')

links = div.find_all('a', {'onclick': pattern})

for a in links:
    print(pattern.search(a['onclick']).group('href'))
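
If you already have the onclick string (for example from links.get("onclick"), as in the question), the same idea can be applied to the string directly. The following is only a minimal sketch: the names URL_IN_QUOTES and extract_url are made up for illustration, and the function name inside the sample string is assumed to be ShowEnlargedImagePreview. It uses just the standard re module.

```python
import re

# Matches a quoted http(s) URL; the quote character is captured so that the
# same character must close the string.
URL_IN_QUOTES = re.compile(r"""(?P<quote>['"])(?P<href>https?://.+?)(?P=quote)""")


def extract_url(onclick):
    """Return the first quoted URL in an onclick value, or None if there is none."""
    if not onclick:
        return None
    match = URL_IN_QUOTES.search(onclick)
    return match.group("href") if match else None


# The onclick value from the question (hypothetical variable name).
onclick = (
    "ShowEnlargedImagePreview( "
    "'https://steamuserimages-a.akamaihd.net/ugc/794261971268711656/"
    "69C39CF2A2BBCDDC7C04C17DF1E88A6ED875DBE7/' )"
)

print(extract_url(onclick))

# An anchored pattern such as ^https:// cannot match here, because the value
# starts with the JavaScript function name rather than the URL.
print(re.search(r"^https://", onclick))  # None
```

Returning None when the attribute is missing or holds no URL lets the caller skip such tags instead of hitting an AttributeError on a failed match.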
