Reputation: 101
I am trying to get all hrefs listed in a series of HTML element blocks. I don't know how to refer to the href in a selector, but I know the hrefs all begin with "/wiki/".
Is there a way to query the page for all hrefs that begin with this specific prefix?
Upvotes: 3
Views: 7218
Reputation: 57344
Nowadays, locators are preferred since they'll auto-wait for the elements to be attached:
wiki_links = page.locator('a[href^="/wiki/"]').evaluate_all(
    "els => els.map(el => el.href)"
)
You can also use .getAttribute("href") rather than .href if you don't want the base URL included.
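To make the difference between the two concrete, here is a minimal sketch, assuming `page` is a Playwright sync-API Page. The helper name `collect_wiki_links` and its `absolute` flag are hypothetical, made up for illustration only.

```python
def collect_wiki_links(page, absolute=True):
    """Return the href of every <a> whose href attribute starts with /wiki/.

    Hypothetical helper: `page` is assumed to be a Playwright sync-API Page.
    """
    # el.href is the DOM property, resolved by the browser to an absolute URL.
    # el.getAttribute("href") is the raw attribute text as written in the markup.
    expression = (
        "els => els.map(el => el.href)"
        if absolute
        else "els => els.map(el => el.getAttribute('href'))"
    )
    return page.locator('a[href^="/wiki/"]').evaluate_all(expression)
```

With `absolute=False` the returned strings should keep the leading "/wiki/" as it appears in the page source, which can be handy if you want to join them onto a base URL yourself.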
Upvotes: 0
Reputation: 3222
You can do:
hrefs_of_page = page.eval_on_selector_all("a[href^='/wiki/']", "elements => elements.map(element => element.href)")
which should work for your use case. This looks up all link tags whose href attribute starts with /wiki/. JavaScript is then evaluated on the browser side, mapping the array of elements to their href values, so a list of strings is returned on the Python side.
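For context, the one-liner above could sit in a complete script roughly like this. It is only a sketch under assumptions: it uses the sync API with Chromium, and `get_wiki_hrefs`, `main`, and the `url` argument are hypothetical names, not anything from the question.

```python
def get_wiki_hrefs(page):
    """Return the resolved href of every link whose href attribute starts with /wiki/."""
    return page.eval_on_selector_all(
        "a[href^='/wiki/']",
        "elements => elements.map(element => element.href)",
    )


def main(url):
    # Imported here so get_wiki_hrefs can be used with any page-like object.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)  # `url` is whatever page you want to scrape (assumption)
        hrefs = get_wiki_hrefs(page)
        browser.close()
    return hrefs
```

Calling `main(...)` with the address of the page you are scraping should give you a plain Python list of strings.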
Upvotes: 5