Antonialieb

Reputation: 47

Parsing links from website, and outputting a specific one as variable in Python

I'm stuck again with my first attempts at web scraping with Python.

import requests
from bs4 import BeautifulSoup

url = link
page = requests.get(url)
soup = BeautifulSoup(page.content, features="lxml")
checkout_link = []
links = soup.find_all("a")
for anchor in links:
    if anchor.get('href') is None:
        pass
    elif len(anchor.get('href')) >= 200:
        checklist += 10
        for search in links:
            # default to "" so tags without an href don't raise a TypeError
            if "checkout" in search.get("href", ""):
                checkout_link = search.get("href")
            else:
                pass
    else:
        pass

So this is my code right now. Parsing all the links works fine. (I want this part to count how many links are available in total, and I thought it would be a good idea to do both in a single request. Correct me if this is the wrong approach.) When I search for the checkout link and print it, the correct link is printed, but I can't find a way to store it in checkout_link so I can use it later. I want to make a request to this specific checkout URL afterwards.

Upvotes: 0

Views: 85

Answers (1)

QHarr

Reputation: 84465

You need to append it to the list. Assigning with = replaces the list with a single string instead of storing the link in it:

checkout_link.append(search.get("href"))
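
As a minimal sketch of the append approach: the inline `html` string and its URLs below are made up for illustration and stand in for the real page, and the built-in `html.parser` is used instead of `lxml` only so the snippet runs without an extra dependency. It also counts the links in the same pass, which is what the question was aiming for.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for page.content from the question.
html = """
<a href="/home">Home</a>
<a>No href here</a>
<a href="/cart/checkout?step=1">Checkout</a>
"""

soup = BeautifulSoup(html, "html.parser")
links = soup.find_all("a")

checkout_link = []
for search in links:
    href = search.get("href")  # None when the tag has no href attribute
    if href is not None and "checkout" in href:
        checkout_link.append(href)  # append keeps the list; "=" would replace it

total_links = len(links)  # total number of <a> tags found
print(total_links, checkout_link)
```

After the loop, `checkout_link[0]` holds the first match. If the href is relative, as in this made-up snippet, it would probably need to be joined with the page URL (for example with `urllib.parse.urljoin`) before being passed to `requests.get`.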

Consider filtering on href with a CSS attribute selector and the *= (contains) operator:

soup.select_one("[href*=checkout]")['href']
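
Note that `select_one` returns None when nothing matches, so indexing the result directly can fail. A small sketch of a guarded version, assuming a made-up `base_url` and inline `html` (neither is from the question) and using `html.parser` so it runs without `lxml`:

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# Hypothetical page URL and HTML, for illustration only.
base_url = "https://example.com/shop/"
html = '<a href="/home">Home</a><a href="/cart/checkout?step=1">Checkout</a>'

soup = BeautifulSoup(html, "html.parser")

# First element whose href contains "checkout", or None if there is none.
tag = soup.select_one("[href*=checkout]")

# Resolve a relative href against the page URL; stay None when nothing matched.
checkout_url = urljoin(base_url, tag["href"]) if tag is not None else None
print(checkout_url)
```

With a value stored this way, the follow-up request from the question would be along the lines of `requests.get(checkout_url)`, after checking that `checkout_url` is not None.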

Upvotes: 1
