taniiit

Reputation: 53

Getting a specific value

Hey, I want to get the last href value in the navigation bar, which is "/cat/cremes-solaires-indice-30", but I get a "string index out of range" error. Thank you for helping me :)

import requests
from bs4 import BeautifulSoup as soup

PAGE_URL = "https://www.e.leclerc/fp/sunissime-bb-fluide-protecteur-anti-age-global-spf30-40ml-3508240006457"
HEADERS = {"User-Agent": "Mozilla/5.0"}  # assumed value; HEADERS was not shown in the post

def product():
    response = requests.get(PAGE_URL, headers=HEADERS)
    soupe = soup(response.content, 'html5lib')
    sauce = soupe.find_all("div", class_="product-breadcrumb")
    for x in sauce:
        a = x.select('[href]')
        print(type(a))
        for i in a:
            u = i.get("href")
            print(u[-2])  # this line raises "string index out of range"

Upvotes: 2

Views: 61

Answers (2)

Prince Kumar

Reputation: 49

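The variable `u` is a plain string, not a list or any other container, so it is overwritten on every iteration of the loop, and `u[-2]` indexes a character of that string rather than an entry in a list of links. You can either collect the hrefs in a list, or keep the last value of `u` available after the loop (for example by declaring it global).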
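To make the difference concrete, here is a minimal sketch with made-up href values (the list is hypothetical, not scraped from the page). It shows that indexing a string returns a single character and can fail on a very short href such as "/", which is one plausible cause of the error, while indexing the list returns a whole link.

```python
# Hypothetical href values standing in for the breadcrumb links.
hrefs = ["/", "/cat/soins-solaires", "/cat/cremes-solaires-indice-30"]

def second_to_last_char(href):
    """Mimic u[-2] from the question: index into the string itself."""
    try:
        return href[-2]
    except IndexError:
        return None  # the string has fewer than two characters

# Indexing each string yields one character, or fails for short hrefs.
chars = [second_to_last_char(h) for h in hrefs]
print(chars)

# Indexing the list yields the last whole href.
last_href = hrefs[-1]
print(last_href)
```

The `second_to_last_char` helper exists only for this illustration; in the real code you would simply index the list.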

Method 1:

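import requests
from bs4 import BeautifulSoup as soup

PAGE_URL = "https://www.e.leclerc/fp/sunissime-bb-fluide-protecteur-anti-age-global-spf30-40ml-3508240006457"
HEADERS = {"User-Agent": "Mozilla/5.0"}  # assumed value; not defined in the question

def product():
    global u  # keep the last href available outside the function
    response = requests.get(PAGE_URL, headers=HEADERS)
    soupe = soup(response.content, 'html5lib')
    sauce = soupe.find_all("div", class_="product-breadcrumb")
    for x in sauce:
        a = x.select('[href]')
        print(type(a))
        for i in a:
            u = i.get("href")  # overwritten on every iteration
        print(u)  # after the loop, u holds the last href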

Method 2:

import requests
from bs4 import BeautifulSoup as soup

PAGE_URL = "https://www.e.leclerc/fp/sunissime-bb-fluide-protecteur-anti-age-global-spf30-40ml-3508240006457"
HEADERS = {"User-Agent": "Mozilla/5.0"}  # assumed value; not defined in the question

def product():
    response = requests.get(PAGE_URL, headers=HEADERS)
    soupe = soup(response.content, 'html5lib')
    sauce = soupe.find_all("div", class_="product-breadcrumb")
    for x in sauce:
        a = x.select('[href]')
        u = []  # collect every href found in this breadcrumb
        print(type(a))
        for i in a:
            u.append(i.get("href"))
        print(u[-1])  # last href in the list

P.S.: Your code is missing several other things as well.
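If you want to try the idea without hitting the site, here is a rough offline sketch. The HTML snippet is made up to imitate a breadcrumb (the real page markup may differ), and `last_breadcrumb_href` is a hypothetical helper name. It uses a CSS selector to collect the links and returns `None` when nothing matches, instead of raising.

```python
from bs4 import BeautifulSoup

# Made-up markup imitating a breadcrumb; the real page may differ.
SAMPLE_HTML = """
<div class="product-breadcrumb">
  <a href="/">Accueil</a>
  <a href="/cat/soins-solaires">Soins solaires</a>
  <a href="/cat/cremes-solaires-indice-30">Cremes solaires indice 30</a>
</div>
"""

def last_breadcrumb_href(html):
    """Return the href of the last link in the breadcrumb, or None."""
    parsed = BeautifulSoup(html, "html.parser")
    # Only <a> tags that have an href and sit inside the breadcrumb div.
    links = parsed.select("div.product-breadcrumb a[href]")
    return links[-1]["href"] if links else None

print(last_breadcrumb_href(SAMPLE_HTML))
```

Checking that the list is non-empty before indexing with `[-1]` avoids an `IndexError` on pages without a breadcrumb.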

Upvotes: 1

Bhavya Parikh

Reputation: 3400

You can find the main `p` tag and call `find_all` on it for the `a` tags. This returns a list of tags; take the last one and extract its `href`.

from bs4 import BeautifulSoup
import requests

# Browser-like User-Agent header sent with the request.
headers = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"}
req = requests.get('https://www.e.leclerc/fp/sunissime-bb-fluide-protecteur-anti-age-global-spf30-40ml-3508240006457', headers=headers)

soup = BeautifulSoup(req.text, 'html.parser')

# href of the last link inside the breadcrumb paragraph.
soup.find("p", class_="wrapper d-inline-block").find_all("a")[-1]['href']

Output:

'/cat/cremes-solaires-indice-30'
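Note that `find` returns `None` when the tag is missing, so the chained call above would raise an `AttributeError` if the page layout changes. Below is a slightly more defensive sketch run against made-up markup (the snippet and the `last_link_in_wrapper` name are assumptions for illustration). It uses `select_one` with a CSS selector, which matches the two classes regardless of their order in the attribute.

```python
from bs4 import BeautifulSoup

# Made-up markup imitating the breadcrumb paragraph; the real page may differ.
SAMPLE_HTML = """
<p class="wrapper d-inline-block">
  <a href="/cat/soins-solaires">Soins solaires</a>
  <a href="/cat/cremes-solaires-indice-30">Cremes solaires indice 30</a>
</p>
"""

def last_link_in_wrapper(html):
    """Return the last href inside the wrapper paragraph, or None."""
    parsed = BeautifulSoup(html, "html.parser")
    wrapper = parsed.select_one("p.wrapper.d-inline-block")
    if wrapper is None:
        return None  # the paragraph was not found
    links = wrapper.find_all("a", href=True)
    return links[-1]["href"] if links else None

print(last_link_in_wrapper(SAMPLE_HTML))
```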

Upvotes: 2
