data_creator

Reputation: 33

API - Web Scrape

How do I get access to this API? This is what I'm doing:

import requests

url = 'https://b2c-api-premiumlabel-production.azurewebsites.net/api/b2c/page/menu?id_loja=2691'
print(requests.get(url))

I'm trying to retrieve data from this site via its API. I found the URL above and I can see its data in the browser, but I can't get the request right: I keep running into a 403 status code. This is the website URL: https://www.nagumo.com.br/osasco-lj46-osasco-ayrosa-rua-avestruz/departamentos

I'm trying to retrieve the item categories; they are visible to me in the browser, but I'm unable to fetch them. Later I'll use these categories to iterate over the products API.
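For what it's worth, a 403 here usually means the server is rejecting the request itself rather than the URL being wrong. A minimal sketch, assuming (as is common with API gateways) that the default `python-requests` User-Agent is what's being blocked; the header value and whether it helps are assumptions until tested against the server:

```python
import requests

# Build the same GET as above, but with a browser-like User-Agent
# (assumption: the gateway may reject the default python-requests agent).
url = 'https://b2c-api-premiumlabel-production.azurewebsites.net/api/b2c/page/menu'
req = requests.Request(
    'GET', url,
    params={'id_loja': '2691'},
    headers={'User-Agent': 'Mozilla/5.0'},
).prepare()

print(req.url)                    # full URL with the query string attached
print(req.headers['User-Agent'])  # Mozilla/5.0

# To actually send it and see the body (not just the status code):
# with requests.Session() as s:
#     r = s.send(req)
#     print(r.status_code)
#     print(r.json() if r.ok else r.text[:200])
```

Inspecting the prepared request like this also makes it easy to compare what your script sends against what the browser sends in its DevTools network tab.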

API Category

Note: please be gentle, it's my first post here =]

Upvotes: 3

Views: 825

Answers (3)

QHarr

Reputation: 84475

To get the data as shown in your image, the following headers and endpoint are needed:

import requests

headers = {
    'sm-token': '{"IdLoja":2691,"IdRede":884}',
    'User-Agent': 'Mozilla/5.0',
    'Referer': 'https://www.nagumo.com.br/osasco-lj46-osasco-ayrosa-rua-avestruz/departamentos',
}

params = {
    'id_loja': '2691',
}

r = requests.get('https://www.nagumo.com.br/api/b2c/page/menu', params=params, headers=headers)
r.json()

Upvotes: 1

kayak

Reputation: 11

I'm also new here haha, but besides the requests library, you'll also need another one like Beautiful Soup for what you're trying to do.

bs4 installation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-beautiful-soup

Once you install and import it, you can continue what you were doing to get your data:

response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

This gets the entire HTML content of the page, so you can pull your data out of it based on CSS selectors, like this:

site_data = soup.select('selector')

site_data is a list of the elements matching that 'selector', so a simple for loop appending items to a list would suffice (as an example, getting the link for each book on a bookstore site).

For example, if I were trying to get links from a site:

import requests
from bs4 import BeautifulSoup

sites = []
URL = 'https://b2c-api-premiumlabel-production.azurewebsites.net/api/b2c/page/menu?id_loja=2691'
response = requests.get(URL)
soup = BeautifulSoup(response.text, "html.parser")

links = soup.select("a")  # list of all elements matching this selector

for link in links:
    sites.append(link)

Also, a helpful tip: when you inspect the page (right-click and press 'Inspect' at the bottom of the menu), you can see the page's code. Find the data you want in the HTML, right-click it, and select Copy -> Copy selector. This makes it really easy to get the data you want on that site.
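The copy-selector tip can be sketched on a toy page; both the HTML and the selector below are made-up stand-ins for whatever DevTools actually gives you on the real site:

```python
from bs4 import BeautifulSoup

# Stand-in HTML for the real page; on the live site you would paste
# the selector DevTools produced via Copy -> Copy selector instead.
html = """
<div class="menu"><ul>
  <li><a href="/bebidas">Bebidas</a></li>
  <li><a href="/padaria">Padaria</a></li>
</ul></div>
"""
soup = BeautifulSoup(html, "html.parser")

# Hypothetical selector matching the links inside the menu.
categories = [a.get_text(strip=True) for a in soup.select("div.menu a")]
print(categories)  # ['Bebidas', 'Padaria']
```

Selectors copied from DevTools are often over-specific (long chains of nth-child), so it's usually worth trimming them down to a stable class or tag like the one above.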

helpful sites:
https://oxylabs.io/blog/python-web-scraping
https://realpython.com/beautiful-soup-web-scraper-python/

Upvotes: 0

max_settings

Reputation: 69

Not sure exactly what your issue is here, but if you want to see the content of the response and not just the 200/400-style status codes, you need to add '.content' to your print.

E.g.:

import requests

# Create a session
s = requests.Session()

# Example connection variables, probably not required for your use case.
setCookieUrl = 'https://www...'
HeadersJson = {'Accept-Language': 'en-us'}
bodyJson = {"__type": "xxx", "applicationName": "xxx", "userID": "User01", "password": "password2021"}

# GET request (otherUrl, otherBodyJson and otherHeadersJson are placeholders for your own values)
p = s.get(otherUrl, json=otherBodyJson, headers=otherHeadersJson)
print(p)  # prints the response object (200 etc.)
#print(p.headers)
#print(p.content)  # prints the content of the response
#print(s.cookies)

Upvotes: 0
