Taimorr Mughal

Reputation: 89

Getting 401 response from scrapy Request

I am trying to extract table data from this page. After inspecting the requests in the browser's network tool, I found that an API call returns the table data I need, so I tried to mimic that request with Python Scrapy. Here are the code and the response message.

In [27]: url                                                                    
Out[27]: 'https://www.barchart.com/proxies/core-api/v1/quotes/get?symbol=MSFT&lists=stocks.inSector.all(-COSO)&fields=symbol,symbolName,weightedAlpha,lastPrice,priceChange,percentChange,highPrice1y,lowPrice1y,percentChange1y,tradeTime,symbolCode,symbolType,hasOptions&orderBy=weightedAlpha&orderDir=desc&meta=field.shortName,field.type,field.description&hasOptions=true&page=1&limit=100&raw=1'

In [28]: headers                                                                
Out[28]: {'X-XSRF-TOKEN': 'eyJpdiI6Ims2ZVJxT3pRRUplSCtLZXRVZXA3cXc9PSIsInZhbHVlIjoiaDJaQ0hhVWQwUU9zMEQ2S1FqVEVxR3hPYTJYRzd3d0VWWkZzMUhYQmRPSGVoaWVtTnBNUXZzdkJhTngvS2xNLyIsIm1hYyI6Ijc3MzY1N2M4ZDljMWQ4MDY4OTA5ZGQwNmUzYThiNDNkMDNlZDUyZmQ1Mjc4ZTU0MzkwMjA3ZDFmMDAwMTdkYTMifQ=='}

In [29]: fetch(scrapy.Request(url,headers=headers))                             
2021-03-03 12:12:55 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://www.barchart.com/proxies/core-api/v1/quotes/get?symbol=MSFT&lists=stocks.inSector.all(-COSO)&fields=symbol,symbolName,weightedAlpha,lastPrice,priceChange,percentChange,highPrice1y,lowPrice1y,percentChange1y,tradeTime,symbolCode,symbolType,hasOptions&orderBy=weightedAlpha&orderDir=desc&meta=field.shortName,field.type,field.description&hasOptions=true&page=1&limit=100&raw=1> (referer: None)

Am I missing something in the headers, or somewhere else?

Upvotes: 0

Views: 730

Answers (1)

Felix Eklöf

Reputation: 3730

When you visit https://www.barchart.com/stocks/quotes/MSFT/competitors you get a response with Set-Cookie headers for laravel-token and some other cookies. I tried all of the cookies, and laravel-token is the one used for auth. You also need the x-xsrf-token header that you've already extracted.

To solve your problem in Scrapy, first make sure cookies are enabled in settings.py. Then send a request to https://www.barchart.com/stocks/quotes/MSFT/competitors. In the parse method for that request, send the next request to the API URL from your question. Scrapy will then handle the cookies automatically.
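For reference, a minimal settings.py sketch for the cookie part, assuming a standard Scrapy project layout. COOKIES_ENABLED already defaults to True, so setting it only matters if it was turned off somewhere; COOKIES_DEBUG is optional and just logs the cookies sent and received, which helps to check that laravel-token is actually being passed along.

```python
# settings.py (sketch; only the cookie-related settings are shown)

# Let Scrapy's cookies middleware store cookies from responses and
# send them back on later requests. This is the default value.
COOKIES_ENABLED = True

# Optional: log Cookie / Set-Cookie headers for each request and response.
COOKIES_DEBUG = True
```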

Here's an example spider that worked for me (I extracted the XSRF token quite sloppily; you probably have a better way):

import re
from urllib.parse import unquote
import scrapy

class TestSpider(scrapy.Spider):
    name='testspider'
    
    def start_requests(self):
        yield scrapy.Request(
            url='https://www.barchart.com/stocks/quotes/MSFT/competitors',
        )

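    def parse(self, response):
        # Find the XSRF-TOKEN cookie among the Set-Cookie headers. The value is
        # URL-encoded, so unquote the header before matching.
        xsrf_token = None
        for set_cookie in response.headers.getlist('Set-Cookie'):
            match = re.search(r'XSRF-TOKEN=([^;]+)', unquote(set_cookie.decode('utf-8')))
            if match:
                xsrf_token = match.group(1)

        if xsrf_token is None:
            self.logger.error('XSRF-TOKEN cookie not found in response')
            return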

        yield scrapy.Request(
            url='https://www.barchart.com/proxies/core-api/v1/quotes/get?'\
                'symbol=MSFT&lists=stocks.inSector.all(-COSO)&fields=symb'\
                'ol,symbolName,weightedAlpha,lastPrice,priceChange,percen'\
                'tChange,highPrice1y,lowPrice1y,percentChange1y,tradeTime'\
                ',symbolCode,symbolType,hasOptions&orderBy=weightedAlpha&'\
                'orderDir=desc&meta=field.shortName,field.type,field.desc'\
                'ription&hasOptions=true&page=1&limit=100&raw=1',
            callback=self.parse_data,
            headers={
                'x-xsrf-token': xsrf_token
            }
        )
    
    def parse_data(self, response):
        pass

Output

2021-03-03 12:26:24 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.barchart.com/stocks/quotes/MSFT/competitors> (referer: None)
2021-03-03 12:26:24 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.barchart.com/proxies/core-api/v1/quotes/get?symbol=MSFT&lists=stocks.inSector.all(-COSO)&fields=symbol,symbolName,weightedAlpha,lastPrice,priceChange,percentChange,highPrice1y,lowPrice1y,percentChange1y,tradeTime,symbolCode,symbolType,hasOptions&orderBy=weightedAlpha&orderDir=desc&meta=field.shortName,field.type,field.description&hasOptions=true&page=1&limit=100&raw=1> (referer: https://www.barchart.com/stocks/quotes/MSFT/competitors)
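
If you want something a bit less sloppy than the regex for pulling out the token, here is one possible sketch. The helper name extract_xsrf_token is made up for illustration and is not part of Scrapy; it only assumes that each Set-Cookie header starts with a name=value pair followed by attributes separated by semicolons, and that the value is URL-encoded.

```python
from urllib.parse import unquote


def extract_xsrf_token(set_cookie_headers, cookie_name='XSRF-TOKEN'):
    """Return the URL-decoded value of cookie_name, or None if it is absent.

    Hypothetical helper: accepts an iterable of raw Set-Cookie header
    values as bytes or str.
    """
    for raw in set_cookie_headers:
        text = raw.decode('utf-8') if isinstance(raw, bytes) else raw
        # The first segment before ';' is the name=value pair; the rest
        # are attributes such as path or expires.
        name, _, value = text.split(';', 1)[0].partition('=')
        if name.strip() == cookie_name:
            return unquote(value.strip())
    return None
```

In the spider above it would be called as extract_xsrf_token(response.headers.getlist('Set-Cookie')), and the x-xsrf-token header would only be set when the result is not None.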

Upvotes: 2
