RoshanPisharody

Reputation: 59

Scraping JSON Data from a REST API

I am learning Firebase with Android and I need a database to play with. This is the JSON request URL: https://yts.ag/api/v2/list_movies.json. It contains a list of around 5000 movies that I need. I searched around the internet and found a tool called Scrapy, but I have no idea how to use it with a REST API. Any help is appreciated.

Upvotes: 0

Views: 3683

Answers (2)

eLRuLL

Reputation: 18799

First you'll need to follow the Scrapy Tutorial to create a Scrapy project. After that, your spider can be as simple as this:

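import json

from scrapy import Request, Spider


class MySpider(Spider):
    name = 'myspider'

    start_urls = ['https://yts.ag/api/v2/list_movies.json']

    def parse(self, response):
        # The endpoint returns JSON, so decode the body instead of using selectors
        json_response = json.loads(response.body)
        for movie in json_response['data']['movies']:
            # Follow each movie's URL and handle the result in parse_movie
            yield Request(movie['url'], callback=self.parse_movie)

    def parse_movie(self, response):
        # work with every movie response
        yield {'url': response.url}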

Upvotes: 3

neverlastn

Reputation: 2204

Very easy. Follow the tutorial and start from the URL of your REST endpoint. In your parse() or parse_item() method, use json.loads(response.body) to load the JSON document. Since Scrapy can now take plain dicts as items, your code might be as simple as:

import json
...

def parse(self, response):
    return json.loads(response.body)
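
If you only want the individual movie entries rather than the whole document, a minimal sketch could look like the one below. It assumes the body has the data -> movies shape used in the other answer. iter_movies is a hypothetical helper name, and the sample payload is made up for illustration, not real API output.

```python
import json


def iter_movies(body):
    """Yield one dict per movie from a raw JSON response body (bytes or str)."""
    payload = json.loads(body)
    # Fall back to empty containers so a missing or null key yields nothing
    # instead of raising KeyError/TypeError.
    data = payload.get('data') or {}
    for movie in data.get('movies') or []:
        yield movie


# Inside a spider, the callback would simply delegate to the helper:
#
#     def parse(self, response):
#         yield from iter_movies(response.body)

# Made-up payload in the assumed shape, for trying the helper without Scrapy.
sample = b'{"data": {"movies": [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}]}}'
print([movie['url'] for movie in iter_movies(sample)])
```

Keeping the JSON handling in a plain function means it can be tried without running a crawl, and each yielded dict becomes one item in Scrapy's output.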

Here's a slightly more advanced use case.

Upvotes: 0
