user3204659

Reputation: 41

Scrapy, scraping price data from StubHub

I've been having a difficult time with this one.

I want to scrape all the prices listed for this Bruno Mars concert at the Hollywood Bowl so I can get the average price.

http://www.stubhub.com/bruno-mars-tickets/bruno-mars-hollywood-hollywood-bowl-31-5-2014-4449604/

I've located the prices in the HTML, and the XPath is pretty straightforward, but I cannot get any values to return.

I think it has something to do with the content being generated via JavaScript or AJAX, but I can't figure out how to send the correct request to get the code to work.

Here's what I have:

from scrapy.spider import BaseSpider
from scrapy.selector import Selector

from deeptix.items import DeeptixItem

class TicketSpider(BaseSpider):
    name = "deeptix"
    allowed_domains = ["stubhub.com"]
    start_urls = ["http://www.stubhub.com/bruno-mars-tickets/bruno-mars-hollywood-hollywood-bowl-31-5-2014-4449604/"]

    def parse(self, response):
        sel = Selector(response)
        # Each price is expected inside a div whose class contains "q_cont".
        sites = sel.xpath('//div[contains(@class, "q_cont")]')
        items = []
        for site in sites:
            item = DeeptixItem()
            item['price'] = site.xpath('span[contains(@class, "q")]/text()').extract()
            items.append(item)
        return items

Any help would be greatly appreciated. I've been struggling with this one for quite some time now. Thank you in advance!

Upvotes: 2

Views: 4416

Answers (1)

alecxe

Reputation: 474161

According to the Chrome network console, there is an AJAX request that loads all of the info about the event, including ticket information. That is why the XPath finds nothing: the prices are not in the HTML that Scrapy downloads.

You don't need Scrapy at all. urllib2 can fetch the data, and the json module can extract the ticket prices:

import json
from pprint import pprint
import urllib2

url = 'http://www.stubhub.com/ticketAPI/restSvc/event/4449604'

data = json.load(urllib2.urlopen(url))
tickets = data['eventTicketListing']['eventTicket']

prices = [ticket['tc']['amount'] for ticket in tickets]
pprint(sorted(prices))

This prints:

[156.0,
 159.0,
 169.0,
 175.0,
 175.0,
 194.5,
 199.0,
 ...
]
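
Since the question asks for the average price, here is a minimal sketch of that last step. It is written for Python 3 (where urllib2 became urllib.request) and makes a few assumptions: the JSON keeps the eventTicketListing / eventTicket / tc / amount layout used above, and the number at the end of the event page URL is the event ID that the ticket API expects (it matches for this event, but I have not checked others). The names event_id_from_url, average_price, API_TEMPLATE and SAMPLE_DATA are made up for illustration, and SAMPLE_DATA is a hand-written stand-in built from the first four prices printed above, not a real API response.

```python
import re
from statistics import mean

# URL pattern taken from the request above; {event_id} is filled in per event.
API_TEMPLATE = "http://www.stubhub.com/ticketAPI/restSvc/event/{event_id}"


def event_id_from_url(page_url):
    """Return the trailing number of an event page URL.

    Assumption: that trailing number is the event ID used by the ticket API.
    """
    match = re.search(r"-(\d+)/?$", page_url)
    if match is None:
        raise ValueError("no trailing numeric ID in %r" % page_url)
    return match.group(1)


def average_price(data):
    """Average the ticket amounts in an already-parsed JSON response."""
    tickets = data["eventTicketListing"]["eventTicket"]
    return mean(ticket["tc"]["amount"] for ticket in tickets)


# Hand-written stand-in for the parsed response (NOT real API output).
SAMPLE_DATA = {
    "eventTicketListing": {
        "eventTicket": [
            {"tc": {"amount": 156.0}},
            {"tc": {"amount": 159.0}},
            {"tc": {"amount": 169.0}},
            {"tc": {"amount": 175.0}},
        ]
    }
}

page = ("http://www.stubhub.com/bruno-mars-tickets/"
        "bruno-mars-hollywood-hollywood-bowl-31-5-2014-4449604/")
print(API_TEMPLATE.format(event_id=event_id_from_url(page)))
print(average_price(SAMPLE_DATA))  # 164.75 for the four sample prices
```

With the real response, pass the dict returned by json.load to average_price instead of SAMPLE_DATA.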

Hope that helps.

Upvotes: 1
