Nyxynyx

Reputation: 63647

Scrapy extracting the wrong IMG SRC

I'm trying to use Scrapy to get the URL of the image with ID HERO_PHOTO on a page. The target element has the following HTML:

<img alt="Photo of Gray Line" style="position: relative; left: -50px; top: 0px;" id="HERO_PHOTO" class="flexibleImage" src="https://media-cdn.tripadvisor.com/media/photo-s/04/71/70/7c/gray-line-tours-montreal.jpg" width="352" height="260">

In the Chrome browser console, running

$('#HERO_PHOTO').attr('src')

grabs the URL correctly:

"https://media-cdn.tripadvisor.com/media/photo-s/04/71/70/7c/gray-line-tours-montreal.jpg"

Problem: However, using the following selectors in Scrapy,

response.css('#HERO_PHOTO::attr(src)').extract_first()

and

response.css('#HERO_PHOTO').xpath('@src').extract_first()

and

response.css('#HERO_PHOTO[src]').extract_first()

is giving us

https://static.tacdn.com/img2/x.gif

Using .extract() also returned the same incorrect URL.

Why is Scrapy grabbing a different SRC value?

Upvotes: 1

Views: 1142

Answers (2)

paul trmbrth

Reputation: 20748

The image links are in the page, but not directly in the <img> tags. They are filled in afterwards by some JavaScript code. There is a JavaScript snippet inside the HTML with the image links you want (reformatted a bit):

...
}(window,ta));
</script>
<script type="text/javascript">
var lazyImgs = [{
    "data": "//maps.google.com/maps/api/staticmap?&channel=ta.desktop&zoom=15&size=340x225&client=gme-tripadvisorinc&sensor=falselanguageParam&center=45.503395,-73.573174&maptype=roadmap&&markers=icon:http%3A%2F%2Fc1.tacdn.com%2Fimg2%2Fmaps%2Ficons%2Fpin_v2_CurrentCenter.png|45.503395,-73.57317&signature=FqI7Z1egbpsVrlEE0yjw9HmsMJ8=",
    "scroll": false,
    "tagType": "img",
    "id": "lazyload_1098682971_0",
    "priority": 500,
    "logerror": false
}, {
    "data": "//ad.atdmt.com/i/img;p=11007200799198;cache=?ord=1475487471489",
    "scroll": false,
    "tagType": "img",
    "id": "lazyload_1098682971_1",
    "priority": 1000,
    "logerror": false
}, {
    "data": "//ad.doubleclick.net/ad/N4764.TripAdvisor/B7050081;sz=1x1?ord=1475487471489",
    "scroll": false,
    "tagType": "img",
    "id": "lazyload_1098682971_2",
    "priority": 1000,
    "logerror": false
}, {
    "data": "https://static.tacdn.com/img2/maps/icons/spinner24.gif",
    "scroll": false,
    "tagType": "img",
    "id": "lazyload_1098682971_3",
    "priority": 100,
    "logerror": false
}, {
    "data": "https://media-cdn.tripadvisor.com/media/photo-s/04/71/70/7c/gray-line-tours-montreal.jpg",
    "scroll": false,
    "tagType": "img",
    "id": "HERO_PHOTO",
    "priority": 100,
    "logerror": false
}, {
    "data": "https://media-cdn.tripadvisor.com/media/photo-s/0c/f5/19/98/montreal-night-tour.jpg",
    "scroll": false,
    "tagType": "img",
    "id": "THUMB_PHOTO1",
    "priority": 100,
    "logerror": false
}, {
    "data": "https://media-cdn.tripadvisor.com/media/photo-s/0c/f5/19/8f/montreal-night-tour.jpg",
    "scroll": false,
    "tagType": "img",
    "id": "THUMB_PHOTO2",
    "priority": 100,
    "logerror": false
}, {
    "data": "https://static.tacdn.com/img2/generic/site/no_user_photo-v1.gif",
    "scroll": false,
    "tagType": "img",
    "id": "lazyload_1098682971_4",
    "priority": 100,
    "logerror": false
}...

One way to parse this is to use js2xml:

from pprint import pprint

import js2xml

# get the text content of every inline `<script>`
for js in response.xpath('.//script[@type="text/javascript"]/text()').extract():
    try:
        jstree = js2xml.parse(js)

        # look for assignment of `var lazyImgs`
        for imgs in jstree.xpath('//var[@name="lazyImgs"]/*'):

            # use js2xml.make_dict() -- poor name I know
            # to build a useful Python object
            data = js2xml.make_dict(imgs)

            pprint(data)

            break

    except Exception:
        # skip scripts that js2xml cannot parse
        pass

This is what you get out:

[{'data': '//maps.google.com/maps/api/staticmap?&channel=ta.desktop&zoom=15&size=340x225&client=gme-tripadvisorinc&sensor=falselanguageParam&center=45.503395,-73.573174&maptype=roadmap&&markers=icon:http%3A%2F%2Fc1.tacdn.com%2Fimg2%2Fmaps%2Ficons%2Fpin_v2_CurrentCenter.png|45.503395,-73.57317&signature=FqI7Z1egbpsVrlEE0yjw9HmsMJ8=',
  'id': 'lazyload_-1977833463_0',
  'logerror': False,
  'priority': 500,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/maps/icons/spinner24.gif',
  'id': 'lazyload_-1977833463_1',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-s/04/71/70/7c/gray-line-tours-montreal.jpg',
  'id': 'HERO_PHOTO',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-s/0c/f5/19/98/montreal-night-tour.jpg',
  'id': 'THUMB_PHOTO1',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-s/0c/f5/19/8f/montreal-night-tour.jpg',
  'id': 'THUMB_PHOTO2',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/generic/site/no_user_photo-v1.gif',
  'id': 'lazyload_-1977833463_2',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/08/38/19/cb/gayle-h.jpg',
  'id': 'lazyload_-1977833463_3',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_01.png',
  'id': 'lazyload_-1977833463_4',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/rev_02.png',
  'id': 'lazyload_-1977833463_5',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
  'id': 'lazyload_-1977833463_6',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
  'id': 'lazyload_-1977833463_7',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/b1/32/93/holidays1958.jpg',
  'id': 'lazyload_-1977833463_8',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_04.png',
  'id': 'lazyload_-1977833463_9',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/rev_04.png',
  'id': 'lazyload_-1977833463_10',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
  'id': 'lazyload_-1977833463_11',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
  'id': 'lazyload_-1977833463_12',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
  'id': 'lazyload_-1977833463_13',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-o/06/4d/bc/f6/disneybus.jpg',
  'id': 'lazyload_-1977833463_14',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_06.png',
  'id': 'lazyload_-1977833463_15',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/rev_06.png',
  'id': 'lazyload_-1977833463_16',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
  'id': 'lazyload_-1977833463_17',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
  'id': 'lazyload_-1977833463_18',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
  'id': 'lazyload_-1977833463_19',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/2e/70/a7/avatar078.jpg',
  'id': 'lazyload_-1977833463_20',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/rev_01.png',
  'id': 'lazyload_-1977833463_21',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
  'id': 'lazyload_-1977833463_22',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
  'id': 'lazyload_-1977833463_23',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/2e/70/9f/avatar070.jpg',
  'id': 'lazyload_-1977833463_24',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_02.png',
  'id': 'lazyload_-1977833463_25',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/rev_03.png',
  'id': 'lazyload_-1977833463_26',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
  'id': 'lazyload_-1977833463_27',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
  'id': 'lazyload_-1977833463_28',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/03/9f/a6/94/facebook-avatar.jpg',
  'id': 'lazyload_-1977833463_29',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_04.png',
  'id': 'lazyload_-1977833463_30',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/rev_05.png',
  'id': 'lazyload_-1977833463_31',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
  'id': 'lazyload_-1977833463_32',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
  'id': 'lazyload_-1977833463_33',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
  'id': 'lazyload_-1977833463_34',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/06/f3/32/86/complsv.jpg',
  'id': 'lazyload_-1977833463_35',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_04.png',
  'id': 'lazyload_-1977833463_36',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/rev_05.png',
  'id': 'lazyload_-1977833463_37',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
  'id': 'lazyload_-1977833463_38',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
  'id': 'lazyload_-1977833463_39',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
  'id': 'lazyload_-1977833463_40',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/05/f2/4d/68/christine-n.jpg',
  'id': 'lazyload_-1977833463_41',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_03.png',
  'id': 'lazyload_-1977833463_42',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/rev_04.png',
  'id': 'lazyload_-1977833463_43',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
  'id': 'lazyload_-1977833463_44',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
  'id': 'lazyload_-1977833463_45',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
  'id': 'lazyload_-1977833463_46',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/2e/70/80/avatar001.jpg',
  'id': 'lazyload_-1977833463_47',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_03.png',
  'id': 'lazyload_-1977833463_48',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/rev_04.png',
  'id': 'lazyload_-1977833463_49',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
  'id': 'lazyload_-1977833463_50',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
  'id': 'lazyload_-1977833463_51',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
  'id': 'lazyload_-1977833463_52',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/0a/45/46/e2/tracey-g.jpg',
  'id': 'lazyload_-1977833463_53',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/lvl_06.png',
  'id': 'lazyload_-1977833463_54',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/rev_06.png',
  'id': 'lazyload_-1977833463_55',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/FunLover.png',
  'id': 'lazyload_-1977833463_56',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/badges/20px/Appreciated.png',
  'id': 'lazyload_-1977833463_57',
  'logerror': False,
  'priority': 100,
  'scroll': False,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/icons/gray_flag.png',
  'id': 'lazyload_-1977833463_58',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-f/02/6d/40/b2/montreal-amphi-bus-tour.jpg',
  'id': 'lazyload_-1977833463_59',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/39/2d/43/old-montreal-walking.jpg',
  'id': 'lazyload_-1977833463_60',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/06/df/96/c7/excursions-montreal-private.jpg',
  'id': 'lazyload_-1977833463_61',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/02/ad/57/0a/filename-p1010076-jpg.jpg',
  'id': 'lazyload_-1977833463_62',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-o/04/b5/6a/8d/ali-l.jpg',
  'id': 'lazyload_-1977833463_63',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/2e/70/87/avatar008.jpg',
  'id': 'lazyload_-1977833463_64',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-o/06/8a/c5/7d/leonard-d.jpg',
  'id': 'lazyload_-1977833463_65',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-o/05/6d/32/ca/rpm13111.jpg',
  'id': 'lazyload_-1977833463_66',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/2e/70/87/avatar008.jpg',
  'id': 'lazyload_-1977833463_67',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/neighborhood/icon_hood_white.png',
  'id': 'lazyload_-1977833463_68',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/oyster/500/08/5b/34/b0/sherbrooke-street-west-shopping--.jpg',
  'id': 'lazyload_-1977833463_69',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/maps/icons/icon_mapControl_expand_idle_30x30.png',
  'id': 'lazyload_-1977833463_70',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/maps/icons/icon_mapControl_expand_hover_30x30.png',
  'id': 'lazyload_-1977833463_71',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/a1/f2/6b/marche-atwater.jpg',
  'id': 'lazyload_-1977833463_72',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/01/41/78/a3/mcgill-university-lower.jpg',
  'id': 'lazyload_-1977833463_73',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/04/06/16/08/musee-grevin.jpg',
  'id': 'lazyload_-1977833463_74',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/03/4a/9a/85/laurie-raphael.jpg',
  'id': 'lazyload_-1977833463_75',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/09/45/53/16/cafe-humble-lion.jpg',
  'id': 'lazyload_-1977833463_76',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://media-cdn.tripadvisor.com/media/photo-l/03/2f/37/03/essence.jpg',
  'id': 'lazyload_-1977833463_77',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/branding/logo_with_tagline.png',
  'id': 'LOGOTAGLINE',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'},
 {'data': 'https://static.tacdn.com/img2/icons/bell.png',
  'id': 'lazyload_-1977833463_78',
  'logerror': False,
  'priority': 100,
  'scroll': True,
  'tagType': 'img'}]
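
To get from that list to the one URL the question asks about, you can look the entry up by its "id". The following is only a minimal sketch: `image_url_by_id` is a hypothetical helper name, and `sample` is a two-entry excerpt of the output above standing in for the full `data` list.

```python
def image_url_by_id(lazy_imgs, element_id):
    """Return the 'data' URL of the entry whose 'id' matches, or None."""
    for entry in lazy_imgs:
        if entry.get("id") == element_id:
            return entry.get("data")
    return None


# Two entries copied from the output above (the real list is much longer).
sample = [
    {"data": "https://static.tacdn.com/img2/maps/icons/spinner24.gif",
     "id": "lazyload_-1977833463_1",
     "logerror": False,
     "priority": 100,
     "scroll": False,
     "tagType": "img"},
    {"data": "https://media-cdn.tripadvisor.com/media/photo-s/04/71/70/7c/gray-line-tours-montreal.jpg",
     "id": "HERO_PHOTO",
     "logerror": False,
     "priority": 100,
     "scroll": False,
     "tagType": "img"},
]

hero_url = image_url_by_id(sample, "HERO_PHOTO")
print(hero_url)
```

Some entries hold protocol-relative URLs (starting with "//"); inside a spider callback, response.urljoin(url) should turn those into absolute URLs.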

Upvotes: 3

ddeamaral

Reputation: 1443

I believe you are using the wrong CSS selector. According to W3Schools, the [src] attribute selector seems to select the attribute you want.

Try this.

response.css('#HERO_PHOTO[src]').extract_first()

My next suggestion is to see what you get without extract_first(): check whether the URL is in the return value of response.css('#HERO_PHOTO[src]').

EDIT: I think the issue is that you are querying the page source, not the rendered HTML. Here's a link to what I believe is happening:

This question's first answer

You are querying what the server responded with, not what JavaScript has had a chance to manipulate.
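
If the real URLs are embedded in the raw response as an inline script (as the `var lazyImgs = [...]` snippet quoted in the other answer suggests), they can be read without rendering the page. This is a rough sketch using only the standard library; it assumes the array literal happens to be valid JSON, which may not hold for every page. `extract_lazy_imgs` and `sample_html` are made-up names for illustration, and the sample markup is a cut-down stand-in for the real page.

```python
import json
import re


def extract_lazy_imgs(html):
    """Map element id -> real image URL from an inline `var lazyImgs = [...]`."""
    match = re.search(r"var\s+lazyImgs\s*=\s*", html)
    if match is None:
        return {}
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (the rest of the script and page).
    entries, _end = json.JSONDecoder().raw_decode(html, match.end())
    return {entry["id"]: entry["data"] for entry in entries}


# Cut-down stand-in for the server response: the <img> tag carries the
# placeholder, while the real URL sits in the script.
sample_html = """
<img id="HERO_PHOTO" src="https://static.tacdn.com/img2/x.gif">
<script type="text/javascript">
var lazyImgs = [{
    "data": "https://media-cdn.tripadvisor.com/media/photo-s/04/71/70/7c/gray-line-tours-montreal.jpg",
    "scroll": false,
    "tagType": "img",
    "id": "HERO_PHOTO",
    "priority": 100,
    "logerror": false
}];
</script>
"""

urls = extract_lazy_imgs(sample_html)
print(urls.get("HERO_PHOTO"))
```

In a Scrapy callback the same function could be called as extract_lazy_imgs(response.text). If the script is not strict JSON, json raises a ValueError, and a JavaScript-aware parser is the safer choice.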

Upvotes: 0
