SIndhu

Reputation: 687

Trying to get an image from a page using YQL xpath

I'm trying to get the src of an IMDb image using YQL. I'm not sure what the XPath should be. Is it the XPath that Firebug gives you? Can you tell me why this fails and what the correct XPath is? Thanks

<!DOCTYPE html>
<html>
    <head>
        <title></title>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
         <script src="//ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>

    </head>
    <body>
        <script>

            $.getJSON(
            'http://query.yahooapis.com/v1/public/yql?callback=?',
            {
              q: 'select * from html where url="http://www.imdb.com/find?q=back+to+the+future&s=all" and xpath="/html/body/div[1]/div/div[4]/div[3]/div[1]/div/div[2]/table/tbody/tr[1]/td[1]/a/img"',

              format: 'json'
            },
            function(data) {
              console.log(data.query.results)
            }
          );

        </script>

        <div id='yqlresult'>

        </div>

    </body>

</html>

Upvotes: 0

Views: 599

Answers (1)

dirkk

Reputation: 6218

Well, it would help if you mentioned what you actually want to get back. For now, I will simply assume you are looking for the first picture in this list. You can get it using the following XPath, which not only works but is also much more stable than the XPath you provided. For example, what would happen if IMDb decided to change the layout or insert some div elements? Your XPath would most likely no longer match.

This, however, should work:

(//td[@class="primary_photo"]/a/img)[1]

It selects every img inside a link in a td with the class primary_photo and returns only the first one.
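
As a rough sketch of how this XPath could be dropped into the request from the question: the helper name buildYqlQuery is made up for illustration, and wrapping the YQL string literals in single quotes (so the double quotes inside the XPath need no escaping) is an assumption about what YQL accepts. The jQuery call is guarded so the snippet also runs outside a browser.

```javascript
// Hypothetical helper (not part of YQL or jQuery): builds the YQL statement.
// Assumption: YQL accepts single-quoted string literals, which lets the
// XPath keep its double quotes without escaping.
function buildYqlQuery(pageUrl, xpath) {
  return "select * from html where url='" + pageUrl + "' and xpath='" + xpath + "'";
}

var pageUrl = 'http://www.imdb.com/find?q=back+to+the+future&s=all';
var xpath = '(//td[@class="primary_photo"]/a/img)[1]';
var yql = buildYqlQuery(pageUrl, xpath);

console.log(yql);

// Only issue the request when jQuery is loaded (i.e. in the browser page).
if (typeof $ !== 'undefined' && typeof $.getJSON === 'function') {
  $.getJSON(
    'http://query.yahooapis.com/v1/public/yql?callback=?',
    { q: yql, format: 'json' },
    function (data) {
      // Same logging as in the question; inspect the result to find the src.
      console.log(data.query.results);
    }
  );
}
```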

By the way, the reason why your XPath is not working is explained here: Why does my XPath query (scraping HTML tables) only work in Firebug, but not the application I'm developing?
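
In short, as far as I understand the linked question, browsers add tbody elements to tables when building the DOM, so a path copied from Firebug can contain tbody steps that do not exist in the HTML a server-side tool receives. A minimal sketch of removing those steps follows; stripTbody is a hypothetical helper, and whether IMDb's own markup contains tbody is not something I have checked. Even with tbody removed, the long absolute path may still fail for other reasons, which is why the shorter XPath above is preferable.

```javascript
// Hypothetical helper: drops every "/tbody" or "/tbody[n]" step from an XPath.
// Assumption: the tbody steps were inserted by the browser and are absent
// from the raw HTML that the scraper sees.
function stripTbody(xpath) {
  return xpath.replace(/\/tbody(\[\d+\])?(?=\/|$)/g, '');
}

var firebugXPath =
  '/html/body/div[1]/div/div[4]/div[3]/div[1]/div/div[2]/table/tbody/tr[1]/td[1]/a/img';
var cleanedXPath = stripTbody(firebugXPath);

console.log(cleanedXPath);
```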

Upvotes: 1
