Reputation: 6121
I'm going to be accessing a number of accounts on Amazon's KDP - http://kdp.amazon.com/
My task is to login to each account and check the account's earnings. Mechanize works great for logging in and dealing with the cookies and such but the page which displays the account earnings uses javascript to dynamically populate the page.
I did a little bit of digging and found that the javascripts sends out the following request:
https://kdp.amazon.com/self-publishing/reports/transactionSummary?_=1326419839161&marketplaceID=ATVPDKIKX0DER
Along with a cookie which contains a session ID, a token, and some random stuff. Every time I click a link to display the results, the numerical part of the above GET url is different, even if it's the same link.
In response to the request, the browser then receives this (cut out a bunch of it so it doesn't take up the whole page):
{"iTotalDisplayRecords":13,"iTotalRecords":13,"aaData":[["12/03/2011","<span
title=\"Booktitle\">Hold That ...<\/span>","<span
title=\"Author\">Amy
....
<\/span>","B004PGMHEM","1","1","0","70%","4.47","0.06","4.47","0.01","0.00",""],["","","","","","","","","","","","","<div
class='grandtotal'>Total: $ 39.53<\/div>","Junk"]]}
I think I can use mechanize's cookie container to extract the cookies which are a part of that request but how do I figure out what that number is and how it's generated? The javascripts in the source code of the page seem cryptic on the best of days. Here's one of them:
http://kdp.amazon.com/DTPUIFramework/js/all-signin-thin.js
Is there a way to really track down what javascripts are running "behind the scenes" so to speak after I click on something on the page so that I can emulate that request in conjunction with mechanize?
Danke..
PS: I can't (or, rather, I don't want to) use watir for this task, because in theory I might be handling more than just a handful of accounts so this's gotta be pretty snappy.
Upvotes: 0
Views: 662
Reputation: 54984
It's just a timestamp and it's only used for cache busting. Try this:
Time.now.to_i.to_s
Upvotes: 1
Reputation: 160551
Mechanize doesn't run JavaScript that is embedded in the page. It only retrieves the HTML.
If the page contains JavaScript, Mechanize can see it and you can use Nokogiri, which Mechanize uses internally, to retrieve the <script>
tags' content. But, anything that would be loaded as a result of the JavaScript being executed in a browser will not run in Mechanize. Watir is the solution for that, because it drives the browser itself, which will interpret and run the JavaScript in the page.
You can step through the pages in a browser and look at the source code to get an idea what is running using FireBug. From that information you can get an understanding of what the JavaScript is doing, and then use Mechanize and Nokogiri to extract the needed information from a page that lets you build up your next URLs, but it can be a lot of work.
What you ask is similar to many other's questions regarding Mechanize and JavaScript. I'd recommend you look at these SO links to get alternate ideas:
Or search Stack Overflow for questions about Ruby, JavaScript and Mechanize.
Upvotes: 0