Reputation: 11
I'm trying to scrape Google Reader but I'm running into problems. I want to log in to Google Reader, get a valid cookie, and then request this page:
'http://www.google.es/reader/atom/user/-/state/com.google/reading-list'
If my cookie works and I'm logged in, keeping the "user/-/" part of the URL should be enough
to get the XML (Atom) version of my Google Reader feed.
That's the theory. I log in to Google Reader, it redirects, I copy my SID, and I create a cookie manually with it, following the Google Reader API info at
http://code.google.com/p/pyrfeed/wiki/GoogleReaderAPI
name: SID
domain: .google.com
path: /
expires: 1600000000
With the cookie created, I try to fetch
'http://www.google.es/reader/atom/user/-/state/com.google/reading-list'
but it doesn't work. I think I'm building the cookie the wrong way. I've read the API docs for CookieJar
and Mechanize::Cookie
, but I can't find any example of how to use them. I've tried several approaches and none of them work. Can someone please show me how to use this cookie?
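Here is a minimal sketch of the flow I'm describing, assuming Ruby Mechanize 1.x (where CookieJar#add takes a URI and a cookie) and the ClientLogin endpoint from the GoogleReaderAPI wiki; the credentials are placeholders and this is untested:

    require 'mechanize'
    require 'uri'

    agent = Mechanize.new

    # 1. Get an SID from Google's ClientLogin endpoint (per the wiki).
    #    Email/Passwd are placeholders for real credentials.
    login = agent.post('https://www.google.com/accounts/ClientLogin',
                       'service' => 'reader',
                       'Email'   => 'me@example.com',
                       'Passwd'  => 'my-password')
    sid = login.body[/^SID=(.+)$/, 1]

    # 2. Build the SID cookie by hand with the fields from the wiki.
    cookie = Mechanize::Cookie.new('SID', sid)
    cookie.domain  = '.google.com'
    cookie.path    = '/'
    cookie.expires = Time.at(1_600_000_000)

    # 3. Add it to the agent's jar (Mechanize 1.x signature: add(uri, cookie)).
    agent.cookie_jar.add(URI.parse('http://www.google.com/'), cookie)

    # 4. The reading list should now come back as Atom XML.
    feed = agent.get('http://www.google.es/reader/atom/user/-/state/com.google/reading-list')
    puts feed.body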
Upvotes: 1
Views: 638
Reputation: 29
We do all our web scraping with iMacros (partly free/open source, partly commercial). That works well. No matter what you use, you need something that automates a real web browser. Other options are Selenium or Watir, although these are more geared towards web testing.
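If you go the browser-automation route, a minimal Watir sketch could look like the following; the login URL and the manual login step are assumptions, not tested code:

    require 'watir'

    # Open a real browser, complete the Google login (by hand or by filling
    # the form fields), then fetch the Atom feed with the same session.
    browser = Watir::Browser.new
    browser.goto 'https://www.google.com/accounts/ServiceLogin?service=reader'
    # ... log in here ...
    browser.goto 'http://www.google.es/reader/atom/user/-/state/com.google/reading-list'
    puts browser.html
    browser.close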
Upvotes: 1