roypierinni

Reputation: 11

How do I scrape Google Reader with Mechanize (using cookies)?

I'm trying to scrape Google Reader, but I've run into problems. I want to log in to Google Reader, get a valid cookie, and then open this page:

'http://www.google.es/reader/atom/user/-/state/com.google/reading-list'

If my cookie works and I'm logged in, I only need to put "user/-/" in the URL and it should return the XML version of my Google Reader reading list.

That's the theory. I log in to Google Reader and it redirects me. Then I copy my SID and manually create a cookie from it, using the information in the Google Reader API notes:

http://code.google.com/p/pyrfeed/wiki/GoogleReaderAPI

name SID
domain .google.com
path /
expires 1600000000

With the cookie created, I try to open:

'http://www.google.es/reader/atom/user/-/state/com.google/reading-list'

But it doesn't work. I think I'm creating the cookie the wrong way. I've read the API docs for CookieJar and Mechanize::Cookie, but I can't find any example of how to use them. I've tried several approaches and none of them work. Can someone please show me how to use this cookie?
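
For reference, here is a minimal sketch of the request described above, written with Ruby's standard net/http instead of Mechanize's CookieJar (a swapped-in technique, not the Mechanize API). It assumes the SID copied after login is enough to authenticate, which is not confirmed here; build_reader_request and fetch_reading_list are hypothetical helper names. One thing worth checking: a cookie jar only sends a cookie scoped to .google.com to hosts under google.com, so it would not be sent to www.google.es. Setting the header by hand sidesteps that matching.

```ruby
require 'net/http'
require 'uri'

READING_LIST_URL =
  'http://www.google.es/reader/atom/user/-/state/com.google/reading-list'

# Build a GET request that carries the SID cookie by hand.
# Only name=value goes into the Cookie request header; domain, path and
# expires are attributes a cookie jar uses to decide when to send a cookie.
def build_reader_request(sid, url = READING_LIST_URL)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri.request_uri)
  request['Cookie'] = "SID=#{sid}"
  [uri, request]
end

# Send the request and return the response (performs network I/O).
# Assumption: the SID value alone authenticates the session.
def fetch_reading_list(sid)
  uri, request = build_reader_request(sid)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end
```

Calling fetch_reading_list with the copied SID performs the actual request. A redirect or an error status in the response would suggest that the SID alone is not accepted.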

Upvotes: 1

Views: 638

Answers (1)

yc08m

Reputation: 29

We do all our web scraping with iMacros (partly free/open source, partly commercial). That works well. No matter what you use, you need something that automates a real web browser. Other options are Selenium or Watir, although these are more geared towards web testing.

Upvotes: 1
