Reputation: 285
At the most basic level, I want to scrape a website and render parts of its markup, such as all the H1s. I have used Nokogiri and Mechanize before and am familiar with the basics of scraping. In the past I would structure this as a Thor task, like this:
    class Scrape < Thor
      desc "cl_redding", "Scrape Craigslist for Rentals"
      def cl_redding
        # Load the app environment and the libraries the task uses
        require File.expand_path('config/environment.rb')
        require 'rubygems'
        require 'nokogiri'
        require 'open-uri'
        require 'mechanize'
        require 'yaml'
        require 'aws-sdk'
        require 'csv'
        require 'json'

        # Fetch the hard-coded Craigslist search page
        agent = Mechanize.new
        page = agent.get('http://redding.craigslist.org/search/apa?zoomToPosting=&catAbb=apa&query=&minAsk=&maxAsk=&bedrooms=&housing_type=&hasPic=1&excats=')
      end
    end
This all works, but it only scrapes Craigslist, because I hard-coded the URL in the page = line. What I am asking is: does anyone have advice on how to scrape a site whose URL is submitted through an input box on a web page? Specific help, tutorials, advice, or resources are welcome.
Upvotes: 0
Views: 160
Reputation: 2576
I think your question is a bit too generic.
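That said, here is a rough sketch of one way to go about it, assuming the input box simply submits a URL string to your app (for example as a form parameter). The helper names normalize_url and scrape_headings are made up for illustration and are not part of Mechanize or Nokogiri. The idea is to clean up and validate whatever the user typed, then pass it to agent.get in place of the hard-coded address:

```ruby
require 'uri'

# Hypothetical helper: turn the text typed into the input box into an
# http(s) URL string, or return nil if it cannot be used.
def normalize_url(input)
  text = input.to_s.strip
  return nil if text.empty?

  # Assume http when the user leaves the scheme off ("example.com")
  text = "http://#{text}" unless text =~ %r{\A[a-z][a-z0-9+.\-]*://}i

  uri = URI.parse(text)
  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes
  return nil unless uri.is_a?(URI::HTTP) && uri.host && !uri.host.empty?

  uri.to_s
rescue URI::InvalidURIError
  nil
end

# Hypothetical helper: fetch the user-supplied page and return the text
# of every element matching the selector (all H1s by default).
def scrape_headings(input, selector = 'h1')
  url = normalize_url(input)
  raise ArgumentError, "not a valid http(s) URL: #{input.inspect}" if url.nil?

  require 'mechanize' # loaded here so the URL helper works without the gem
  page = Mechanize.new.get(url)
  page.search(selector).map { |node| node.text.strip }
end
```

In a Rails controller you could then call something like scrape_headings(params[:url]) and render the returned array, where params[:url] is an assumed name for the form field. Keep in mind that fetching arbitrary user-supplied URLs from your server has security implications, so you will probably want stricter checks than this.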
Upvotes: 1