miranda_

Reputation: 33

Scraping HTML table with Ruby and Nokogiri

I'm working on a project that scrapes data from a website with gun accident/death data. Here's what the website looks like: http://www.gunviolencearchive.org/officer-involved-shootings

I'm trying to grab each table row, build an object (an instance? Sorry, I'm new to Ruby) from the data in that row, and print it to the console. Right now, the @occurances array contains the same data 26 times. Clearly every entry is being filled with the first row. How would you suggest I store each of these instances?

Here is my code; choice is the website address.

def self.data_from_choice(choice)
  doc = Nokogiri::HTML(open(choice))
  @occurances = []
  doc.xpath("//tr").each do |x|
    date = doc.css("td")[0].text
    state = doc.css("td")[1].text
    city = doc.css("td")[2].text
    deaths = doc.css("td")[4].text
    injured = doc.css("td")[5].text
    source = doc.search(".links li.last a").attr("href").value
    @occurances << {:date => date, :state => state, :city => city, :deaths => deaths, :injured => injured, :source => source}
  end
  puts @occurances
end

Upvotes: 3

Views: 1245

Answers (1)

matt

Reputation: 79733

In the loop for each row you are calling doc.css(...). This searches from the top of the document (i.e. from doc) every time, so doc.css("td")[0] is always the first td in the whole document, no matter which row the loop is on. What I think you want is to make the search relative to the current row, which you have in the x variable.

So change this:

date = doc.css("td")[0].text

to this:

date = x.css("td")[0].text

and similarly for state, city, etc. The doc.search(...) call for source has the same problem and should also start from x.

Upvotes: 2
