Reputation: 293
I have an HTML document that I need to scrape for certain strings. The document is a YouTube playlist. For example:
require 'nokogiri'
require 'open-uri'
# URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs
doc = Nokogiri::HTML(URI.open("https://www.youtube.com/playlist?list=PL11CE9468C379D2C8"))
When I view the HTML source code I can see the string I want.
<tr class="pl-video yt-uix-tile " data-title="Tyler The Creator - Yonkers" data-video-id="XSbZidsgMfw"
The string is what follows data-video-id, in quotation marks. This playlist has 7 videos, so there are 7 instances of this markup, each with a different data-video-id. How can I loop through them and save each of these strings to a @scraped_id variable?
The ID is saved using:
@video = @stream.videos.find_or_initialize_by(url: @scraped_id)
@video.save
Upvotes: 1
Views: 1369
Reputation: 2923
You can use a CSS selector to pick out all elements that have a data-video-id attribute, and then take the value of that attribute.
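doc.css("[data-video-id]").each do |el|
  # Read the attribute value, e.g. "XSbZidsgMfw"
  @scraped_id = el.attr('data-video-id')
  # Reuse an existing record for this ID, or build a new one
  @video = @stream.videos.find_or_initialize_by(url: @scraped_id)
  @video.save
end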
Upvotes: 1