Reputation: 21
I have a URL that contains many zip files, and I need to download local copies of them. So far I have:
require 'open-uri'
require 'pry'

def download_xml(url, dest)
  open(url) do |u|
    File.open(dest, 'wb') { |f| f.write(u.read) }
  end
end

urls = ["http://feed.omgili.com/5Rh5AMTrc4Pv/mainstream/posts/"]
urls.each { |url| download_xml(url, url.split('/').last) }
However, I can't seem to access the zip files at that location or loop through them. How would I loop through each zip file at that URL so that each one can be added to the array and downloaded by the method?
Upvotes: 2
Views: 3402
Reputation: 6613
I have used the Nokogiri gem to parse the HTML, so first install Nokogiri:
sudo apt-get install build-essential patch
sudo apt-get install ruby-dev zlib1g-dev liblzma-dev
sudo gem install nokogiri
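As a quick sanity check (not part of the solution itself), you can confirm the gem installed correctly by loading it and printing its version:

require 'nokogiri'
puts Nokogiri::VERSION # prints the installed version if the gem loads correctly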
A solution specific to your problem:
noko.rb
require 'rubygems'
require 'nokogiri'
require 'open-uri'

base_url = "http://feed.omgili.com/5Rh5AMTrc4Pv/mainstream/posts/"
page = Nokogiri::HTML(open(base_url)) # Open the web page with Nokogiri
puts page.class # => Nokogiri::HTML::Document

page.css('a').each do |file_link| # For each <a> HTML tag / link
  next unless file_link.text.end_with?(".zip") # Skip anything that is not a zip file

  link = base_url + file_link.text # Build the zip file's full URL
  puts link
  File.open(file_link.text, 'wb') do |file|
    file << open(link).read # Save the zip file to the current directory
  end
  puts file_link.text + " has been downloaded."
end
I have explained the code with comments. Ultimately, there is no way around it: you have to parse the HTML page, generate the download links one by one, and download each file at the end.
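As a small refinement (just a sketch, assuming the page's links may be relative or otherwise differ from their visible text), you can read the href attribute instead of the link text and resolve it with URI.join from the standard library, which is more robust than string concatenation:

require 'nokogiri'
require 'open-uri'
require 'uri'

base_url = "http://feed.omgili.com/5Rh5AMTrc4Pv/mainstream/posts/"
page = Nokogiri::HTML(open(base_url))

page.css('a').each do |a|
  href = a['href'].to_s
  next unless href.end_with?(".zip") # Only follow links to zip files

  link = URI.join(base_url, href).to_s # Resolve relative hrefs against the base URL
  File.open(File.basename(link), 'wb') do |file|
    file << open(link).read # Save under the file's own name in the current directory
  end
end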
Upvotes: 1