Daniel Glover

Reputation: 21

Download files from URL with Ruby

I have a URL that contains many zip files, and I need to download local copies of them. So far I have:

require 'open-uri'
require 'pry'

def download_xml(url, dest)
  open(url) do |u|
    File.open(dest, 'wb') { |f| f.write(u.read) }
  end
end

urls = ["http://feed.omgili.com/5Rh5AMTrc4Pv/mainstream/posts/"]

urls.each { |url| download_xml(url, url.split('/').last) }

However, I can't seem to access the zip files that are at that location or loop through them. How would I loop through each zip file at the end of that URL so that they can be accessed in that array and downloaded by the method?

Upvotes: 2

Views: 3402

Answers (1)

mertyildiran

Reputation: 6613

I used the Nokogiri gem to parse the HTML, so first install Nokogiri:

sudo apt-get install build-essential patch
sudo apt-get install ruby-dev zlib1g-dev liblzma-dev
sudo gem install nokogiri

A solution specific to your problem:

noko.rb

require 'rubygems'
require 'nokogiri'
require 'open-uri'

page = Nokogiri::HTML(open("http://feed.omgili.com/5Rh5AMTrc4Pv/mainstream/posts/")) # Open web address with Nokogiri
puts page.class   # => Nokogiri::HTML::Document

page.css('a').each do |file_link| # For each <a> HTML tag / link
  next unless file_link.text.end_with?(".zip") # Skip anything that is not a zip file

  link = "http://feed.omgili.com/5Rh5AMTrc4Pv/mainstream/posts/" + file_link.text # Build the zip file's link
  puts link
  File.open(file_link.text, 'wb') do |file|
    file << open(link).read # Save the zip file to the current directory
  end
  puts file_link.text + " has been downloaded."
end

I have explained the code with comments.

In the end, there is no choice but to parse the HTML, build the download links one by one, and download each file.

Upvotes: 1
