jen

Reputation: 300

Ruby RSS module not reading full path

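I am downloading an RSS feed that is published as XML and saving it with the .rss extension. I then use the rss module to parse the saved file. The issue is that parsing fails when the path is built with File.join(Dir.pwd, 'page.rss'), but works with a bare filename or with the same full path written as a string literal.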

Here is my script:

require 'rss'   
require 'open-uri'

url = 'http://tutorialspoint.com/android/sampleXML.xml'

downloaded_file = File.join(Dir.pwd, 'page.rss')                 # FAILS

puts "Path = #{downloaded_file}" #=> "E:/Libraries/Documents/Android dev/page.rss"
downloaded_file = 'page.rss'                                     # WORKS
#downloaded_file = "E:/Libraries/Documents/Android dev/page.rss" # WORKS
puts "Used path/filename: #{downloaded_file}"

File.open(downloaded_file, 'wb') do |file|  # Download url content into rss file
  file << open(url).read 
end 

rss = RSS::Parser.parse(downloaded_file, false)  # Read rss from downloaded_file                                 
puts "Title: #{rss.channel.title}"

Upvotes: 0

Views: 70

Answers (1)

Jacob Brown

Reputation: 7561

NEW ANSWER

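Okay, so your downloaded_file string has been marked as tainted: Dir.pwd returns a tainted string, and File.join passes the taint on to its result. RSS::Parser won't open a tainted string as a file path; it treats the string itself as the feed source instead, which is why the parse fails (see rss/parser.rb, around line 105, for more details). That is also why the literal path works: string literals are not tainted. The solution is either to untaint the downloaded_file string before you call parse, e.g.: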

RSS::Parser.parse(downloaded_file.untaint, false)

or to just open the file for the parser, e.g.:

RSS::Parser.parse(File.open(downloaded_file), false)
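A third variant, sketched here under a few assumptions, is to read the file yourself and hand the parser the XML text, so it never has to decide whether a string is a path or feed content. This also avoids untaint, which was deprecated in Ruby 2.7 and later removed. The SAMPLE_FEED constant and the parse_feed_file helper are made up for illustration; the sample feed stands in for the downloaded page.rss so the snippet runs without network access.

```ruby
require 'rss'
require 'tmpdir'

# Hypothetical minimal feed, used in place of the real download.
SAMPLE_FEED = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Sample feed</title>
      <link>http://example.com/</link>
      <description>Placeholder channel for this sketch</description>
    </channel>
  </rss>
XML

# Hypothetical helper: read the file ourselves and pass the parser the
# XML text rather than a path string. The second argument (false) skips
# validation, as in the question's script.
def parse_feed_file(path)
  RSS::Parser.parse(File.read(path), false)
end

# Write the sample feed to a temporary directory and parse it back.
feed_title = Dir.mktmpdir do |dir|
  path = File.join(dir, 'page.rss')
  File.write(path, SAMPLE_FEED)
  parse_feed_file(path).channel.title
end

puts "Title: #{feed_title}"
```

In the original script this would amount to replacing the parse line with RSS::Parser.parse(File.read(downloaded_file), false).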

I'd never run into this issue before, so thanks! I'd heard of object tainting, but I never really had any reason to look into it. There is a bit more information about it here: "What are tainted objects, and when should we untaint them?"

PREVIOUS ANSWER

Dir.pwd is going to change depending on where you call the script from. Unless you are calling the script from E:/Libraries/Documents/Android dev, the filepath will be off.

It's better to build your filepath from the location of your script itself. To do so you can add:

ROOT = File.expand_path('..', __FILE__)
downloaded_file = File.join(ROOT, 'page.rss')
# or just downloaded_file = File.expand_path('../page.rss', __FILE__)
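The same idea can be wrapped in a small helper, shown here as a rough sketch. The name path_beside is hypothetical and not part of any library; it only combines File.expand_path, File.dirname and File.join.

```ruby
# Hypothetical helper: build a path to `name` in the same directory as
# `script_file`, instead of relying on the current working directory.
def path_beside(script_file, name)
  script_dir = File.dirname(File.expand_path(script_file))
  File.join(script_dir, name)
end

# Typical use from inside the script itself.
downloaded_file = path_beside(__FILE__, 'page.rss')
puts "Used path/filename: #{downloaded_file}"
```

On Ruby 2.0 and later, File.join(__dir__, 'page.rss') should give the same result when run from a script file, since __dir__ is the directory of the current source file.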

Upvotes: 2
