Reputation: 300
I am downloading an RSS feed that is served as XML and saving it with the .rss extension. I then use the rss module to parse it as an RSS file. The issue I have is the following:
If I create the file (page.rss) with a relative path and pass just that filename to the RSS parsing function, everything works (downloaded_file = 'page.rss').
If I manually type the full path into the script (downloaded_file = "E:/Libraries/Documents/Android dev/page.rss"), everything also works.
But if I "calculate" the absolute path with downloaded_file = File.join(Dir.pwd, 'page.rss'), the RSS function fails. The printed value of the variable looks the same ("E:/Libraries/Documents/Android dev/page.rss"), so there must be an invisible difference in how the RSS function treats this string. I would like to be able to use the calculated absolute path. How can I track down the difference? Thanks for any suggestions.
Here is my script:
require 'rss'
require 'open-uri'
url = 'http://tutorialspoint.com/android/sampleXML.xml'
downloaded_file = File.join(Dir.pwd, 'page.rss') # FAILS
puts "Path = #{downloaded_file}" #=> "E:/Libraries/Documents/Android dev/page.rss"
downloaded_file = 'page.rss' # WORKS
#downloaded_file = "E:/Libraries/Documents/Android dev/page.rss" # WORKS
puts "Used path/filename: #{downloaded_file}"
File.open(downloaded_file, 'wb') do |file| # Download url content into rss file
  file << open(url).read
end
rss = RSS::Parser.parse(downloaded_file, false) # Read rss from downloaded_file
puts "Title: #{rss.channel.title}"
Upvotes: 0
Views: 70
Reputation: 7561
NEW ANSWER
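Okay, so your downloaded_file string has been marked as tainted (most likely because it was built from Dir.pwd, whose return value is tainted), and RSS::Parser won't open a tainted file string for some reason (see rss/parser.rb about l. 105 for more details). Instead of reading the file, it appears to treat the string itself as the feed content, which is why parsing fails. The solution is either to untaint the downloaded_file string before you call parse, e.g.: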
RSS::Parser.parse(downloaded_file.untaint, false)
or to just open the file for the parser, e.g.:
RSS::Parser.parse(File.open(downloaded_file), false)
I'd never run into this issue before, so thanks! I'd heard of object tainting, but I never had a reason to look into it. There is a bit more information about it here: "What are tainted objects, and when should we untaint them?"
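A variation on the second option, in case it helps: read the file yourself and hand the contents to the parser, so the parser never has to decide whether the path string can be trusted. This is only a minimal sketch. The SAMPLE_FEED text and the parse_feed_file helper are made up for illustration, and it assumes the rss library is installed (on newer Rubies it ships as a separate gem).

```ruby
require 'rss'
require 'tmpdir'

# Hypothetical minimal feed, standing in for the downloaded page.rss.
SAMPLE_FEED = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Sample feed</title>
      <link>http://example.com/</link>
      <description>Placeholder channel</description>
    </channel>
  </rss>
XML

# Hypothetical helper: read the file first and pass its contents (not the
# path) to the parser. The second argument disables validation, as in the
# original script.
def parse_feed_file(path)
  RSS::Parser.parse(File.read(path), false)
end

Dir.mktmpdir do |dir|
  # A computed absolute path, like File.join(Dir.pwd, 'page.rss') in the question.
  downloaded_file = File.join(dir, 'page.rss')
  File.write(downloaded_file, SAMPLE_FEED)

  rss = parse_feed_file(downloaded_file)
  puts "Title: #{rss.channel.title}"
end
```

File.read also closes the file for you, which File.open without a block does not.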
PREVIOUS ANSWER
Dir.pwd is going to change depending on where you call the script from. Unless you are calling the script from E:/Libraries/Documents/Android dev, the filepath will be off.
It's better to build your filepath from the location of your script itself. To do so you can add:
ROOT = File.expand_path('..', __FILE__)
downloaded_file = File.join(ROOT, 'page.rss')
# or just downloaded_file = File.expand_path('../page.rss', __FILE__)
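To see the difference between the two approaches, here is a small sketch that changes the working directory and compares the results. The path_next_to helper is a made-up name for illustration, and the script path is resolved once up front on the assumption that __FILE__ may be relative.

```ruby
require 'tmpdir'

# Hypothetical helper: resolve a filename relative to the directory that
# contains the given script, rather than the current working directory.
def path_next_to(script, filename)
  File.expand_path(File.join('..', filename), script)
end

# Resolve the script location once, before any chdir, because __FILE__
# can be a relative path.
script = File.expand_path(__FILE__)
before = path_next_to(script, 'page.rss')

Dir.mktmpdir do |dir|
  Dir.chdir(dir) do
    # Dir.pwd now points at the temp directory, so a Dir.pwd-based path
    # moves with it, while the script-based path stays the same.
    after = path_next_to(script, 'page.rss')
    puts "Dir.pwd based: #{File.join(Dir.pwd, 'page.rss')}"
    puts "Script based:  #{after}"
    puts "Script-based path unchanged: #{before == after}"
  end
end
```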
Upvotes: 2