Reputation: 11007
This is a case of unfortunate input data, I think.
Given a file like the one found here, how can I preserve the necessary whitespace, as in the link? When I parse it using the code below, the final row gets compacted by being shifted to the left, and the dates get screwy (February gets 31 days, but December doesn't).
I know the code is doing exactly what I tell it: it is splitting on whitespace. Each row should keep a fixed number of columns, but since there are no delimiting characters, I'm not sure how to ask for what I want!
The code is as follows:
#!/usr/bin/env ruby
require 'open-uri'
require 'csv'

class MoonDataSeeder
  def initialize(year = nil)
    @year = year || Time.now.year
  end

  def seed
    convert_to_csv
  end

  private

  def convert_to_csv
    CSV.open('test_file', 'wb', :force_quotes => true, :skip_blanks => false) do |csv|
      feed_data.lines[-39..-7].each do |row|
        csv << row.split
      end
    end
  end

  def feed_data
    @feed_data ||= open(feed_uri).read
  end

  def feed_uri
    host = "http://aa.usno.navy.mil/cgi-bin/aa_moonill2.pl"
    host + "?form=2&year=#{year}&task=00&tz=0&tz_sign=-1"
  end

  def year
    @year
  end
end
Upvotes: 0
Views: 205
Reputation: 18813
What you're really doing is parsing fixed-width data, rather than delimited data (well, maybe it used to be tabs, but now it's unhelpful spaces). Try the fixedwidth gem instead.
Or, I'd just do it manually. This works on the lines containing the data:
data = lines.map do |line|
  line.strip!
  [].tap do |pieces|
    pieces << line.slice!(0, 3) # Day
    line.slice!(0, 4) # Space
    until line.empty?
      pieces << line.slice!(0, 4) # Month
      line.slice!(0, 5) # Space
    end
  end.map(&:strip)
end
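To see the difference on a concrete row, here is a small self-contained sketch. It swaps the destructive slicing above for String#unpack, which is a different way of cutting the same columns. The layout (a 3-character day, a 4-space gap, then 12 month fields of 4 characters separated by 5 spaces) is only an assumption taken from the widths in the snippet above, and `sample_line` and `parse_fixed` are made-up helper names. Unlike the snippet above, it does not strip leading whitespace, so it assumes the day starts in the first column.

```ruby
# Hypothetical layout, taken from the widths used above: a 3-character day,
# a 4-space gap, then 12 month fields of 4 characters separated by 5 spaces.
MONTHS = 12
FORMAT = "A3x4" + (["A4"] * MONTHS).join("x5")
LINE_WIDTH = 3 + 4 + MONTHS * 4 + (MONTHS - 1) * 5

# Build a made-up sample row; nil marks a month that has no such day.
def sample_line(day, values)
  fields = values.map { |v| v.to_s.ljust(4) }
  (day.to_s.rjust(2, "0").ljust(3) + " " * 4 + fields.join(" " * 5)).rstrip
end

def parse_fixed(line)
  # Pad first: the "x" directive raises if it skips past the end of the string.
  line.chomp.ljust(LINE_WIDTH).unpack(FORMAT).map(&:strip)
end

values = Array.new(MONTHS) { |i| format("0.%02d", i + 10) }
values[1] = nil # February has no day 30
line = sample_line(30, values)

naive = line.split        # blank February vanishes, later months shift left
fixed = parse_fixed(line) # February is kept as an empty string

puts naive.length
puts fixed.length
```

With `split`, the blank February field disappears and every later month moves one column to the left, which matches the symptom in the question. The fixed-width version keeps an empty string in that position.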
And just for fun, here's a version using regular expressions:
data = lines.map do |line|
  line.scan(/([\w. ]{4})( {4,5})?/).map(&:first)
end
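Either way you end up with an array of arrays, which can go straight into the CSV writer from the question. A minimal sketch of that last step, using made-up rows in place of the real feed and CSV.generate so the result stays in memory (the question writes to a file with CSV.open instead):

```ruby
require 'csv'

# Hypothetical parsed rows: the day first, then one entry per month,
# with "" where a month has no such day. Only three months are shown.
rows = [
  ["28", "0.10", "0.20", "0.30"],
  ["29", "0.11", "",     "0.31"],
]

# force_quotes writes every field in quotes, including the blank one,
# so each output line keeps the same number of columns.
csv_text = CSV.generate(force_quotes: true) do |csv|
  rows.each { |row| csv << row }
end

puts csv_text
```

Because the blank cell is written as an empty quoted field, reading the output back gives rows of equal length, and the months stay aligned.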
Upvotes: 1