Reputation: 151
I am pretty new to Ruby and have only done some really basic text parsing so far. Now I am trying to parse a slightly more complicated file and write the result out to a CSV file (which I haven't done before), and I am quite stuck.
The file looks like this:
Title
some text
some different text
Publisher: name
Published Date: date
Number1: number
Number2: number
Number3: number
Category: category
----------------------
Title
some text
some different text
Publisher: name
Published Date: date
Number1: number
Number2: number
Number3: number
Category: category
----------------------
etc.
Each line of a record would become a new "column" in the CSV.
Would anyone please be able to lend a hand?
Thank you so much!
Upvotes: 15
Views: 40921
Reputation: 6089
Ruby 2.6.5
some_file = "/path/to/file.extension"
if File.exist?(some_file)
  # File.foreach closes the file once iteration finishes
  File.foreach(some_file) do |line|
    if line.include?('some_string')
      puts "line: #{line}"
    end
  end
end
Upvotes: 0
Reputation: 3130
Here's one complete solution. Note that it's very sensitive to the file structure!
out_file = File.open('your_csv_file.csv', 'w')
out_file.puts "Title,Publisher,Published Date,Number1,Number2,Number3,Category"
the_line = []
in_title = false
IO.foreach('your_file_name') do |line|
  if line =~ /^-+$/
    # Separator: write out the finished record and start a new one
    out_file.puts the_line.join(',')
    the_line = []
  elsif line =~ /^Title$/
    in_title = true
  elsif line =~ /^(?:Publishe(?:r|d Date)|Number\d|Category):\s+(.*?)$/
    the_line << $1
    in_title = false
  elsif in_title
    # Merge the free-text lines into the first column, quoted exactly once
    prev = the_line.empty? ? nil : the_line[0][1..-2]
    the_line[0] = "\"#{[prev, line.chomp].compact.join(' ')}\""
  else
    puts "Error: don't know what to do with line #{line}"
  end
end
out_file.close
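The solution above builds each row with join(','), which does not escape commas or quotes inside a field. As a rough sketch of the same line-by-line approach, the csv library that ships with Ruby can do the quoting instead. The names parse_records, write_csv, HEADER and FIELD are made up for this example, and the sketch assumes every record starts with a literal "Title" line and ends with a dashed separator, as in the question.

```ruby
require 'csv'

# Assumed column layout, taken from the sample file in the question.
HEADER = ['Title', 'Publisher', 'Published Date',
          'Number1', 'Number2', 'Number3', 'Category'].freeze
FIELD = /\A(?:Publisher|Published Date|Number\d|Category):\s+(.*)\z/

# Hypothetical helper: turns an enumerable of lines into arrays of column values.
def parse_records(lines)
  rows = []
  title_parts = []
  fields = []
  lines.each do |raw|
    line = raw.chomp
    case line
    when /\A-+\z/
      # Separator: finish the current record if it has any labelled fields
      rows << [title_parts.join(' '), *fields] unless fields.empty?
      title_parts = []
      fields = []
    when 'Title'
      # Marker line only; the title text follows on the next lines
    when FIELD
      fields << Regexp.last_match(1)
    else
      title_parts << line unless line.empty?
    end
  end
  # Keep a final record even if the file does not end with a separator
  rows << [title_parts.join(' '), *fields] unless fields.empty?
  rows
end

# Hypothetical helper: writes the parsed records with proper CSV quoting.
def write_csv(in_path, out_path)
  CSV.open(out_path, 'w') do |csv|
    csv << HEADER
    parse_records(File.foreach(in_path)).each { |row| csv << row }
  end
end
```

Calling write_csv('your_file_name', 'your_csv_file.csv') would then produce the file. Keeping the parsing separate from the writing makes it possible to check the rows in irb before anything is written to disk.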
Upvotes: 5
Reputation: 25599
Here's a general idea for you to start with:
File.open( thefile ).each do |line|
  print line without the new line if line does not contain /--+/
  if line contains /--+/
    print line with a new line
  end
end
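One way that idea might look as runnable Ruby is sketched below. The helper name join_records is made up for this example. It collects the lines of each record and joins them with commas when a dashed separator is reached. Labels such as "Publisher:" are kept as they are, and commas inside a line are not escaped, so treat it as a starting point only.

```ruby
# Hypothetical helper: joins the lines of each record into one comma-separated row.
def join_records(lines)
  rows = []
  current = []
  lines.each do |line|
    if line =~ /^--+/
      # Separator reached: the collected lines form one finished row
      rows << current.join(',')
      current = []
    else
      current << line.chomp
    end
  end
  rows
end

# Usage, assuming the input lives in a file called "input.txt":
#   puts join_records(File.foreach('input.txt'))
```

Lines after the last separator (the "etc." in the sample) are dropped, because a row is only emitted when a separator is seen.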
Upvotes: 26