kay85

Reputation: 151

Ruby - parsing a text file

I am pretty new to Ruby and have only tried some really basic text parsing. I am now trying to parse a slightly more complicated file and write it out to a CSV file (which I haven't done before), and I am quite stuck.

The file looks as follows:

Title
some text
some different text
Publisher: name
Published Date: date
Number1: number
Number2: number
Number3: number
Category: category
----------------------
Title
some text
some different text
Publisher: name
Published Date: date
Number1: number
Number2: number
Number3: number
Category: category
----------------------

etc.

Each line within a record would become a new "column" in the CSV, and each record (the lines between two dashed separators) would become one row.

Would anyone please be able to lend a hand?

Thank you so much!

Upvotes: 15

Views: 40921

Answers (3)

Justin Vallely

Reputation: 6089

Ruby 2.6.5

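some_file = "/path/to/file.extension"

if File.exist?(some_file)
    # File.foreach reads line by line and closes the file when done
    File.foreach(some_file) do |line|
        # Only print lines that contain the text we are looking for
        if line.include?('some_string')
            puts "line: #{line}"
        end
    end
end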

Upvotes: 0

Elad

Reputation: 3130

Here's one complete solution. Note that it's very sensitive to the file structure!

out_file = File.open('your_csv_file.csv', 'w')
out_file.puts "Title,Publisher,Published Date,Number1,Number2,Number3,Category"
the_line = []
in_title = false
IO.foreach('your_file_name') do |line|
  if line =~ /^-+$/
    out_file.puts the_line.join(',')
    the_line = []
  elsif line =~ /^Title$/
    in_title = true
  elsif line =~ /^(?:Publishe(?:r|d Date)|Number\d|Category):\s+(.*?)$/
    the_line += [$1]
    in_title = false
  elsif in_title
    the_line[0] = (the_line.empty? ?  line.chomp : "\"#{the_line[0]} #{line.chomp}\"")
  else
    puts "Error: don't know what to do with line #{line}"
  end
end
out_file.close
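
Because the code above joins fields with a plain comma, a title or publisher that itself contains a comma or a quote would break the row. A minimal sketch of the same idea using Ruby's standard csv library, which handles the quoting, might look like this. The names records_to_csv, FIELD and HEADERS are made up for illustration, and the sketch assumes that every record ends with a line of dashes and that every unlabelled line in a record belongs to the title.

```ruby
require 'csv'

# Hypothetical names, not from the original answer.
HEADERS = ['Title', 'Publisher', 'Published Date', 'Number1', 'Number2', 'Number3', 'Category']
FIELD   = /\A(Publisher|Published Date|Number\d|Category):\s*(.*)\z/

# Takes the raw text of the input file and returns the CSV as a string.
def records_to_csv(text)
  CSV.generate do |csv|
    csv << HEADERS
    title  = []   # unlabelled lines of the current record
    fields = {}   # "Label" => value for the current record
    text.each_line(chomp: true) do |line|
      if line.match?(/\A-+\z/)
        # Separator line: write the finished record as one row and reset
        unless title.empty? && fields.empty?
          csv << [title.join(' ')] + HEADERS.drop(1).map { |h| fields[h] }
        end
        title  = []
        fields = {}
      elsif (m = FIELD.match(line))
        fields[m[1]] = m[2]
      elsif !line.strip.empty?
        # Assumption: anything unlabelled is part of the title
        title << line.strip
      end
    end
  end
end
```

It could then be used as File.write('your_csv_file.csv', records_to_csv(File.read('your_file_name'))). Trailing text that is not followed by a dashed line (such as a final "etc.") is silently dropped in this sketch.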

Upvotes: 5

kurumi

Reputation: 25599

Here's a general idea for you to start with

File.foreach(thefile) do |line|    # thefile: path to the input file
    if line =~ /--+/
        puts line           # separator line: print it followed by a newline
    else
        print line.chomp    # any other line: print it without its newline
    end
end
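
As a variation on this idea, the whole text can be cut at the separator lines first and each record's lines joined afterwards. The following is only a rough sketch: flatten_records is a made-up name, it drops the dashed lines instead of printing them, and it does no CSV quoting, so it is only safe when the fields contain no commas.

```ruby
# Hypothetical helper: returns one comma-joined string per record.
def flatten_records(text)
  text.split(/^-{2,}[ \t]*$/)                      # cut at each line of dashes
      .map { |chunk| chunk.lines.map(&:strip).reject(&:empty?) }
      .reject(&:empty?)                            # skip chunks with no content
      .map { |fields| fields.join(',') }
end
```

For example, puts flatten_records(File.read(thefile)) would print one line per record.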

Upvotes: 26
