Cedric H.

Reputation: 8288

Ruby data extraction from a text file

I have a relatively big text file with blocks of data layered like this:

ANALYSIS OF X SIGNAL, CASE: 1
TUNE X =  0.2561890123390808

    Line Frequency      Amplitude             Phase             Error         mx  my  ms  p

1 0.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00   1   0   0   0
2 0.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04   1   0   0   0

(they contain more lines and then are repeated)

I would like first to extract the numerical value after TUNE X = and output these in a text file. Then I would like to extract the numerical value of LINE FREQUENCY and AMPLITUDE as a pair of values and output to a file.

My question is the following: although I could get something more or less working using a simple regexp, I'm not convinced that it's the right way to do it, and I would like some advice or code examples showing how to do this efficiently in Ruby.

Upvotes: 6

Views: 4605

Answers (4)

the Tin Man

Reputation: 160551

There are lots of ways to do it. This is a simple first pass at it:

text = 'ANALYSIS OF X SIGNAL, CASE: 1
TUNE X =  0.2561890123390808

    Line Frequency      Amplitude             Phase             Error         mx  my  ms  p

1 0.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00   1   0   0   0
2 0.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04   1   0   0   0

ANALYSIS OF X SIGNAL, CASE: 1
TUNE X =  1.2561890123390808

    Line Frequency      Amplitude             Phase             Error         mx  my  ms  p

1 1.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00   1   0   0   0
2 1.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04   1   0   0   0

ANALYSIS OF X SIGNAL, CASE: 1
TUNE X =  2.2561890123390808

    Line Frequency      Amplitude             Phase             Error         mx  my  ms  p

1 2.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00   1   0   0   0
2 2.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04   1   0   0   0
'

require 'stringio'
pretend_file = StringIO.new(text, 'r')

That gives us a StringIO object we can pretend is a file. We can read from it by lines.

I changed the numbers a bit just to make it easier to see that they are being captured in the output.

pretend_file.each_line do |li|
  case

  when li =~ /^TUNE.+?=\s+(.+)/
    print $1.strip, "\n"

  when li =~ /^\d+\s+(\S+)\s+(\S+)/
    print $1, ' ', $2, "\n"

  end
end

For real use you'd want to change the print statements to a file handle: fileh.print

The output looks like:

# >> 0.2561890123390808
# >> 0.2561890123391E+00 0.204316425208E-01
# >> 0.2562865535359E+00 0.288712798671E-01
# >> 1.2561890123390808
# >> 1.2561890123391E+00 0.204316425208E-01
# >> 1.2562865535359E+00 0.288712798671E-01
# >> 2.2561890123390808
# >> 2.2561890123391E+00 0.204316425208E-01
# >> 2.2562865535359E+00 0.288712798671E-01
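
To illustrate the remark about swapping print for a file handle, here is a minimal sketch that sends the tunes and the frequency/amplitude pairs to two separate outputs. The helper name extract_tunes_and_pairs and the file names in the commented usage are made up for this example, and the data-row pattern additionally tolerates leading whitespace, which is an assumption about the real file.

```ruby
require 'stringio'

# Hypothetical helper (not part of the answer above): reads from any object
# that responds to each_line and writes TUNE values to tune_out and
# "frequency amplitude" pairs to pair_out.
def extract_tunes_and_pairs(input, tune_out, pair_out)
  input.each_line do |li|
    case li
    when /^TUNE.+?=\s+(\S+)/
      tune_out.puts $1
    when /^\s*\d+\s+(\S+)\s+(\S+)/
      # Data rows start with the line index; the next two fields are
      # frequency and amplitude.
      pair_out.puts "#{$1} #{$2}"
    end
  end
end

# With real files (the names are placeholders):
#
#   File.open('tunes.txt', 'w') do |tunes|
#     File.open('pairs.txt', 'w') do |pairs|
#       File.open('data.dat') { |f| extract_tunes_and_pairs(f, tunes, pairs) }
#     end
#   end
```

Passing IO objects in, rather than opening files inside the helper, means the same code works with StringIO while testing and with File handles in real use.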

Upvotes: 1

fl00r

Reputation: 83680

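# Each kind of value gets its own array. A chained assignment
# (a = b = c = []) would make all three names refer to the same array.
@tune_x = []
@frequency = []
@amplitude = []

File.foreach("data.dat") do |line|
  if line =~ /TUNE X\s*=\s*(\d*\.\d+)/
    @tune_x << $1
  else
    # Numbers in exponent notation, e.g. 0.2561890123391E+00 or -.161563284233E+03
    data = line.scan(/-?\d*\.\d+E[-+]\d+/)
    next if data.size < 2       # not a data row
    @frequency << data[0]       # first value after the line index
    @amplitude << data[1]       # second value
  end
end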

Upvotes: 1

kurumi

Reputation: 25599

Generally, (not tested)

toggle = false   # must be a real boolean: 0 is truthy in Ruby
File.foreach("file") do |line|
    if line[/TUNE/]
        puts line.split("=",2)[-1].strip
        next
    end
    if line[/Line Frequency/]
        toggle = true
        next
    end
    if toggle
        a = line.split
        puts "#{a[1]} #{a[2]}" if a.size >= 3   # skip blank lines
    end
end

Go through the file line by line. For lines matching /TUNE/, split on "=" and take the last item. For the line containing /Line Frequency/, set the toggle flag; this signifies that the following lines contain the data you want. Since the frequency and amplitude are in fields 2 and 3, split each of those lines and pick the respective positions. That is the general idea. As for toggling, you might want to reset the toggle flag at the start of the next block using a pattern (e.g. SIGNAL, CASE or ANALYSIS).
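
A rough sketch of the reset idea, assuming every block starts with a line containing ANALYSIS as in the question's sample. The method name parse_blocks and the hash layout are invented for illustration; values are kept as strings exactly as they appear in the file.

```ruby
require 'stringio'

# Hypothetical helper: groups the data per block and resets the flag
# whenever a new ANALYSIS header is seen.
def parse_blocks(io)
  blocks = []
  in_table = false
  io.each_line do |line|
    if line =~ /ANALYSIS/
      in_table = false                       # new block: stop reading rows
      blocks << { tune: nil, pairs: [] }
    elsif line =~ /TUNE/ && !blocks.empty?
      blocks.last[:tune] = line.split('=', 2)[-1].strip
    elsif line =~ /Line Frequency/
      in_table = true                        # rows follow the column header
    elsif in_table && !blocks.empty?
      fields = line.split
      next if fields.size < 3                # blank or malformed line
      blocks.last[:pairs] << [fields[1], fields[2]]
    end
  end
  blocks
end

# Usage with an in-memory sample (shortened numbers, for illustration only):
sample = <<~DATA
  ANALYSIS OF X SIGNAL, CASE: 1
  TUNE X =  0.25

      Line Frequency      Amplitude   Phase

  1 0.1E+00 0.2E-01 0.3E+03
  2 0.4E+00 0.5E-01 -.6E+03
DATA
parse_blocks(StringIO.new(sample)).each do |block|
  puts block[:tune]
  block[:pairs].each { |freq, amp| puts "#{freq} #{amp}" }
end
```

Keeping the pairs grouped under their tune makes it easy to write one output file per block later, if that turns out to be needed.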

Upvotes: 3

Kir

Reputation: 8111

You can read your file line by line and cut each line by character position, for example:

  • to extract TUNE X, take characters 10 through 27 of line 2
  • to extract LINE FREQUENCY, take characters 3 through 22 of line 6+n
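
A small sketch of the fixed-column idea in Ruby. The column ranges below are 0-based and were measured on the sample lines as they appear in the question, so they are assumptions: the real file may have leading whitespace or different field widths, and the offsets should be checked against it. The constant and method names are invented for this example.

```ruby
# Assumed 0-based character ranges, measured on the question's sample lines.
TUNE_COLS = 10..27   # value after "TUNE X =  "
FREQ_COLS = 2..20    # first numeric field of a data row
AMP_COLS  = 22..39   # second numeric field of a data row

# Returns the tune value as a string (empty if the line is too short).
def slice_tune(line)
  line[TUNE_COLS].to_s.strip
end

# Returns [frequency, amplitude] as strings.
def slice_pair(line)
  [line[FREQ_COLS].to_s.strip, line[AMP_COLS].to_s.strip]
end

puts slice_tune("TUNE X =  0.2561890123390808")
puts slice_pair("1 0.2561890123391E+00 0.204316425208E-01 0.164145385871E+03").join(' ')
```

Slicing by position is fast and avoids regular expressions, but it breaks silently if the column layout shifts, so it fits best when the file is known to be strictly fixed-width.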

Upvotes: 0
