Seth

Reputation: 364

Ruby: Using an array list in order to select specific columns

I'm new to Ruby. Here is the script. I would like to use the selector array on line 10 instead of fields[0], etc. How can I do that?

For this example the data is embedded in the script. Don't hesitate to correct me if I'm doing something wrong when opening or writing a file, or anything else. I like to learn.

#!/usr/bin/ruby

filename = "/tmp/log.csv"

selector = [0, 3, 5, 7]

out = File.open(filename + ".rb.txt", "w")
DATA.each_line do |line|
        fields = line.split("|")
        columns = fields[0], fields[3], fields[5], fields[7]
        puts columns.join("|")
        out.puts(columns.join("|"))
end
out.close


__END__
20180704150930|rtsp|645645643|30193|211|KLM|KLM00SD624817.ts|172.30.16.34|127299264|VERB|01780000|21103|277|server01|OK
20180704150931|api|456456546|30130|234|VC3|VC300179201139.ts|172.30.16.138|192271838|VERB|05540000|23404|414|server01|OK
20180704150931|api|465456786|30154|443|BAD|BAD004416550.ts|172.30.16.50|280212202|VERB|04740000|44301|18|server01|OK
20180704150931|api|5437863735|30157|383|VSS|VSS0011062009.ts|172.30.16.66|312727922|VERB|05700000|38303|381|server01|OK
20180704150931|api|3453432|30215|223|VAE|VAE00TF548197.ts|172.30.16.74|114127126|VERB|05060000|22305|35|server01|OK
20180704150931|api|312121|30044|487|BOV|BOVVAE00549424.ts|172.30.16.58|69139448|VERB|05300000|48708|131|server01|OK
20180704150931|rtsp|453432123|30127|203|GZD|GZD0900032066.ts|172.30.16.58|83164150|VERB|05460000|20303|793|server01|OK
20180704150932|api|12345348|30154|465|TYH|TYH0011224259.ts|172.30.16.50|279556843|VERB|04900000|46503|241|server01|OK
20180704150932|api|4343212312|30154|326|VAE|VAE00TF548637.ts|172.30.16.3|28966797|VERB|04740000|32601|969|server01|OK
20180704150932|api|312175665|64530|305|TTT|TTT000000011852.ts|172.30.16.98|47868183|VERB|04740000|30501|275|server01|OK

Upvotes: 2

Views: 339

Answers (2)

Cary Swoveland

Reputation: 110745

Let's begin with a more manageable example. First note that if your string is held by the variable data, each line of the string contains the same number (14) of vertical bars ('|'). Let's reduce that to the first 4 lines of data, with each line terminated immediately before the 6th vertical bar:

str = data.each_line.map { |line| line.split("|").first(6).join("|") }.first(4).join("\n")
puts str
20180704150930|rtsp|645645643|30193|211|KLM
20180704150931|api|456456546|30130|234|VC3
20180704150931|api|465456786|30154|443|BAD
20180704150931|api|5437863735|30157|383|VSS

We need to also modify selector (arbitrarily):

selector = [0, 3, 4]

Now on to answering the question.

There is no need to divide the string into lines, split each line on the vertical bars, select the elements of interest from the resulting array, join the latter with a vertical bar and then lastly join the whole shootin' match with a newline (whew!). Instead, simply use String#gsub to remove all unwanted characters from the string.

terms_per_row = str.each_line.first.count('|') + 1
  #=> 6
r = /
    (?:^|\|)  # match the beginning of a line or a vertical bar in a non-capture group
    [^|\n]+   # match one or more characters other than a vertical bar or newline
    /x        # free-spacing regex definition mode

line_idx = -1
new_str = str.gsub(r) do |s|
  line_idx += 1
  selector.include?(line_idx % terms_per_row) ? s : ''
end
puts new_str
20180704150930|30193|211
20180704150931|30130|234
20180704150931|30154|443
20180704150931|30157|383

Lastly, we write new_str to a file (fname is assumed to hold the output path):

File.write(fname, new_str)
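For anyone who wants to run the steps above end to end, here is a minimal sketch that wraps them in a method. The name keep_terms is hypothetical, and the sketch assumes every row has the same number of fields and that no field is empty.

```ruby
# Hypothetical wrapper around the gsub approach shown above.
# Assumes equal-length rows and no empty fields.
def keep_terms(str, selector)
  terms_per_row = str.each_line.first.count("|") + 1
  term = /(?:^|\|)[^|\n]+/   # a field, with its leading bar if it has one
  idx = -1
  str.gsub(term) do |s|
    idx += 1
    # keep the term only if its position within the row is selected
    selector.include?(idx % terms_per_row) ? s : ""
  end
end

str = "20180704150930|rtsp|645645643|30193|211|KLM\n" \
      "20180704150931|api|456456546|30130|234|VC3"
puts keep_terms(str, [0, 3, 4])
# prints:
# 20180704150930|30193|211
# 20180704150931|30130|234
```

One thing to watch for: each kept term carries its own leading bar, so if selector does not include 0 every output row starts with a vertical bar (for example, keep_terms("a|b|c", [1]) returns "|b").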

Upvotes: 2

dimitry_n

Reputation: 3019

You can get the fields at specific indices using Ruby's splat operator and Array#values_at, like so:

columns = fields.values_at(*selector)
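To make the effect concrete, here is a small sketch run against the first sample row from the question. The helper name pick_columns is hypothetical and only there for illustration.

```ruby
# Hypothetical helper: split a pipe-delimited line and keep the fields
# at the given indices, in the order the indices are listed.
def pick_columns(line, selector)
  fields = line.chomp.split("|")
  # *selector expands the array into separate arguments,
  # i.e. fields.values_at(0, 3, 5, 7)
  fields.values_at(*selector)
end

selector = [0, 3, 5, 7]
line = "20180704150930|rtsp|645645643|30193|211|KLM|KLM00SD624817.ts|" \
       "172.30.16.34|127299264|VERB|01780000|21103|277|server01|OK\n"

puts pick_columns(line, selector).join("|")
# prints: 20180704150930|30193|KLM|172.30.16.34
```

Note that values_at returns nil for an index past the end of the row instead of raising, so a short line yields empty columns when joined.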

A couple of coding style suggestions:

1. You may want to make selector a constant, since it's unlikely that you'll want to mutate it further down in your code base.

2. The File.open, out.puts, and out.close calls can all be condensed into a CSV.open block:

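require 'csv'

CSV.open(filename, 'wb') do |csv|
  DATA.each_line do |line|
    # csv << expects an array of fields, one call per output row
    csv << line.chomp.split('|').values_at(*selector)
  end
end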

You can also specify a custom delimiter (pipe | in your case) with the col_sep option, like so:

...
  CSV.open(filename, 'wb', col_sep: '|') do |csv|
...
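Putting the suggestions together, a minimal self-contained sketch might look like the following. The constant SELECTOR and the helper write_selected are hypothetical names, and the sketch writes to a temporary file rather than the path from the question.

```ruby
require "csv"
require "tempfile"

# Frozen constant, since the column choice is not meant to change.
SELECTOR = [0, 3, 5, 7].freeze

# Hypothetical helper: read pipe-delimited lines from `input` (anything
# that responds to each_line) and write the selected columns to `path`.
def write_selected(input, path, selector = SELECTOR)
  CSV.open(path, "wb", col_sep: "|") do |csv|
    input.each_line do |line|
      # csv << takes one array of fields per output row
      csv << line.chomp.split("|").values_at(*selector)
    end
  end
end

# First two rows of the question's data.
sample = <<~ROWS
  20180704150930|rtsp|645645643|30193|211|KLM|KLM00SD624817.ts|172.30.16.34|127299264|VERB|01780000|21103|277|server01|OK
  20180704150931|api|456456546|30130|234|VC3|VC300179201139.ts|172.30.16.138|192271838|VERB|05540000|23404|414|server01|OK
ROWS

Tempfile.create("selected") do |tmp|
  write_selected(sample, tmp.path)
  puts File.read(tmp.path)
end
# prints:
# 20180704150930|30193|KLM|172.30.16.34
# 20180704150931|30130|VC3|172.30.16.138
```

Because CSV.open is given a block, the file is closed automatically when the block returns, so no explicit close call is needed.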

Upvotes: 5
