DOS

Reputation: 557

Ruby split pipes in regex

I have read the data from a file into an array and kept only the elements I care about. Basically, I want to access each column independently. The file will keep changing, so I don't want anything hard-coded; otherwise I would have done it already :). The elements look like this:

Element0: | data | address | type | source | disable |

Element1: | 0x000001 | 0x123456 | in | D | yes |

Element2: | 0x0d0f00 | 0xffffff | out | M | yes |

Element3: | 0xe00ab4 | 0xaefbd1 | in | E | no |

I tried splitting with the regexp /\|\s+.*\s+\|/, but it prints just a few lines (it removes the data I care about). I also tried /\|.*\|/, and it prints everything empty. I have read up on the split method and I understand this happens because the .* swallows the data I care about. I also tried the regexp \|\s*\|, but it prints the whole line. I have tried many regexps, but at this point I can't think of a way to solve this. Any recommendations?

`line_ary = ary_element.split(/\|\s.*\|/)
puts line_ary unless line_ary.nil?`

Upvotes: 1

Views: 519

Answers (3)

krock

Reputation: 29619

You should use the CSV class instead of trying to parse it with a regex. Something like this will do:

require 'csv'
data = CSV.read('data.csv', col_sep: '|')

You can access rows and columns as a two-dimensional array, e.g. to access row 2, column 4: data[1][3].

If for example you just wanted to print the address column for all rows you could do this instead:

CSV.foreach('data.csv', col_sep: '|') do |row|
    puts row[2]
end
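
One thing to watch for with this approach: the cells come back with the padding around the pipes still attached, and CSV returns nil for an empty field such as the one after a trailing pipe. A minimal sketch of trimming each cell, assuming the lines look like the ones in the question (parse_pipe_row is a hypothetical helper name, not part of the CSV library):

```ruby
require 'csv'

# Hypothetical helper: parse one pipe-delimited line and trim each cell.
# CSV yields nil for empty fields (for example after a trailing pipe),
# so to_s guards the strip call.
def parse_pipe_row(line)
  CSV.parse_line(line, col_sep: '|').map { |cell| cell.to_s.strip }
end

row = parse_pipe_row('Element1: | 0x000001 | 0x123456 | in | D | yes |')
puts row[2] # the address cell, without surrounding spaces
```

The same map call could be applied to each row inside the CSV.foreach block.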

Upvotes: 5

florian

Reputation: 36

Split together with strip might be the easiest option. Have you tried something like this?

"Element3:...".split(/\|/).collect(&:strip)

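To get at whole columns rather than rows, the split-and-strip rows can be transposed. This is only a sketch using the sample lines from the question; SAMPLE_LINES and split_cells are names made up for the illustration, and transpose assumes every line has the same number of cells:

```ruby
# Sample lines copied from the question.
SAMPLE_LINES = [
  'Element0: | data | address | type | source | disable |',
  'Element1: | 0x000001 | 0x123456 | in | D | yes |',
  'Element2: | 0x0d0f00 | 0xffffff | out | M | yes |',
  'Element3: | 0xe00ab4 | 0xaefbd1 | in | E | no |'
].freeze

# Split on the pipe, then trim the padding from each cell.
def split_cells(line)
  line.split(/\|/).collect(&:strip)
end

rows = SAMPLE_LINES.map { |line| split_cells(line) }
columns = rows.transpose # raises IndexError if the rows differ in length
p columns[2]
# => ["address", "0x123456", "0xffffff", "0xaefbd1"]
```

Index 0 of each row is the "ElementN:" label, so the address column ends up at index 2.
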
Upvotes: 0

mu is too short

Reputation: 434625

I'd probably use a CSV parser for this, but if you want to use a regex and you're sure that you'll never have | inside one of the column values, then you can split like this:

row = line.split(/\s*\|\s*/)

so that the whitespace on either side of the pipe becomes part of the delimiter. For example:

> 'Element0: |     data     | address  | type | source | disable |'.split(/\s*\|\s*/)
 => ["Element0:", "data", "address", "type", "source", "disable"] 
> 'Element1: |   0x000001   | 0x123456 |  in  |    D   |   yes   |'.split(/\s*\|\s*/)
 => ["Element1:", "0x000001", "0x123456", "in", "D", "yes"] 
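
Since the first line holds the column names, you could go one step further and look cells up by name instead of by index. A rough sketch, assuming the "ElementN:" label is part of each string as in the examples above and that the first line is always the header (rows_by_header is a hypothetical helper):

```ruby
# Hypothetical helper: turn the lines into an array of hashes keyed by header.
def rows_by_header(lines)
  # Split on pipes plus surrounding whitespace, then drop the "ElementN:" label.
  rows = lines.map { |line| line.split(/\s*\|\s*/).drop(1) }
  header, *records = rows
  records.map { |cells| header.zip(cells).to_h }
end

table = rows_by_header([
  'Element0: | data | address | type | source | disable |',
  'Element1: | 0x000001 | 0x123456 | in | D | yes |',
  'Element2: | 0x0d0f00 | 0xffffff | out | M | yes |',
  'Element3: | 0xe00ab4 | 0xaefbd1 | in | E | no |'
])

p table.map { |row| row['address'] }
# => ["0x123456", "0xffffff", "0xaefbd1"]
```

Nothing is hard-coded except the column name you ask for, so the lookup keeps working if columns are added or reordered.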

Upvotes: 1
