Reputation: 101476
How do I construct an array of different types given a comma-separated string and another array dictating the type?
By parsing CSV input taken from stdin, I have an array of column header Symbols:
cols = [:IndexSymbol, :PriceStatus, :UpdateExchange, :Last]
and a line of raw input:
raw = "$JX.T.CA,Open,T,933.36T 11:10:00.000"
I would like to construct an array, cells, from the raw input, where each element of cells is of the type identified by the corresponding element in cols. What are the idiomatic Ruby-ish ways of doing this?
I have tried this, which works but doesn't really feel right.
1) First, define a class for each type which needs to be encapsulated:
class Sku
attr_accessor :mRoot, :mExch, :mCountry
def initialize(root, exch, country)
@mRoot = root
@mExch = exch
@mCountry = country
end
end
class Price
attr_accessor :mPrice, :mExchange, :mTime
def initialize(price, exchange, time)
@mPrice = price
@mExchange = exchange
@mTime = time
end
end
2) Then, define conversion functions for each unique column type which needs to be converted:
def to_sku(raw)
raw.match('(\w+)\.(\w{0,1})\.(\w{,2})') { |m| Sku.new(m[1], m[2], m[3])}
end
def to_price(raw)
end
3) Create an array of strings from the input:
cells = raw.split(",")
4) And finally, modify each element of cells in place by constructing the type dictated by the corresponding column header:
cells.each_index do |i|
cells[i] = case cols[i]
when :IndexSymbol
to_sku(cells[i])
when :PriceStatus
cells[i].split(";").collect {|st| st.to_sym}
when :UpdateExchange
cells[i]
when :Last
cells[i].match('(\d*\.*\d*)(\w?) (\d{1,2}:\d{2}:\d{2}\.\d{3})') { |m| Price.new(m[1], m[2], m[3])}
else
puts "Unhandled column type (#{cols[i]}) from input string: \n#{cols}\n#{raw}"
exit -1
end
end
The parts that don't feel right are steps 3 and 4. How is this done in a more Ruby fashion? I was imagining some kind of super concise method like this, which exists only in my imagination:
cells = raw.split_using_convertor(",")
Upvotes: 2
Views: 134
Reputation: 31594
You could have the different types inherit from a base class and put the lookup knowledge in that base class. Then you could have each class know how to initialize itself from a raw string:
class Header
@@lookup = {}
def self.symbol(*syms)
syms.each{|sym| @@lookup[sym] = self}
end
def self.lookup(sym)
@@lookup[sym]
end
end
class Sku < Header
symbol :IndexSymbol
attr_accessor :mRoot, :mExch, :mCountry
def initialize(root, exch, country)
@mRoot = root
@mExch = exch
@mCountry = country
end
def to_s
"@#{mRoot}-#{mExch}-#{mCountry}"
end
def self.from_raw(str)
str.match('(\w+)\.(\w{0,1})\.(\w{,2})') { |m| new(m[1], m[2], m[3])}
end
end
class Price < Header
symbol :Last, :Bid
attr_accessor :mPrice, :mExchange, :mTime
def initialize(price, exchange, time)
@mPrice = price
@mExchange = exchange
@mTime = Time.new(time)
end
def to_s
"$#{mPrice}-#{mExchange}-#{mTime}"
end
def self.from_raw(raw)
raw.match('(\d*\.*\d*)(\w?) (\d{1,2}:\d{2}:\d{2}\.\d{3})') { |m| new(m[1], m[2], m[3])}
end
end
class SymbolList < Header
symbol :PriceStatus
attr_accessor :mSymbols
def initialize(symbols)
@mSymbols = symbols
end
def self.from_raw(str)
new(str.split(";").map(&:to_sym))
end
def to_s
mSymbols.to_s
end
end
class ExchangeIdentifier < Header
symbol :UpdateExchange
attr_accessor :mExch
def initialize(exch)
@mExch = exch
end
def self.from_raw(raw)
new(raw)
end
def to_s
mExch
end
end
Then you can replace step #4 like so (CSV parsing not included):
cells.each_index.map do |i|
Header.lookup(cols[i]).from_raw(cells[i])
end
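As a minimal, self-contained sketch of how that registry behaves, the following registers one class and decodes a single cell. The Exchange class is a hypothetical stand-in for ExchangeIdentifier, and the explicit check for an unregistered column is an addition, since Header.lookup returns nil for an unknown symbol:

```ruby
# Base class holding a registry from column symbol to decoding class.
class Header
  @@lookup = {}

  # Called in a subclass body to register that subclass for each symbol.
  def self.symbol(*syms)
    syms.each { |sym| @@lookup[sym] = self }
  end

  def self.lookup(sym)
    @@lookup[sym]
  end
end

# Hypothetical minimal subclass, standing in for ExchangeIdentifier.
class Exchange < Header
  symbol :UpdateExchange
  attr_reader :code

  def initialize(code)
    @code = code
  end

  def self.from_raw(raw)
    new(raw)
  end
end

cols  = [:UpdateExchange]
cells = ["T"]

decoded = cells.zip(cols).map do |cell, col|
  klass = Header.lookup(col)
  # Fail loudly instead of calling from_raw on nil.
  raise ArgumentError, "Unhandled column type: #{col}" if klass.nil?
  klass.from_raw(cell)
end
```

Because @@lookup is a class variable on Header, every subclass writes into the same hash, which is what lets Header.lookup find any registered type.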
Upvotes: 2
Reputation: 101476
@AbeVoelker's answer steered me in the right direction, but I had to make a pretty major change because of something I failed to mention in the OP.
Some of the cells will be of the same type, but will still have different semantics. Those semantic differences don't come into play here (and aren't elaborated on), but they do in the larger context of the tool I'm writing.
For example, there will be several cells that are of type Price; some of them are :Last, :Bid, and :Ask. They are all the same type (Price), but they are still different enough that there can't be a single entry in Header's @@lookup for all Price columns.
So what I actually did was write a self-decoding class (credit to Abe for this key part) for each type of cell:
class Sku
attr_accessor :mRoot, :mExch, :mCountry
def initialize(root, exch, country)
@mRoot = root
@mExch = exch
@mCountry = country
end
def to_s
"@#{mRoot}-#{mExch}-#{mCountry}"
end
def self.from_raw(str)
str.match('(\w+)\.(\w{0,1})\.(\w{,2})') { |m| new(m[1], m[2], m[3])}
end
end
class Price
attr_accessor :mPrice, :mExchange, :mTime
def initialize(price, exchange, time)
@mPrice = price
@mExchange = exchange
@mTime = Time.new(time)
end
def to_s
"$#{mPrice}-#{mExchange}-#{mTime}"
end
def self.from_raw(raw)
raw.match('(\d*\.*\d*)(\w?) (\d{1,2}:\d{2}:\d{2}\.\d{3})') { |m| new(m[1], m[2], m[3])}
end
end
class SymbolList
attr_accessor :mSymbols
def initialize(symbols)
@mSymbols = symbols
end
def self.from_raw(str)
new(str.split(";").collect {|s| s.to_sym})
end
def to_s
mSymbols.to_s
end
end
class ExchangeIdentifier
attr_accessor :mExch
def initialize(exch)
@mExch = exch
end
def self.from_raw(raw)
new(raw)
end
def to_s
mExch
end
end
...Create a typelist, mapping each column identifier to the type:
ColumnTypes =
{
:IndexSymbol => Sku,
:PriceStatus => SymbolList,
:UpdateExchange => ExchangeIdentifier,
:Last => Price,
:Bid => Price
}
...and finally construct my Array of cells by calling the appropriate type's from_raw:
cells = raw.split(",").each_with_index.collect { |cell,i|
puts "Cell: #{cell}, ColType: #{ColumnTypes[cols[i]]}"
ColumnTypes[cols[i]].from_raw(cell)
}
The result is code that is clean and expressive in my eyes, and seems more Ruby-ish than what I had originally done.
Complete example here.
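A small variation on the hash-based dispatch, sketched here with hypothetical stand-in classes (Statuses and Exchange are not the classes above, and decode_cells is an illustrative helper name): pairing cells with columns via zip avoids the index, and Hash#fetch raises a KeyError for a column that has no registered type instead of failing later on nil.

```ruby
# Hypothetical stand-ins for the cell classes; each one only needs to
# respond to from_raw.
Statuses = Struct.new(:symbols) do
  def self.from_raw(raw)
    new(raw.split(";").map(&:to_sym))
  end
end

Exchange = Struct.new(:code) do
  def self.from_raw(raw)
    new(raw)
  end
end

COLUMN_TYPES = {
  :PriceStatus    => Statuses,
  :UpdateExchange => Exchange
}

# Illustrative helper: split a raw line and decode each cell by column.
def decode_cells(raw, cols)
  raw.split(",").zip(cols).map do |cell, col|
    # Hash#fetch raises KeyError when the column has no registered type.
    COLUMN_TYPES.fetch(col).from_raw(cell)
  end
end

cells = decode_cells("Open;Halted,T", [:PriceStatus, :UpdateExchange])
```

The plain split(",") is kept only to mirror the code above; it does not handle quoted fields.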
Upvotes: 1
Reputation: 79783
Ruby’s CSV library includes support for this sort of thing directly (as well as better handling of the actual parsing), although the docs are a bit awkward.
You need to provide a proc that will do your conversions for you, and pass it as an option to CSV.parse:
converter = proc do |field, info|
case info.header.strip # in case you have spaces after your commas
when "IndexSymbol"
field.match('(\w+)\.(\w{0,1})\.(\w{,2})') { |m| Sku.new(m[1], m[2], m[3])}
when "PriceStatus"
field.split(";").collect {|st| st.to_sym}
when "UpdateExchange"
field
when "Last"
field.match('(\d*\.*\d*)(\w?) (\d{1,2}:\d{2}:\d{2}\.\d{3})') { |m| Price.new(m[1], m[2], m[3])}
end
end
Then you can parse it almost directly into the format you want:
c = CSV.parse(s, :headers => true, :converters => converter).by_row!.map do |row|
row.map { |_, field| field } # we only want the field now, not the header
end
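For a version that runs on its own, here is a rough sketch under a few assumptions: the input string includes a header row, Sku is a hypothetical Struct standing in for the class from the question, and only two of the columns get special handling. A converter that takes two arguments receives the field and a field-info object whose header method returns the column's header.

```ruby
require "csv"

# Hypothetical stand-in for the Sku class from the question.
Sku = Struct.new(:root, :exch, :country)

converter = proc do |field, info|
  case info.header
  when "IndexSymbol"
    m = field.match(/(\w+)\.(\w?)\.(\w{0,2})/)
    # Leave the field untouched if it does not look like a SKU.
    m ? Sku.new(m[1], m[2], m[3]) : field
  when "PriceStatus"
    field.split(";").map(&:to_sym)
  else
    field
  end
end

# Assumed sample input: a header row followed by one data row.
input = <<~'INPUT'
  IndexSymbol,PriceStatus,UpdateExchange
  $JX.T.CA,Open;Halted,T
INPUT

table = CSV.parse(input, headers: true, converters: [converter])

# Each CSV::Row exposes its converted values through #fields.
rows = table.map(&:fields)
```

Here rows is an array with one inner array per data line, holding the converted cells in column order.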
Upvotes: 1
Reputation: 30408
You can make the fourth step simpler with #zip, #map, and destructuring assignment:
cells = cells.zip(cols).map do |cell, col|
case col
when :IndexSymbol
to_sku(cell)
when :PriceStatus
cell.split(";").collect {|st| st.to_sym}
when :UpdateExchange
cell
when :Last
cell.match('(\d*\.*\d*)(\w?) (\d{1,2}:\d{2}:\d{2}\.\d{3})') { |m| Price.new(m[1], m[2], m[3])}
else
puts "Unhandled column type (#{col}) from input string: \n#{cols}\n#{raw}"
exit -1
end
end
I wouldn’t recommend combining that step with the splitting, because parsing a line of CSV is complicated enough to be its own step. See my comment for how to parse the CSV.
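The comment referred to is not reproduced on this page. One option from Ruby's standard library, offered as an assumption about what was meant rather than a quote, is CSV.parse_line, which splits a single line while respecting quoted fields. The sample line below is made up for illustration:

```ruby
require "csv"

# Made-up sample line with a quoted field that contains a comma.
line = 'a,"b,c",d'

# Naive splitting breaks the quoted field into two pieces.
naive = line.split(",")

# CSV.parse_line treats the quoted field as a single value.
fields = CSV.parse_line(line)
```

With this input, naive has four elements while fields has three, so the zip against cols only lines up in the second case.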
Upvotes: 2