Reputation: 719
By default, the #split method works as follows:
"id,name,title(first_name,last_name)".split(",")
This gives the following output:
["id", "name", "title(first_name", "last_name)"]
But I want something like the following:
["id", "name", "title(first_name,last_name)"]
So I use the following regex (from this answer) with split to get the desired output:
"id,name,title(first_name,last_name)".split(/,(?![^(]*\))/)
But when I use another string, which is my actual input, the logic fails. My actual string is:
"id,name,title(first_name,last_name,address(street,pincode(id,code)))"
and it gives the following output:
["id", "name", "title(first_name", "last_name", "address(street", "pincode(id,code)))"]
rather than
["id", "name", "title(first_name,last_name,address(street,pincode(id,code)))"]
Upvotes: 4
Views: 866
Reputation: 110675
def doit(str)
  split_here = 0.chr  # placeholder character assumed not to occur in str
  stack = 0           # current parenthesis nesting depth
  s = str.gsub(/./) do |c|
    ret = c
    case c
    when '('
      stack += 1
    when ','
      ret = split_here if stack.zero?  # mark only top-level commas
    when ')'
      raise(RuntimeError, "parens are unbalanced") if stack.zero?
      stack -= 1
    end
    ret
  end
  raise(RuntimeError, "parens are unbalanced, stack at end=#{stack}") if stack > 0
  s.split(split_here)
end
doit "id,name,title(first_name,last_name)"
#=> ["id", "name", "title(first_name,last_name)"]
doit "id,name,title(first_name,last_name,address(street,pincode(id,code)))"
#=> ["id", "name", "title(first_name,last_name,address(street,pincode(id,code)))"]
doit "a,b(c(d),e,f)"
#=> ["a", "b(c(d),e,f)"]
doit "id,name,title(first_name,last_name),pub(name,address)"
#=> ["id", "name", "title(first_name,last_name)", "pub(name,address)"]
doit "a,b(c)d),e,f)"
#=> RuntimeError: parens are unbalanced
doit "a,b(c(d),e),f("
#=> RuntimeError: parens are unbalanced, stack at end=1
A comma is split upon if and only if stack is zero when it is encountered. A comma that is to be split upon is changed to a character (split_here) that is not in the string (I used 0.chr). The string is then split on split_here.
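The same depth-counting idea can also be sketched without a placeholder character, which avoids having to pick a character that never occurs in the input. This is only an illustrative variant: the name split_top_level is hypothetical, and raising ArgumentError rather than RuntimeError is an arbitrary choice.

```ruby
# Hypothetical helper (not part of the answer's code): splits on commas that
# sit outside every pair of parentheses, building the tokens directly.
def split_top_level(str)
  tokens = [String.new]  # tokens built so far; the last one is still open
  depth  = 0             # number of currently open parentheses
  str.each_char do |c|
    case c
    when '('
      depth += 1
    when ')'
      raise ArgumentError, 'parens are unbalanced' if depth.zero?
      depth -= 1
    end
    if c == ',' && depth.zero?
      tokens << String.new  # top-level comma: start a new token
    else
      tokens.last << c      # any other character extends the current token
    end
  end
  raise ArgumentError, "parens are unbalanced, depth at end=#{depth}" if depth > 0
  tokens
end

p split_top_level("a,b(c(d),e,f)")
#=> ["a", "b(c(d),e,f)"]
```

Unlike String#split, this sketch keeps trailing empty fields, so "a," yields ["a", ""].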
Upvotes: 3
Reputation: 19
This could be one approach:
"id,name,title(first_name,last_name)".split(",")[0..1] << "id,name,title(first_name,last_name)".split(",")[-2..-1].join(",")
This splits the string twice, takes the first two elements of the first split, and appends the last two elements of the second split, joined back together with a comma. It gives the desired result only in this specific scenario, where the parenthesized field comes last and contains exactly one comma.
Upvotes: -1
Reputation: 3454
Updated Answer
Since the earlier answer didn't take care of all the cases as rightly pointed out in the comments, I'm updating the answer with another solution.
This approach replaces each valid (top-level) comma with a separator, |, and then splits the string on that separator using String#split.
class TokenArrayParser
  SPLIT_CHAR = '|'.freeze

  def initialize(str)
    @str = str
  end

  def parse
    separate_on_valid_comma.split(SPLIT_CHAR)
  end

  private

  def separate_on_valid_comma
    dup = @str.dup
    paren_count = 0
    dup.length.times do |idx|
      case dup[idx]
      when '(' then paren_count += 1
      when ')' then paren_count -= 1
      when ',' then dup[idx] = SPLIT_CHAR if paren_count.zero?
      end
    end
    dup
  end
end

%w(
  id,name,title(first_name,last_name)
  id,name,title(first_name,last_name,address(street,pincode(id,code)))
  first_name,last_name,address(street,pincode(id,code)),city(name)
  a,b(c(d),e,f)
  id,name,title(first_name,last_name),pub(name,address)
).each { |str| puts TokenArrayParser.new(str).parse.inspect }
# =>
# ["id", "name", "title(first_name,last_name)"]
# ["id", "name", "title(first_name,last_name,address(street,pincode(id,code)))"]
# ["first_name", "last_name", "address(street,pincode(id,code))", "city(name)"]
# ["a", "b(c(d),e,f)"]
# ["id", "name", "title(first_name,last_name)", "pub(name,address)"]
I'm sure this can be optimized more.
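One possible direction, offered only as a sketch: Ruby's regex engine supports recursive subexpression calls (\g<name>), so balanced parentheses can be matched directly and the tokens collected with String#scan, with no separator character at all. The TOKEN pattern and the scan_tokens name below are my own assumptions, not part of the class above.

```ruby
# Assumed pattern (not from the answer above): a token is a run of characters
# that are neither commas nor parentheses, mixed with balanced (...) groups.
# \g<nested> re-enters the named group, which is what allows arbitrary depth.
TOKEN = /(?:[^,()]|(?<nested>\((?:[^()]|\g<nested>)*\)))+/

# Hypothetical helper: collects every full match of TOKEN in order.
def scan_tokens(str)
  tokens = []
  # With a capture group present, scan yields the group rather than the whole
  # match, so the whole match is read from Regexp.last_match instead.
  str.scan(TOKEN) { tokens << Regexp.last_match(0) }
  tokens
end

p scan_tokens("a,b(c(d),e,f)")
#=> ["a", "b(c(d),e,f)"]
```

Two caveats with this sketch: empty fields (as in "a,,b") are skipped rather than returned as empty strings, and unbalanced input is not reported as an error; it is silently split into fragments.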
Upvotes: 3