user6731064

Reputation: 331

Find nth occurrence of variable regex in Ruby?

I am writing a method that does what the title says: it needs to find the index of the nth occurrence of a particular left bracket chosen by the user. For example, if the user provides a string along with the additional parameters '{' and 5, the method should find the 5th occurrence of '{'; the same goes for '(' and '['.

I am currently doing it with a while loop that compares each character, but this looks ugly and isn't very interesting. Is there a way to do this with a regex? Can you use a variable in a regex?

def _find_bracket_n(str, left_brac, brackets_num)
  i = 0
  num_of_left_bracs = 0
  while i < str.length && num_of_left_bracs < brackets_num
    num_of_left_bracs += 1 if str[i] == left_brac
    i += 1
  end
  # nil when the string has fewer than brackets_num matching brackets
  num_of_left_bracs == brackets_num ? i - 1 : nil
end

Upvotes: 1

Views: 1398

Answers (1)

Cary Swoveland

Reputation: 110685

The offset of the nth instance of a given character in a string is wanted, or nil if the string contains fewer than n instances of that character. I will give four solutions.

chr = "("
str = "a(b(cd((ef(g(hi(" 
n = 5

Use Enumerable#find_index

count = 0  # separate counter, so that n keeps its value for the code that follows
str.each_char.find_index { |c| c == chr && (count += 1) == n }
  #=> 10
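Wrapped in a method with its own counter, the same idea neither depends on nor modifies any outer variable. This is a minimal sketch; the method name nth_index_by_scan is hypothetical and chosen only for illustration.

```ruby
# Hypothetical helper: offset of the nth occurrence of chr in str, or nil
# if str contains fewer than n occurrences.
def nth_index_by_scan(str, chr, n)
  seen = 0
  # find_index returns the index of the first element for which the block
  # is truthy, i.e. the character that brings the running count up to n.
  str.each_char.find_index { |c| c == chr && (seen += 1) == n }
end

nth_index_by_scan("a(b(cd((ef(g(hi(", "(", 5)   #=> 10
nth_index_by_scan("a(b(cd((ef(g(hi(", "(", 20)  #=> nil
```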

Use a regular expression

chr_esc = Regexp.escape(chr)
  #=> "\\("

r = /
    \A           # match the beginning of the string
    (?:          # begin a non-capture group
      .*?        # match zero or more characters lazily
      #{chr_esc} # match the given character
    )            # end the non-capture group
    {#{n-1}}     # perform the non-capture group `n-1` times
    .*?          # match zero or more characters lazily
    #{chr_esc}   # match the given character
    /x           # free-spacing regex definition mode
#=> /
    \A           # match the beginning of the string
    (?:          # begin a non-capture group
      .*?        # match zero or more characters lazily
      \(         # match the given character
    )            # end the non-capture group
    {4}          # perform the non-capture group `n-1` times
    .*?          # match zero or more characters lazily
    \(           # match the given character
    /x

str =~ r
  #=> 0
$~.end(0)-1
  #=> 10

For the last line we could instead write

Regexp.last_match.end(0)-1

See Regexp::escape, Regexp::last_match and MatchData#end.

The regex is conventionally written (i.e., not in free-spacing mode) as follows.

/\A(?:.*?#{chr_esc}){#{n-1}}.*?#{chr_esc}/
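To answer the "can you use a variable in a regex" part directly: the #{...} interpolation above is what does it. Below is a sketch of the same regex packaged as a method that returns nil when there is no match instead of calling end on nil. The method name nth_index_by_regex is hypothetical, the bracket is assumed to be a single character, and the /m flag is my addition so that . also matches newlines.

```ruby
# Hypothetical helper: offset of the nth occurrence of the single
# character chr in str, or nil if there are fewer than n occurrences.
def nth_index_by_regex(str, chr, n)
  return nil if n < 1
  chr_esc = Regexp.escape(chr)  # "(" becomes "\\(", "[" becomes "\\[", etc.
  # Same pattern as above; /m (an addition) lets . match newlines as well.
  r = /\A(?:.*?#{chr_esc}){#{n-1}}.*?#{chr_esc}/m
  m = r.match(str)
  # The match ends just past the nth chr, so step back one character.
  m && m.end(0) - 1
end

nth_index_by_regex("a(b(cd((ef(g(hi(", "(", 5)   #=> 10
nth_index_by_regex("a(b(cd((ef(g(hi(", "(", 20)  #=> nil
```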

Convert characters to offsets, remove the offsets of non-matching characters and return the nth of those that remain

str.size.times.select { |i| str[i] == chr }[n-1]
  #=> 10
n = 20
str.size.times.select { |i| str[i] == chr }[n-1]
  #=> nil

Use String#index repeatedly to decapitate substrings

n = 5
s = str.dup
offset = n.times.reduce(0) do |off,_|
  i = s.index(chr)
  break nil if i.nil?   # fewer than n instances: reduce returns nil
  s = s[i+1..-1]
  off + i + 1
end
offset && offset - 1
  #=> 10
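The same repeated-String#index idea can skip building substrings by passing a start offset as the second argument to index. A minimal sketch, with the hypothetical method name nth_index_by_index:

```ruby
# Hypothetical helper: offset of the nth occurrence of chr in str, or nil
# if str contains fewer than n occurrences.
def nth_index_by_index(str, chr, n)
  pos = -1
  n.times do
    # Resume the search just past the previous hit.
    pos = str.index(chr, pos + 1)
    return nil if pos.nil?
  end
  pos < 0 ? nil : pos  # n < 1 never searches, so report nil
end

nth_index_by_index("a(b(cd((ef(g(hi(", "(", 5)   #=> 10
nth_index_by_index("a(b(cd((ef(g(hi(", "(", 20)  #=> nil
```

Since chr is passed to index as a String rather than a Regexp, no escaping is needed here.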

Upvotes: 6
