Henley Wing Chiu

Reputation: 22515

Regex to get everything before and after last occurrence of a number

I'm trying to construct a regex that will split before and after the last occurrence of a number. I expect to get:

"index100.html"         # => ["index", "100", ".html"]
"page.php?id=100"       # => ["page.php?id=", "100", ""]
"page.php?f=5&page=295" # => ["page.php?f=5&page=", "295", ""]

Here is the regex I came up with:

regex = /([^0-9]+|^)(\d+?)([^0-9]+|$)/

It works for the first two examples, but not for the last one. I get the result:

["page.php?f=", "5", "&page="]

How can I modify the regex to make it work for the third case?

Upvotes: 0

Views: 370

Answers (4)

Cary Swoveland

Reputation: 110685

def split_it(str)
  str.reverse.partition(/\d+/).reverse.map(&:reverse)
end

split_it "index100.html"
  #=> ["index", "100", ".html"]
split_it "page.php?id=100"
  #=> ["page.php?id=", "100", ""]
split_it "page.php?f=5&page=295"
  #=> ["page.php?f=5&page=", "295", ""]

The steps for

str = "page.php?f=5&page=295"

are as follows:

s = str.reverse
  #=> "592=egap&5=f?php.egap" 
a = s.partition(/\d+/)
  #=> ["", "592", "=egap&5=f?php.egap"] 
b = a.reverse
  #=> ["=egap&5=f?php.egap", "592", ""] 
b.map(&:reverse)
  #=> ["page.php?f=5&page=", "295", ""] 

Upvotes: 1

user557597

Reputation:

Another way, without a lookbehind:

((?:\d*\D)*)(\d+)(.*)

And another without a lookbehind. It is just as fast as the lookbehind version, and it works better if your engine doesn't support lookbehind, as is the case in some JS engines:

(.*(?:\D|^))(\d+)(.*)
Upvotes: 3

sawa

Reputation: 168101

"index100.html"
.partition(/\d+(?=\D*\z)/) # => ["index", "100", ".html"]

"page.php?id=100"
.partition(/\d+(?=\D*\z)/) # => ["page.php?id=", "100", ""]

"page.php?f=5&page=295"
.partition(/\d+(?=\D*\z)/) # => ["page.php?f=5&page=", "295", ""]

Upvotes: 1

Wiktor Stribiżew

Reputation: 626893

You may leverage a .* greedy matching, but curb it with a negative lookbehind (?<!\d) to make sure you match the whole last chunk of digits:

/(.*)(?<!\d)(\d+)(.*)/
 ^^^^^^^^^^^      

Optionally, you may add \A and \z anchors at the start and end.

Details:

  • (.*) - 0 or more characters other than a newline, as many as possible, matching up to the last run of digits
  • (?<!\d)(\d+) - 1+ digits that are NOT preceded with a digit
  • (.*) - the rest of the line.

To match across newlines, add the m modifier after the last regex delimiter.
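
A short Ruby sketch of the above, showing what the lookbehind changes and what the m modifier does. The constant and method names are invented for this illustration; only the regex comes from the answer.

```ruby
# Hypothetical names; the lookbehind pattern is the one from the answer.
LAST_NUMBER           = /(.*)(?<!\d)(\d+)(.*)/
LAST_NUMBER_MULTILINE = /(.*)(?<!\d)(\d+)(.*)/m   # with m, "." also matches newlines
WITHOUT_LOOKBEHIND    = /(.*)(\d+)(.*)/           # for comparison only

# Returns [before, digits, after], or nil when there are no digits.
def split_on_last_number(str, pattern = LAST_NUMBER)
  match = pattern.match(str)
  match ? match.captures : nil
end

p split_on_last_number("page.php?f=5&page=295")
#=> ["page.php?f=5&page=", "295", ""]

# Without the lookbehind, the greedy (.*) leaves only one digit for (\d+):
p split_on_last_number("page.php?f=5&page=295", WITHOUT_LOOKBEHIND)
#=> ["page.php?f=5&page=29", "5", ""]

# With the m modifier, the last number of the whole string is found:
p split_on_last_number("a1\nb22\nc", LAST_NUMBER_MULTILINE)
#=> ["a1\nb", "22", "\nc"]
```

Without the m modifier, the same multi-line input matches within its first line only, because `.` stops at the newline.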

Upvotes: 3
