PheurtoSkuerto

Reputation: 21

How to make Regex lookahead to match one and two digit numbers?

Let's say for example that I have a string reading "1this12string". I would like to use String#split with regex using lookahead that will give me ["1this", "12string"].

My current statement is (?=\d), which works very well for single digit numbers. I am having trouble modifying this statement to include both 1 and 2 digit numbers.
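
To make the problem concrete, here is a minimal sketch of what the current pattern does with the sample string. The class and method names are made up for illustration, and the output assumes Java 8 or later, where a zero-width match at the start of the input does not produce a leading empty string.

```java
import java.util.Arrays;

class SplitOnEveryDigit {
    // Splits before every digit, so a two-digit number is cut in half.
    static String[] splitBeforeEveryDigit(String input) {
        return input.split("(?=\\d)");
    }

    public static void main(String[] args) {
        // On Java 8 or later this prints [1this, 1, 2string]
        System.out.println(Arrays.toString(splitBeforeEveryDigit("1this12string")));
    }
}
```

The unwanted split between "1" and "2" is what the lookahead alone cannot prevent.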

Upvotes: 1

Views: 1378

Answers (3)

Ryszard Czech

Reputation: 18611

Use

String[] splits = string.split("(?<=\\D)(?=\\d)");

See regex proof

Explanation

--------------------------------------------------------------------------------
  (?<=                     look behind to see if there is:
--------------------------------------------------------------------------------
    \D                       non-digits (all but 0-9)
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    \d                       digits (0-9)
--------------------------------------------------------------------------------
  )                        end of look-ahead
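
As a quick check of the explanation above, the following sketch applies the pattern to the string from the question. The class and method names are hypothetical and only serve the example.

```java
import java.util.Arrays;

class SplitBeforeNumbers {
    // Splits only where a non-digit is on the left and a digit is on the right,
    // so the position between the "1" and "2" of "12" is never a split point.
    static String[] splitBeforeNumbers(String input) {
        return input.split("(?<=\\D)(?=\\d)");
    }

    public static void main(String[] args) {
        // prints [1this, 12string]
        System.out.println(Arrays.toString(splitBeforeNumbers("1this12string")));
    }
}
```

Because the look-behind needs an actual non-digit character, the position before the leading "1" is not a split point either.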

Upvotes: 0

SeaBean

Reputation: 23217

If you really want to use Regex Lookahead, try this:

(\d{1,2}[^\d]*)(?=\d|\b)

Regex Demo

Note that this assumes every piece starts with 1 or 2 digits. If that is not the case, please let us know so that we can refine it.

Regex Logic

  • \d{1,2} to match 1 or 2 digits at the front
  • [^\d]* to match non-digit characters following the first 1 or 2 digit(s)
  • Enclose the above 2 segments in parentheses () to make them a capturing group for extracting the matched text
  • (?=\d to fulfill your requirement to use Regex Lookahead: the match must be followed by a digit
  • |\b (still inside the lookahead, which the final ) closes) to also allow the match to end at a word boundary, such as the end of the text
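
This pattern describes the pieces themselves rather than the positions between them, so it is meant for matching, not for String#split. The answer does not say how to apply it; the sketch below assumes java.util.regex.Pattern and Matcher, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractNumberedPieces {
    // 1 or 2 digits, then any non-digits, followed by a digit or a word boundary.
    private static final Pattern PIECE =
            Pattern.compile("(\\d{1,2}[^\\d]*)(?=\\d|\\b)");

    // Collects capturing group 1 of every successive match.
    static List<String> extract(String input) {
        List<String> pieces = new ArrayList<>();
        Matcher matcher = PIECE.matcher(input);
        while (matcher.find()) {
            pieces.add(matcher.group(1));
        }
        return pieces;
    }

    public static void main(String[] args) {
        // prints [1this, 12string]
        System.out.println(extract("1this12string"));
    }
}
```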

I think you can also achieve your task with a simpler regex that does not need a relatively sophisticated feature like lookahead. For example:

\d{1,2}[^\d]*

You can see in the Regex Demo that this works equally well for your sample input. If your requirement goes beyond this, please let us know so that we can fine-tune it.

Upvotes: 1

Bohemian

Reputation: 425053

Add a look behind so you don't split within numbers:

(?<!\d)(?=\d)

See live demo.
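
A minimal sketch of this pattern used with String#split follows. The class and method names are hypothetical, and the result assumes Java 8 or later: the pattern also matches at position 0, and from Java 8 on a zero-width match at the start of the input does not produce a leading empty string.

```java
import java.util.Arrays;

class SplitAtNumberStart {
    // Splits before a digit only when the previous character is not a digit,
    // that is, at the start of each number and never inside one.
    static String[] splitAtNumberStart(String input) {
        return input.split("(?<!\\d)(?=\\d)");
    }

    public static void main(String[] args) {
        // On Java 8 or later this prints [1this, 12string]
        System.out.println(Arrays.toString(splitAtNumberStart("1this12string")));
    }
}
```

Since the negative look-behind only rules out a preceding digit, numbers of any length stay intact, not just 1 and 2 digit ones.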

Upvotes: 1
