Parag

Reputation: 963

Correct Regular Expression match containing optional substrings

I have the following set of strings:

some_param[name] 
some_param_0[name]

I wish to capture some_param, 0, name from them. My regex knowledge is pretty weak. I tried the following, but it doesn't work for both cases.

/^(\D+)_?(\d{0,2})\[?(.*?)\]?$/.exec("some_param_0[name]") //works except for the trailing underscore on "some_param"
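To make the failure concrete, here is a small sketch that runs the attempted pattern against both inputs. The names `attempt` and `capture` are made up for illustration; only the pattern and the two strings come from the question.

```javascript
// The pattern from the question, unchanged.
const attempt = /^(\D+)_?(\d{0,2})\[?(.*?)\]?$/;

// Hypothetical helper: return only the capture groups, dropping the full match.
function capture(re, text) {
  const m = re.exec(text);
  return m === null ? null : m.slice(1);
}

// \D+ is greedy and also matches "_", so it keeps the trailing underscore.
console.log(capture(attempt, "some_param_0[name]")); // [ 'some_param_', '0', 'name' ]

// With no digit to stop it, \D+ consumes the whole string, brackets included.
console.log(capture(attempt, "some_param[name]"));   // [ 'some_param[name]', '', '' ]
```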

What would be the correct regex?

Upvotes: 0

Views: 1155

Answers (3)

Mike Samuel

Reputation: 120506

/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/

(\w+?) uses a non-greedy quantifier to capture the identifier part without any trailing _.

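_? is greedy, so it claims the underscore in front of the digits. Because \w also matches _, the non-greedy +? in the previous part stops as soon as the rest of the pattern can match, which leaves that underscore out of group 1.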

(\d{0,2}) will capture 0-2 digits. It is greedy, so even if there is no _ between the identifier and digits, this will capture digits.

(?:...)? makes the square bracketed section optional.

\[([^\[\]]*)\] captures the contents of a square bracketed section that does not itself contain square brackets.

'some_param_0[name]'.match(/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/)

produces an array like:

["some_param_0[name]",  // The matched content in group 0.
 "some_param",          // The portion before the digits in group 1.
 "0",                   // The digits in group 2.
 "name"]                // The contents of the [...] in group 3.
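As a quick check against both strings from the question, the sketch below wraps the pattern in a helper. The names `pattern` and `parse` and the property names `param`, `index`, and `key` are illustrative choices, not part of the question or the answer.

```javascript
// Pattern from the answer above, with the optional underscore (_?).
const pattern = /^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/;

// Hypothetical helper that gives the three captures readable names.
function parse(text) {
  const m = pattern.exec(text);
  if (m === null) return null;
  const [, param, index, key] = m; // skip m[0], the full match
  return { param, index, key };
}

console.log(parse("some_param_0[name]")); // { param: 'some_param', index: '0', key: 'name' }
console.log(parse("some_param[name]"));   // { param: 'some_param', index: '', key: 'name' }
```

When there are no digits, group 2 is the empty string rather than undefined, because \d{0,2} still takes part in the match with zero digits.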

Note that \w also matches digits, so when more than two digits follow the identifier, the non-greedy \w+? absorbs the leading ones and \d{0,2} captures only the last two.

'x1234[y]'.match(/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/)

yields

["x1234[y]","x12","34","y"]
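If that split is unwanted, one possible variation (a sketch, not part of the answer above) is to capture digits only when an underscore precedes them. The names `strict` and `parseStrict` are hypothetical, and the assumption is that an index always follows an underscore, as in the question's examples.

```javascript
// Variation: the underscore and the 1-2 digits are optional only as a pair.
const strict = /^(\w+?)(?:_(\d{1,2}))?(?:\[([^\[\]]*)\])?$/;

// Hypothetical helper: return only the capture groups.
function parseStrict(text) {
  const m = strict.exec(text);
  return m === null ? null : m.slice(1);
}

console.log(parseStrict("some_param_0[name]")); // [ 'some_param', '0', 'name' ]
console.log(parseStrict("some_param[name]"));   // [ 'some_param', undefined, 'name' ]
console.log(parseStrict("x1234[y]"));           // [ 'x1234', undefined, 'y' ]
```

The trade-off is that group 2 is now undefined instead of an empty string when no index is present, and digits with no preceding underscore stay in group 1.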

Upvotes: 3

redShadow

Reputation: 6777

Got it! (taking from Mike's answer):

/^(\D+)(?:_(\d+))?(?:\[([^\]]*)\])/

'some_param[name]' => ('some_param', None, 'name')
'some_param_0[name]' => ('some_param', '0', 'name')

(At least, it works in Python.)

UPDATE: A small extra I came up with while fiddling with it, which makes the result cleaner by using named groups:

^(?P<param>\D+)(?:_(?P<id>\d+))?(?:\[(?P<key>[^\]]*)\])
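Since the question is about JavaScript, here is a sketch of the same named-group pattern in JavaScript syntax, which writes named groups as (?<name>...) rather than Python's (?P<name>...). It assumes an engine with ES2018 named capture groups; the variable names are illustrative.

```javascript
// JavaScript spelling of the named-group pattern above.
const named = /^(?<param>\D+)(?:_(?<id>\d+))?(?:\[(?<key>[^\]]*)\])/;

const withId = named.exec("some_param_0[name]").groups;
console.log(withId.param, withId.id, withId.key); // some_param 0 name

// An optional group that did not take part in the match is undefined
// (the counterpart of None in the Python output above).
const withoutId = named.exec("some_param[name]").groups;
console.log(withoutId.id); // undefined
```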

Upvotes: 1

Mohamed Magdy

Reputation: 47

Please check the following regexp: "(\w+)_(\d)\[(\w+)\]". You can test it at http://rubular.com/

Upvotes: 0
