Reputation: 1200
I'm not good enough with RegEx yet. I've been searching around and trying to write my own, but haven't succeeded. I want to search through a string like this:
Shelf-15-Contains(Item)10-Depo91
I want to search for (), which can be done with:
/\(([^()]+)\)/g
When the RegEx finds (), I want to grab the 'stuff' that comes right before the (), the () and everything inside, and then whatever follows directly afterwards. So the result should be:
Contains(Item)10
EDIT: Also, the RegEx I have above only matches a () pair with no () nested inside it. So once I figure out how to match what comes before and after, should I be able to run it repeatedly to handle multiple nested layers?
Upvotes: 0
Views: 62
Reputation: 626728
IMHO, no need to overcomplicate here. Here is a regex that will match Contains, everything in the brackets (with or without nested ones, balanced or not), and the optional digits. It assumes that there are -s around this construction:
\w+\(.*?\)\d*(?=-|$)
Input:
Shelf-15-Contains(I(t)e(m))10-Depo91
Shelf-15-Contains(I(t)e(m))-Depo91
Matches:
Contains(I(t)e(m))10
Contains(I(t)e(m))
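To try the pattern without the demo, here is a minimal Python sketch (Python is my choice here; the question does not name a language, and the variable names are only illustrative). It runs the regex from this answer over the two sample inputs:

```python
import re

# Pattern from this answer: a word, a parenthesized part, optional digits,
# then a hyphen or the end of the string.
PATTERN = re.compile(r"\w+\(.*?\)\d*(?=-|$)")

samples = [
    "Shelf-15-Contains(I(t)e(m))10-Depo91",
    "Shelf-15-Contains(I(t)e(m))-Depo91",
]

# findall returns whole matches because the pattern has no capturing groups
matches = [PATTERN.findall(s) for s in samples]
for s, found in zip(samples, matches):
    print(s, "->", found)
```

One thing to watch: because the match ends at the first ) that is followed by optional digits and a -, an input such as Contains(f(x)-1)10 would be cut short at Contains(f(x).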
Upvotes: 0
Reputation: 650
For matching before and after, use additional capturing groups:
    while (
        $str =~ m/
            ([^-()]*)         # before (no parentheses, so it cannot swallow an earlier pair)
            \( ( [^()]* ) \)  # (in)
            (?= ([^-]*) )     # after
        /gx
    ) {
        my ($before, $in, $after) = ($1, $2, $3);
        ...
    }
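For readers not using Perl, the same idea can be sketched in Python. This is an illustration under my own assumptions rather than a line-by-line port: the helper name split_around_parens is hypothetical, and the "before" group here excludes parentheses so that a greedy match cannot swallow an earlier (...) pair.

```python
import re

PATTERN = re.compile(
    r"""
    ([^-()]*)         # before: text up to the opening parenthesis
    \( ([^()]*) \)    # in: contents of one innermost pair
    (?= ([^-]*) )     # after: looked ahead at, not consumed
    """,
    re.VERBOSE,
)

def split_around_parens(text):
    """Return a list of (before, inside, after) tuples."""
    return [m.groups() for m in PATTERN.finditer(text)]

print(split_around_parens("Shelf-15-Contains(Item)10-Depo91"))
print(split_around_parens("a(b)c(d)e"))
```

Because "after" sits inside a lookahead, it is not consumed, so in a(b)c(d)e the c can serve as the "after" of the first pair and the "before" of the second.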
Nested constructs cannot be recognized by regular expressions in the strict sense (those equivalent to a finite state machine accepting a string). Perl's regex engine offers additional constructs for recognizing balanced parentheses, but they are rather difficult to use.
http://perldoc.perl.org/perlre.html#Extended-Patterns gives examples of how to parse balanced parentheses, under (??{ code }) and (?PARNO).
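Python's standard re module has no counterpart to (?PARNO) or (??{ code }), so outside Perl a small depth counter is one common alternative to a recursive pattern. A minimal sketch, with a hypothetical helper name balanced_groups, that returns each outermost balanced (...) span:

```python
def balanced_groups(text):
    """Return every outermost balanced (...) substring of text."""
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i  # remember where the outermost pair opens
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:i + 1])
    return groups

print(balanced_groups("Shelf-15-Contains(I(t)e(m))10-Depo91"))
```

A stray ) with nothing open is ignored, and a ( that is never closed produces no group; whether that is the right policy depends on how malformed input should be treated.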
Finally, the string you want to parse seems to be a list separated by -. Try to find a formal grammar for what you want to parse; it will help you design your program.
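As a rough illustration of the grammar idea, assume (this is my assumption, not something stated in the question) that the string is a list of fields separated by -, that no field contains a - itself, and that a field is either plain text or has the shape name(args)digits. Under those assumptions one could split first and classify each field afterwards; parse_fields and the dictionary keys are hypothetical names:

```python
import re

# Hypothetical field grammar: name "(" args ")" optional-digits
FIELD = re.compile(r"(\w+)\((.*)\)(\d*)")

def parse_fields(text):
    """Split on '-' and describe each field as a dict."""
    parsed = []
    for field in text.split("-"):
        m = FIELD.fullmatch(field)
        if m:
            name, args, digits = m.groups()
            parsed.append({"name": name, "args": args, "digits": digits})
        else:
            parsed.append({"text": field})
    return parsed

for item in parse_fields("Shelf-15-Contains(Item)10-Depo91"):
    print(item)
```

If a - can appear inside the parentheses, splitting first no longer works and the fields would have to be scanned with parenthesis depth in mind.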
If you don't need to handle a(b)c(d)e, then you can simplify (?= ([^-]*) ) to ([^-]*).
Upvotes: 1