Luca Borrione

Reputation: 17032

RegExp: How to remove leading and trailing groups if present from a string

I am executing a regular expression against a long string and capturing portions of it. One of these portions is between quotes and can contain any number of sub-portions delimited by slashes, such as:

'george'
'paul/john'
'john/peter/charles'
...

The sub-portions are unknown and can appear in any order.

I need to retrieve the string between the quotes, but I would also like to remove unwanted leading and trailing groups in the same match.

For example, if the string starts with bruce or bongo, I want to remove it:

'bruce/peter/marc'      -> peter/marc
'bongo/bob/kevin/chris' -> bob/kevin/chris

However, if the string starts with anything else, I want to keep it:

'alfie/george/paul'         -> alfie/george/paul

Only one word from the group can be present at a time: in the examples above, only bruce or bongo can appear at the beginning.

To do this, I successfully used the following regular expression:

/'(?:bruce|bongo|)\/?([^']+)'/

In a similar way, I want to remove a trailing group.
Let's say that if the string ends with sam or mark, I want to remove that portion as well, for example:

'emily/grace/poppy/sam' -> emily/grace/poppy
'connor/barnaby/mark' -> connor/barnaby

Again, only one word from the group can be present at the end: in these examples, only sam or mark can end the string.

I thought I could use the same approach, with something like:

/'(?:bruce|bongo|)\/?([^']+)(?:sam|mark|)'/

But it's not working: bruce or bongo is removed if present, while sam or mark is always kept.

I know I can extract the match as it is and clean it up with string manipulation methods. I am using JavaScript at the moment, so I can use:

"bruce/john/charles/sam".replace(/^(?:bruce|bongo)\//, '').replace(/\/(?:sam|mark)$/, '');

But I was wondering whether there is a way to achieve the same result directly in the initial regular expression that I execute against the long original string.

What am I missing?

Upvotes: 0

Views: 86

Answers (1)

trincot

Reputation: 351369

The greedy [^']+ runs all the way to the closing quote, so the trailing group can only match its empty alternative. Make the middle part lazy by adding a ? after the +:

'(?:bruce|bongo|)\/?([^']+?)(?:sam|mark|)'
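To see the difference, here is a small sketch that runs the greedy pattern from the question and the lazy one side by side on one of the question's sample strings. The variable names are only illustrative.

```javascript
// Greedy middle part, as in the question: [^']+ swallows everything up to
// the closing quote, leaving only the empty alternative for the last group.
const greedy = /'(?:bruce|bongo|)\/?([^']+)(?:sam|mark|)'/;

// Lazy middle part: [^']+? grows one character at a time, so the trailing
// group gets a chance to match sam or mark before the closing quote.
const lazy = /'(?:bruce|bongo|)\/?([^']+?)(?:sam|mark|)'/;

const input = "'emily/grace/poppy/sam'";

const greedyCapture = input.match(greedy)[1];
const lazyCapture = input.match(lazy)[1];

console.log(greedyCapture); // emily/grace/poppy/sam
console.log(lazyCapture);   // emily/grace/poppy/
```

Note that the lazy capture still ends with the slash that preceded sam.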

If you also want the capture group to exclude the / that precedes sam or mark, put the slash inside the trailing alternatives:

'(?:bruce|bongo|)\/?([^']+?)(?:\/sam|\/mark|)'
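As a rough usage sketch, the pattern can be applied with the g flag to pull every quoted portion out of a longer string in one pass. The longString below is made up for illustration, since the question does not show the real input.

```javascript
// Final pattern from above, with the g flag so matchAll can walk the string.
const pattern = /'(?:bruce|bongo|)\/?([^']+?)(?:\/sam|\/mark|)'/g;

// Hypothetical input: the real "long string" is not shown in the question.
const longString =
  "a='bruce/peter/marc' b='alfie/george/paul' " +
  "c='emily/grace/poppy/sam' d='bruce/john/charles/sam'";

// Collect capture group 1 from every match.
const captures = Array.from(longString.matchAll(pattern), (m) => m[1]);

console.log(captures);
// [ 'peter/marc', 'alfie/george/paul', 'emily/grace/poppy', 'john/charles' ]
```

One thing to be aware of: because the slash after the leading group is optional, a first word that merely begins with bruce or bongo (say, brucelee) would also lose that prefix. If that can occur in your data, the leading group would need the slash inside it, as in the trailing one.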

Upvotes: 1
