Sergio Figueras

Reputation: 297

Regex to exclude first and last characters of match

I have the following string: word_word2_word3_word4

My intention is to extract only 'word2'. Using _\w*?_ as the pattern, I get '_word2_' as the match, but I can't see a way of removing those underscores so that only 'word2' is matched.

I can't use .split() or anything like it; this value must be extracted using regex only.

What modifications do you suggest guys?

Upvotes: 4

Views: 13778

Answers (3)

Caio Oliveira

Reputation: 1243

You can also use a positive lookbehind and lookahead:

(?<=_)\w*2(?=_)

My intention is to extract only 'word2'. Using _\w*?_ as the pattern, I get '_word2_' as the match, but I can't see a way of removing those underscores so that only 'word2' is matched.

The underscores won't be part of the matched string, but they must appear immediately before and after it.
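As a minimal sketch of the difference, assuming Python's re module (the question does not name a language), the snippet below compares the pattern from the question with a lookaround version. The variant with [^_]+ is an illustration that is not part of the answer; it is used because \w also matches the underscore itself.

```python
import re

text = "word_word2_word3_word4"

# Pattern from the question: both underscores are consumed by the match.
with_underscores = re.search(r"_\w*?_", text).group(0)

# Pattern from this answer: the underscores are only asserted, not consumed.
answer_match = re.search(r"(?<=_)\w*2(?=_)", text).group(0)

# Illustrative variant (an assumption, not from the answer): [^_]+ matches any
# run of non-underscore characters sitting between two underscores.
generic_match = re.search(r"(?<=_)[^_]+(?=_)", text).group(0)

print(with_underscores)
print(answer_match)
print(generic_match)
```

Note that re.search returns only the first such word; re.findall with the generic pattern would return every word that has an underscore on both sides.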

EDIT:

Going further, if the string to match is at the beginning or end of the whole text, it won't have an underscore on both sides.

(?<=_|^)[^_]*2(?=_|$)

This one makes the underscore optional when the match sits at the very start or very end of the text.
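Whether (?<=_|^) is accepted depends on the regex flavor. As a hedged sketch, assuming Python's built-in re module, which generally rejects lookbehinds whose alternatives have different widths, the same idea can be written as (?:(?<=_)|^). The helper name find_word is hypothetical and used only for illustration.

```python
import re

# Same intent as (?<=_|^)[^_]*2(?=_|$), rewritten so the lookbehind has a
# fixed width: either an underscore precedes the match, or the match starts
# at the beginning of the text.
PATTERN = re.compile(r"(?:(?<=_)|^)[^_]*2(?=_|$)")


def find_word(text):
    """Return the first underscore-delimited word ending in '2', or None."""
    match = PATTERN.search(text)
    return match.group(0) if match else None


print(find_word("word_word2_word3"))  # word in the middle
print(find_word("word2_word3"))       # word at the start
print(find_word("word_word2"))        # word at the end
```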


Upvotes: 13

BeeOnRope

Reputation: 65056

Your question isn't entirely clear, but assuming that word2, word3, etc. are arbitrary words which don't contain _, you can use capturing groups to extract the subset of characters that you want. For example:

\w*_(\w*)_\w*_\w*

That matches your whole string, and the first (and only) capture group extracts the second word. Depending on whether you want to accept zero-length words, the * should perhaps be +.
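A minimal sketch of the capture-group approach, assuming Python's re module (the question does not say which language is in use). re.fullmatch is chosen here so that the pattern has to cover the entire string.

```python
import re

text = "word_word2_word3_word4"

# The whole string must match; only the parenthesised part is captured.
match = re.fullmatch(r"\w*_(\w*)_\w*_\w*", text)

# group(0) is the full match, group(1) is the first capture group.
whole = match.group(0)
second_word = match.group(1)

print(whole)
print(second_word)
```

The match itself still includes the underscores; it is only the captured group that leaves them out.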

Upvotes: 1

baseballlover723

Reputation: 683

A quick and dirty way to do that, if you're not worried about performance, would be to just remove the first and last characters of the match.
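For illustration only, and assuming Python, trimming the match could look like the sketch below. It post-processes the result outside the regex, so it may not satisfy the "regex only" constraint in the question.

```python
import re

text = "word_word2_word3_word4"

# Match with the surrounding underscores, as in the question.
raw = re.search(r"_\w*?_", text).group(0)

# Slice off the first and last character of the matched text.
trimmed = raw[1:-1]

print(raw)
print(trimmed)
```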

Upvotes: 0
