Reputation: 311
I have a file with contents like this:
HFH_F_OPL_J0 ;comment1
HIJ_I_AAA_V2_DSD ;comment2
ALE_H_FB_V1 ;comment3
ZXZPOIF_P ;comment4
RST0DREK_S ;comment5
I need to match the single character that always follows the first underscore. It is always one of H, I, F, P, L, or S.
What regex should I use for this? I tried
/(\w{3,})_([S|I|P|F|L|H]{1})(.*)\;/
but it does not give the right results.
Upvotes: 0
Views: 39
Reputation: 126742
If you trust your data, there's no reason to check the value of the character right after the first underscore; you can just grab it and use it.
This short Perl program demonstrates:
use strict;
use warnings 'all';
use feature 'say';
while ( <DATA> ) {
say $1 if /_(.)/;
}
__DATA__
HFH_F_OPL_J0 ;comment1
HIJ_I_AAA_V2_DSD ;comment2
ALE_H_FB_V1 ;comment3
ZXZPOIF_P ;comment4
RST0DREK_S
Output:
F
I
H
P
S
If you want to be slightly stricter, you can use a character class instead of a dot, which changes that line of my code to:
say $1 if /_([HIFPLS])/;
The output is identical to that of the original code.
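One caveat with the character-class version: because the pattern is unanchored, it would accept a later underscore if the character after the first one were not in the class. The following is a minimal sketch of an anchored variant, assuming you want such lines rejected. The helper name first_flag and the line 'ABC_X_H' are made up for illustration and do not come from the question.

```perl
use strict;
use warnings 'all';
use feature 'say';

# Hypothetical helper: return the character after the FIRST underscore,
# or undef if that character is not one of H, I, F, P, L, S.
sub first_flag {
    my ($line) = @_;
    # ^[^_]* consumes everything before the first underscore, so the
    # capture can only come from the position right after it.
    return $line =~ /^[^_]*_([HIFPLS])/ ? $1 : undef;
}

# The second line is invented: its first underscore is followed by X.
for my $line ('ALE_H_FB_V1 ;comment3', 'ABC_X_H ;made-up line') {
    my $flag = first_flag($line) // 'no match';
    say $flag;
}
```

With the unanchored /_([HIFPLS])/ the made-up second line would yield H from the second underscore; the anchored form reports no match instead.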
Upvotes: 1
Reputation: 174796
Use anchors and change the first \w to [A-Z0-9], because \w also matches _ (and the sample data contains a digit, as in RST0DREK_S). Then take the character you want from capture group 1.
/^[A-Z0-9]{3,}_([SIPFLH]).*;/
or
/^[^_]{3,}_\K[SIPFLH](?=.*;)/
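As a rough check, here is a small sketch that runs both styles against the sample lines from the question. It uses the [^_] prefix for both patterns, since that also accepts the digit in RST0DREK_S. The helper names by_group and by_keep are hypothetical, and \K needs Perl 5.10 or later.

```perl
use strict;
use warnings 'all';
use feature 'say';

# Sample lines copied from the question.
my @lines = (
    'HFH_F_OPL_J0 ;comment1',
    'HIJ_I_AAA_V2_DSD ;comment2',
    'ALE_H_FB_V1 ;comment3',
    'ZXZPOIF_P ;comment4',
    'RST0DREK_S ;comment5',
);

# Hypothetical helper: capture-group form, the result is in $1.
sub by_group {
    my ($line) = @_;
    return $line =~ /^[^_]{3,}_([SIPFLH]).*;/ ? $1 : undef;
}

# Hypothetical helper: \K form, the result is the whole match ($&),
# because \K discards everything matched before it.
sub by_keep {
    my ($line) = @_;
    return $line =~ /^[^_]{3,}_\K[SIPFLH](?=.*;)/ ? $& : undef;
}

for my $line (@lines) {
    my $g = by_group($line) // '-';
    my $k = by_keep($line)  // '-';
    say "$g $k";
}
```

Both columns should agree on every line. The \K form is convenient when you want the match itself to be the single character, for example in a substitution.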
Upvotes: 1