Anupama G

Reputation: 311

Capturing a particular character from a string in Perl

I have a file with contents like this:

HFH_F_OPL_J0                                       ;comment1
HIJ_I_AAA_V2_DSD                                   ;comment2
ALE_H_FB_V1                                        ;comment3
ZXZPOIF_P                                              ;comment4
RST0DREK_S                                              ;comment5

I need to capture the single character that always follows the first underscore. It is always one of H, I, F, P, L, or S.

Which regex should I use for this?

/(\w{3,})_([S|I|P|F|L|H]{1})(.*)\;/ 

does not give the right results.

Upvotes: 0

Views: 39

Answers (2)

Borodin

Reputation: 126742

If you trust your data, there is no reason to check the value of the character right after the first underscore; you can just capture it and use it.

This short Perl program demonstrates:

use strict;
use warnings 'all';
use feature 'say';

while ( <DATA> ) {
    say $1 if /_(.)/;
}

__DATA__
HFH_F_OPL_J0                                       ;comment1
HIJ_I_AAA_V2_DSD                                   ;comment2
ALE_H_FB_V1                                        ;comment3
ZXZPOIF_P                                              ;comment4
RST0DREK_S

output

F
I
H
P
S

If you want to be slightly stricter, you can use a character class instead of a dot, which changes that line of my code to:

say $1 if /_([HIFPLS])/;

The output is identical to that of the original code.
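
One caveat with the unanchored character-class form, sketched here under the assumption that malformed lines can turn up: if the first underscore is followed by a character outside the class, the regex engine moves on and may match at a later underscore instead. Anchoring with ^[^_]*_ ties the match to the first underscore. The sub names first_code_loose and first_code_strict and the sample line ABC_X_H are made up for illustration and are not part of the answer above.

```perl
use strict;
use warnings 'all';
use feature 'say';

# Hypothetical helpers for illustration only.

# Unanchored: the match may start at any underscore in the line.
sub first_code_loose {
    my ($line) = @_;
    return $line =~ /_([HIFPLS])/ ? $1 : undef;
}

# Anchored: [^_]* cannot cross an underscore, so only the first one is tried.
sub first_code_strict {
    my ($line) = @_;
    return $line =~ /^[^_]*_([HIFPLS])/ ? $1 : undef;
}

for my $line ( 'ALE_H_FB_V1', 'ABC_X_H' ) {
    my $loose  = first_code_loose($line)  // '(no match)';
    my $strict = first_code_strict($line) // '(no match)';
    say "$line -> loose=$loose strict=$strict";
}
```

For ABC_X_H the loose form reports H (taken from the second underscore), while the strict form reports no match. On well-formed data such as the sample file the two behave the same.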

Upvotes: 1

Avinash Raj

Reputation: 174796

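Anchor the pattern at the start of the line and change the first \w to [A-Z], because \w also matches _. The character you want is then in capture group 1 ($1). Note that [A-Z] does not match digits, so a prefix containing a digit, as RST0DREK appears to, needs [A-Z0-9] there instead.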

/^[A-Z]{3,}_([SIPFLH]).*;/ 

or, using \K so that the match itself ($&) is just the character:

/^[^_]{3,}_\K[SIPFLH](?=.*;)/ 
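
To see why the anchor and the [^_] prefix matter, here is a small sketch that runs the pattern from the question next to an anchored capture-group variant. Because \w also matches _, the greedy \w{3,} first swallows the whole identifier and then backtracks to the last underscore that is followed by a listed letter. The sample lines come from the question; the sub names question_code and anchored_code are illustrative.

```perl
use strict;
use warnings 'all';
use feature 'say';

# Pattern exactly as posted in the question; the wanted letter is group 2.
sub question_code {
    my ($line) = @_;
    my ( undef, $code ) = $line =~ /(\w{3,})_([S|I|P|F|L|H]{1})(.*)\;/;
    return $code;
}

# Anchored variant: [^_]{3,} stops at the first underscore.
sub anchored_code {
    my ($line) = @_;
    my ($code) = $line =~ /^[^_]{3,}_([SIPFLH]).*;/;
    return $code;
}

my @lines = (
    'HFH_F_OPL_J0      ;comment1',
    'ALE_H_FB_V1       ;comment3',
    'RST0DREK_S        ;comment5',
);

for my $line (@lines) {
    say 'question=', question_code($line), ' anchored=', anchored_code($line);
}
```

On ALE_H_FB_V1 the question's pattern backtracks only as far as ALE_H and captures the F from _FB, whereas the anchored variant returns H. The other two lines give the same letter with both patterns.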

Upvotes: 1