olala

Reputation: 4436

How to extract substrings from a string with a regex

I have this line:

[1] "RPKM_AB123_Gm12878_control.extended.bed_28m_control_500 and RPKM_AB156_GM12878-50ng_test.extended.bed_28m_test_500"

and I want to extract AB123_Gm12878_control and AB156_GM12878-50ng from the string.

I have tried this and it isn't working yet.

if ($_ =~ /.*"RPKM_([\w.]+).extended.+\s\w+\sRPKM_([\w.]+).extended.+"/){
   print $1,"\t",$2,"\t";
}

Can someone point out what I did wrong? Thanks!

Upvotes: 1

Views: 122

Answers (2)

mpapec

Reputation: 50637

You can simplify the regex and match all occurrences using the /g modifier:

if ( my($m1, $m2) = /RPKM_([^.]+)/g ) {
  print $m1,"\t",$m2,"\t";
}
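As a quick check of this approach, here is a minimal sketch that runs the /g match against the sample line from the question. The variable names $line and @ids are illustrative and not from the original post. Note that [^.]+ captures everything up to the first dot, so the second ID comes back with its _test suffix still attached.

```perl
use strict;
use warnings;

# Sample line from the question ($line is a hypothetical variable name).
my $line = '[1] "RPKM_AB123_Gm12878_control.extended.bed_28m_control_500 and RPKM_AB156_GM12878-50ng_test.extended.bed_28m_test_500"';

# In list context, a /g match returns the captures from every match.
my @ids = $line =~ /RPKM_([^.]+)/g;

# Prints the two captured IDs separated by a tab.
print join("\t", @ids), "\n";
```

If the _test suffix is unwanted, it would have to be stripped in a separate step, since the pattern itself only stops at the dot.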

Upvotes: 1

Jerry

Reputation: 71538

".*RPKM_([\w.]+).extended.+\s\w+\sRPKM_([\w.]+).extended.+"
                                        ^^^^^

This character class does not accept a hyphen (-), which the string you're matching against contains.

Try putting the hyphen in:

".*RPKM_([\w.]+)\.extended.+\s\w+\sRPKM_([\w.-]+)\.extended.+"

Also, it's good practice to escape the periods, so that they match a literal dot rather than any character.
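To see the difference, the following sketch compares the pattern from the question with the corrected one on the sample line. The variable names ($line, $original, $fixed and so on) are illustrative, and the literal quote characters are left out of both patterns for simplicity. Treat it as a rough demonstration, not a definitive solution.

```perl
use strict;
use warnings;

# Sample line from the question ($line is a hypothetical variable name).
my $line = '[1] "RPKM_AB123_Gm12878_control.extended.bed_28m_control_500 and RPKM_AB156_GM12878-50ng_test.extended.bed_28m_test_500"';

# Pattern from the question: the second class [\w.] cannot match "-".
my $original = qr/RPKM_([\w.]+).extended.+\s\w+\sRPKM_([\w.]+).extended.+/;

# Corrected pattern: hyphen added to the second class, periods escaped.
my $fixed = qr/RPKM_([\w.]+)\.extended.+\s\w+\sRPKM_([\w.-]+)\.extended.+/;

# Record whether the original pattern matches at all (1 or 0).
my $original_matched = ($line =~ $original) ? 1 : 0;

# In list context, a successful match returns the two captures.
my ($first, $second) = $line =~ $fixed;

print "original pattern matched: $original_matched\n";
print "$first\t$second\n";
```

A hyphen placed last inside a character class, as in [\w.-], is taken literally, so it does not need escaping there.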

Upvotes: 3
