iCode

Reputation: 105

Find string after last underscore before dot extension

I need to find 20140809T0000Z in this string:

PREVIMER_F2-MARS3D-MENOR1200_20140809T0000Z.nc

I tried the following to keep the string before the .nc:

(?<=_)(.*)(?=.nc) 

I have the following to start from the last underscore:

/_[^_]*$/

How can I find the string after the last underscore and before the file extension, using a regex?

Upvotes: 1

Views: 3873

Answers (5)

Gleb Kemarsky

Reputation: 10398

So you need a sequence of characters, none of them an underscore or a period, that immediately precedes the period character.

Try [^_.]+(?=\.)

Demo: https://regex101.com/r/sLAnVs/2

Thanks to Cary Swoveland for pointing out that "no need to escape a period in a character class".
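
For anyone who wants to try this from Java, here is a minimal sketch. The class and method names (LastSegmentDemo, extract) are made up for illustration, and it assumes the file name contains a single dot, since find() returns the first run of characters that is followed by a period.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class LastSegmentDemo {
    // One or more characters that are neither '_' nor '.', followed by a literal period.
    private static final Pattern SEGMENT = Pattern.compile("[^_.]+(?=\\.)");

    // Returns the first matching segment, or null if nothing matches.
    static String extract(String fileName) {
        Matcher m = SEGMENT.matcher(fileName);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // prints 20140809T0000Z
        System.out.println(extract("PREVIMER_F2-MARS3D-MENOR1200_20140809T0000Z.nc"));
    }
}
```

If the name has no period at all, the lookahead never succeeds and extract returns null.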

Upvotes: 1

nafas

Reputation: 5423

RegEx is not always the best solution... :)

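String pattern = "PREVIMER_F2-MARS3D-MENOR1200_20140809T0000Z.nc";
int start = pattern.lastIndexOf("_") + 1; // index just past the last underscore (0 if there is none)
int end = pattern.lastIndexOf(".");       // index of the last dot (-1 if there is none)
if (start != 0 && end != -1 && end > start) {
    System.out.println(pattern.substring(start, end)); // prints 20140809T0000Z
}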

Upvotes: 2

CupawnTae

Reputation: 14580

You could use a simpler pattern with a capturing group

.*_(.*)\.nc

By default the first .* is "greedy" and consumes as many characters as possible, so it runs up to the last _, leaving just the desired string inside the (.*).

Demo: http://regex101.com/r/aI2xQ9/1

Java code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "PREVIMER_F2-MARS3D-MENOR1200_20140809T0000Z.nc";
Pattern pattern = Pattern.compile(".*_(.*)\\.nc");
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
    String group = matcher.group(1); // text captured between the last _ and .nc
    // ...
}

Upvotes: 1

Avinash Raj

Reputation: 174706

You could use the regex below:

(?<=_)[^_]*(?=\.nc)

In your pattern, just replace .* with [^_]* so that the match cannot span an underscore, and escape the dot as \. so that it matches a literal period.


String s = "PREVIMER_F2-MARS3D-MENOR1200_20140809T0000Z.nc";
Pattern regex = Pattern.compile("(?<=_)[^_]*(?=\\.nc)");
Matcher regexMatcher = regex.matcher(s);
if (regexMatcher.find()) {
    String result = regexMatcher.group();
    System.out.println(result); //=> 20140809T0000Z
}

Upvotes: 2

anubhava

Reputation: 785146

You just need a lookahead for this requirement.

You can use:

[^._]+(?=[^_]*$)

// matches and returns 20140809T0000Z
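
A rough Java sketch of how this pattern might be used. The class and method names (LookaheadDemo, extract) are hypothetical. Note that only the first find() gives the wanted segment: the lookahead only requires that no underscore follows, so a second find() on the same input would return the extension (nc).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class LookaheadDemo {
    // A run of characters without '.' or '_', after which no underscore appears up to the end.
    private static final Pattern SEGMENT = Pattern.compile("[^._]+(?=[^_]*$)");

    // Returns the first match only, or null if there is none.
    static String extract(String fileName) {
        Matcher m = SEGMENT.matcher(fileName);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // prints 20140809T0000Z
        System.out.println(extract("PREVIMER_F2-MARS3D-MENOR1200_20140809T0000Z.nc"));
    }
}
```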


Upvotes: 2
