user18045244

Reputation: 65

How to find a string at a specific position in a file name that contains non-English characters in Java?

How to find a string at a specific location with regex?

choryangStn_110_220114_일_0.sbm

choryangStn_110_220114_이_0.sbm

choryangStn_110_220114_삼_0.sbm

At work, I would like to extract 일, 이, and 삼 from these file names.

I tried

String filename = "choryangStn_110_220114_일_0.sbm";
filename.replaceAll(".*_(\\w+)_\\d+\\.\\w+", "$1");

If I do it like this, it does not work properly.

How can I match either \\w or [가-힣]?

filename.replaceAll(".*_(\\w+)||[가-힣]_\\d+\\.\\w+", "$1");

filename.replaceAll(".*_(\\w+||[가-힣])_\\d+\\.\\w+", "$1");

Neither of the above expressions works properly.

How can I make this work?

Upvotes: 1

Views: 24

Answers (1)

Wiktor Stribiżew

Reputation: 626896

You can use the following regex with replaceFirst():

(?U)^.*_(\\w+)_\\d+\\.\\w+$

(?U) is an embedded flag option equivalent to the Pattern.UNICODE_CHARACTER_CLASS option. It makes all shorthand character classes Unicode-aware, so \w matches Hangul letters as well as ASCII word characters.
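If you prefer a precompiled pattern, the same flag can be passed to Pattern.compile instead of being embedded in the regex. The following is only a minimal sketch: the class name UnicodeFlagSketch and the helper extractToken are made up for illustration and are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class and method names, used only for this illustration.
class UnicodeFlagSketch {
    // Same pattern as above, with the flag passed to Pattern.compile
    // instead of the inline (?U) prefix.
    private static final Pattern TOKEN = Pattern.compile(
            "^.*_(\\w+)_\\d+\\.\\w+$", Pattern.UNICODE_CHARACTER_CLASS);

    // Returns the captured token, or null when the file name does not match.
    public static String extractToken(String filename) {
        Matcher m = TOKEN.matcher(filename);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractToken("choryangStn_110_220114_일_0.sbm"));
    }
}
```

One difference from replaceFirst(): when the input does not match, replaceFirst() returns the input unchanged, while this helper returns null, so a non-matching file name is easier to detect.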

See the regex demo and the Java demo:

class Test
{
    public static void main (String[] args)
    {
        // Sample file names from the question
        String[] strings = {"choryangStn_110_220114_일_0.sbm",
            "choryangStn_110_220114_이_0.sbm",
            "choryangStn_110_220114_삼_0.sbm"
        };
        // (?U) makes \w and \d Unicode-aware, so \w+ also matches Hangul
        String regex = "(?U)^.*_(\\w+)_\\d+\\.\\w+$";
        for(String text : strings)
        {
            // Replace the whole match with the contents of capture group 1
            System.out.println("'" + text + "' => '" + text.replaceFirst(regex, "$1") + "'");
        }
    }
}

Output:

'choryangStn_110_220114_일_0.sbm' => '일'
'choryangStn_110_220114_이_0.sbm' => '이'
'choryangStn_110_220114_삼_0.sbm' => '삼'
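
As an alternative closer to the attempts in the question, \w and the Hangul range can be combined in a single character class, which avoids alternation altogether. Note that in a Java regex a single | is the alternation operator; || adds an empty alternative between the two sides, which is probably not what was intended. Also, strings are immutable, so the result of replaceAll() or replaceFirst() has to be assigned or printed. The sketch below assumes the token consists only of ASCII word characters or precomposed Hangul syllables (가-힣); the class name HangulClassSketch and the helper extractToken are made up for illustration.

```java
// Hypothetical class and method names, used only for this illustration.
class HangulClassSketch {
    // \w (ASCII-only without the (?U) flag) and the Hangul syllable range
    // are merged into one character class, so no alternation is needed.
    static final String REGEX = ".*_([\\w가-힣]+)_\\d+\\.\\w+";

    // Returns the token before the trailing "_<digits>.<extension>" part.
    // If the name does not match, replaceFirst returns it unchanged.
    public static String extractToken(String filename) {
        return filename.replaceFirst(REGEX, "$1");
    }

    public static void main(String[] args) {
        String[] names = {
            "choryangStn_110_220114_일_0.sbm",
            "choryangStn_110_220114_이_0.sbm",
            "choryangStn_110_220114_삼_0.sbm"
        };
        for (String name : names) {
            // The return value must be used; the original string is not modified.
            System.out.println(extractToken(name));
        }
    }
}
```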

Upvotes: 1
