Minh Le

Reputation: 381

Regular Expression Split XML in Java

I want to split some XML text into parts:

xmlcontent = "<tagA>text1<tagB>text2</tagB></tagA>";

In C# I use:

string[] splitedTexts = Regex.Split(xmlcontent, "(<.*?>)|(.+?(?=<|$))");

The result is

splitedTexts = ["<tagA>", "text1", "<tagB>", "text2", "</tagB>", "</tagA>"]

How can I do this in Java?

I have tried

String[] splitedTexts = xmlcontent.split("(<.*?>)");

but the result is not what I expected.

Upvotes: 2

Views: 3793

Answers (2)

kism3t

Reputation: 1361

If you want to keep the same regex, match the tokens with Matcher.find() instead of splitting:

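import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitXml {
    public static void main(String[] args) {
        String xmlContent = "<xml><tagA>text1</tagA><tagB>text2</tagB></xml>";
        // First alternative matches a tag; second matches text up to the next '<' or the end.
        Pattern pattern = Pattern.compile("(<.*?>)|(.+?(?=<|$))");
        Matcher matcher = pattern.matcher(xmlContent);
        // Print each token (tag or text run) on its own line.
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}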
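
To get an array like the C# result rather than printed lines, the matches can be collected into a list first. This is only a sketch: the class name RegexTokens and the method name tokens are made up for illustration, and it assumes the input has no line breaks (by default `.` does not match line terminators, so multi-line input may need Pattern.DOTALL).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class RegexTokens {
    // Same pattern as in the question: a tag, or text up to the next '<' or end of input.
    private static final Pattern TOKEN = Pattern.compile("(<.*?>)|(.+?(?=<|$))");

    static String[] tokens(String xml) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(xml);
        // Each successful find() yields one token, in document order.
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String token : tokens("<tagA>text1<tagB>text2</tagB></tagA>")) {
            System.out.println(token);
        }
    }
}
```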

Upvotes: 3

Holger

Reputation: 298233

The argument to split is a regex that defines the delimiter to split at, not the tokens to keep. You want to split before each < and after each >, which zero-width lookarounds can express:

String[] splitedTexts = xmlcontent.split("(?=<)|(?<=>)");
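
As a quick check of the lookaround split, here is a minimal runnable sketch using the input from the question. The class name LookaroundSplit and the method name splitTags are hypothetical, and it assumes Java 8 or later, where a zero-width match at the start of the input does not produce a leading empty string.

```java
import java.util.Arrays;

// Hypothetical wrapper around the one-line split, for illustration only.
class LookaroundSplit {
    static String[] splitTags(String xml) {
        // (?=<)  : zero-width position just before a '<'
        // (?<=>) : zero-width position just after a '>'
        return xml.split("(?=<)|(?<=>)");
    }

    public static void main(String[] args) {
        String xmlcontent = "<tagA>text1<tagB>text2</tagB></tagA>";
        System.out.println(Arrays.toString(splitTags(xmlcontent)));
        // prints [<tagA>, text1, <tagB>, text2, </tagB>, </tagA>]
    }
}
```

Because the delimiters are zero-width, no characters are consumed, so the tags themselves stay in the result.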

Upvotes: 5
