I need to split a string on spaces but keep together the words surrounded by a specific marker character.
The specific characters can be `, * or **.
Let me give an example:
The `String class` represents character strings.
All *string literals* in **Java programs**, such as **abc**
I want to have this result:
The
`String class`
represents
character
strings.
All
*string literals*
in
**Java programs**
,
such
as
**abc**
I am able to write a regexp that splits my input string into parts if I have only one kind of marker character. Unfortunately, I have multiple markers.
This is the regexp I use in my code: [^\s"]+|"[^"]*("|$). It works fine with only one marker:
String marker = "`";
String data = "The `String class` represents character strings. All *string literals* in **Java programs**, such as **abc**...";
String regexp = "[^\\s" + marker + "]+|" + marker + "[^" + marker + "]*(" + marker +"|$)";
Pattern pattern = Pattern.compile(regexp);
Matcher regexMatcher = pattern.matcher(data);
while (regexMatcher.find()) {
    System.out.println(regexMatcher.group());
}
Output:
The
`String class`
...
*string
literals*
in
**Java
programs**,
such
as
**abc**...
I have tried to combine multiple markers, but the following does not work:
String marker = "`|\*"
I can write Java code to do this job, but I thought that using a regexp would be easier. I am not so sure about that now.
Upvotes: 2
Views: 64
You may extract them with
`[^`]*`|(\*{1,2}).*?\1|\S+
This pattern matches strings between backticks, strings between single or double asterisks, and any other non-whitespace chunks.
Use double backslash in Java code:
String regex = "`[^`]*`|(\\*{1,2}).*?\\1|\\S+";
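To check the pattern against the sample from the question, here is a minimal sketch. The class name MarkerTokenizer and the method tokenize are hypothetical names chosen for illustration; only the regex itself comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of the original answer.
class MarkerTokenizer {

    // Alternatives are tried left to right at each position:
    //   1. a backtick-delimited span
    //   2. a span opened by * or ** and closed by the same marker (backreference \1)
    //   3. any other run of non-whitespace characters
    private static final Pattern TOKEN =
            Pattern.compile("`[^`]*`|(\\*{1,2}).*?\\1|\\S+");

    // Collects every match of TOKEN in input, in order of appearance.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String data = "The `String class` represents character strings. "
                + "All *string literals* in **Java programs**, such as **abc**";
        for (String token : tokenize(data)) {
            System.out.println(token);
        }
    }
}
```

On the question's sample this yields the thirteen tokens listed as the desired result, with the comma after **Java programs** as its own token. Because \S+ is tried last and only at the current position, a marker in the middle of a word (for example foo*bar) is consumed as part of a plain token rather than opening a span, and since . does not match line terminators by default, a marked span cannot cross a line break.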
Upvotes: 1