Federico Leuze

Reputation: 43

Splitting strings in Java without removing the match

I'm trying to split a string without removing the matched delimiter. I had some success with (?<=-)|(?=-), but when I adapt the same approach to extract a link, using this regex:
((?<=(http:\\/\\/\\S+))|(?=(http:\\/\\/\\S+))) I get strange output.
Splitting this input:

A wonderful serenity has taken possession of http://www.google.com my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart.

gives me this set of strings:

["A wonderful serenity has taken possession of ", "http://w", "w", "w", ".", "g", "o", "o", "g", "g", "l", "e", ".", "c", "o", "m", "my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart."].

EDIT: The desired output is:

["A wonderful serenity has taken possession of ", "http://www.google.com", "my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart."]

Upvotes: 0

Views: 93

Answers (1)

Tim Biegeleisen

Reputation: 520898

One viable option here would be to use a formal regex matcher and iterate over the successive matches of the following pattern:

\\bhttps?://\\S+\\b|.*?(?=https?://|$)

This pattern first tries to match a URL. If there is no URL at the current position, it instead captures all content up to, but not including, either the next URL or the end of the input. Here is some sample code:

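import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "A wonderful serenity has taken possession of http://www.google.com my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart.";
// First alternative: a URL. Second alternative: text up to the next URL or the end of the input.
String pattern = "\\bhttps?://\\S+\\b|.*?(?=https?://|$)";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(input);
List<String> matches = new ArrayList<>();
// Collect each successive match in order.
while (m.find()) {
    matches.add(m.group());
}
System.out.println(matches);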

This prints:

[A wonderful serenity has taken possession of ,
 http://www.google.com,
 like these sweet mornings of spring which I enjoy with my whole heart., ]
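
One caveat with the output above: the text " my entire soul," is missing, and the list ends with an empty string. By default, . does not match line terminators, so the lazy .*? branch cannot cross the \n in the input and the matcher skips that stretch. The empty string comes from a zero-length match at the very end of the input. As a minimal sketch of one way around both issues, the same pattern can be compiled with Pattern.DOTALL and empty matches can be filtered out. The class name UrlSplitter and the helper splitKeepingUrls are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlSplitter {
    // Same pattern as above, but compiled with DOTALL so that '.' also matches '\n'.
    private static final Pattern TOKEN = Pattern.compile(
            "\\bhttps?://\\S+\\b|.*?(?=https?://|$)", Pattern.DOTALL);

    // Hypothetical helper: returns the URLs and the text between them, in input order.
    public static List<String> splitKeepingUrls(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Skip the zero-length match that occurs at the end of the input.
            if (!m.group().isEmpty()) {
                parts.add(m.group());
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        String input = "A wonderful serenity has taken possession of http://www.google.com my entire soul,\n like these sweet mornings of spring which I enjoy with my whole heart.";
        // Print each part in brackets so that leading and trailing spaces are visible.
        for (String part : splitKeepingUrls(input)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

With the sample input this should yield three parts: the leading text, the URL, and the remainder including the newline. The remainder keeps its leading space, so strip it afterwards if that is unwanted.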

Upvotes: 2
