Reputation: 51
Imagine a string like this:
#*****~~~~~~**************~~~~~~~~***************************#
I am looking for an elegant way to find the indices of the longest continues section that contains a specific character. Let's assume we are searching for the *
character, then I expect the method to return the start and end index of the last long section of *
.
I am looking for the elegant way, I know I could just bruteforce this by checking something like
indexOf(*)
lastIndexOf(*)
//Check if in between the indices is something else if so, remember length start from new
//substring and repeat until lastIndex reached
//Return saved indices
This is so ugly brute-force - Any more elegant way of doing this? I thought about regular expression groups and comparing their length. But how to get the indices with that?
Upvotes: 2
Views: 839
Reputation: 28968
If you don't want to hardcode a specific character like *
and find "Find the longest section of repeating characters" as the title of the question states, then the proper regular expression for the section of repeated characters would be:
"(.)\\1*"
Where (.)
a group that consists from of a single character, and \\1
is a backreference that refers to that group. *
is greedy quantifier, which means that presiding backreference could be repeated zero or more times.
Finally, "(.)\\1*"
captures a sequence of subsequent identical characters.
Now to use it, we need to compile the regex into Pattern
. This action has a cost, hence if the regex would be used multiple times it would be wise to declare a constant:
public static final Pattern REPEATED_CHARACTER_SECTION =
Pattern.compile("(.)\\1*");
Using features of modern Java, the longest sequence that matches the above pattern could be found literally with a single line of code.
Since Java 9 we have method Matcher.results()
which return a stream of MatchResult
objects, describe a matching group.
MatchResult.start()
MatchResult.end()
expose the way of accessing start and end indices of the group. To extract the group itself, we need to invoke MatchResult.group()
.
That how an implementation might look like:
public static void printLongestRepeatedSection(String str) {
String longestSection = REPEATED_CHARACTER_SECTION.matcher(str).results() // Stream<MatchResult>
.map(MatchResult::group) // Stream<String>
.max(Comparator.comparingInt(String::length)) // find the longest string in the stream
.orElse(""); // or orElseThrow() if you don't want to allow an empty string to be received as an input
System.out.println("Longest section:\t" + longestSection);
}
main()
public static void printLongestRepeatedSection(String str) {
MatchResult longestSection = REPEATED_CHARACTER_SECTION.matcher(str).results() // Stream<MatchResult>
.max(Comparator.comparingInt(m -> m.group().length())) // find the longest string in the stream
.orElseThrow(); // would throw an exception is an empty string was received as an input
System.out.println("Section start: " + longestSection.start());
System.out.println("Section end: " + longestSection.end());
System.out.println("Longest section: " + longestSection.group());
}
Output:
Section start: 34
Section end: 61
Longest section: ***************************
Links:
Official tutorials on Lambda expressions and Stream API provided by Oracle
A quick tutorial on Regular expressions
You can do it without regular expressions by manually iterating over the indices of the given string and checking if the previous character matches the current one.
You just need to maintain a couple of variables denoting the start and the end of the longest previously encountered section, and a variable to store the starting index of the section that is being currently examined.
That's how it might be implemented:
public static void printLongestRepeatedSection(String str) {
if (str.isEmpty()) throw new IllegalArgumentException();
int maxStart = 0;
int maxEnd = 1;
int curStart = 0;
for (int i = 1; i < str.length(); i++) {
if (str.charAt(i) != str.charAt(i - 1)) { // current and previous characters are not equal
if (maxEnd - maxStart < i - curStart) { // current repeated section is longer then the maximum section discovered previously
maxStart = curStart;
maxEnd = i;
}
curStart = i;
}
}
if (str.length() - curStart > maxEnd - maxStart) { // checking the very last section
maxStart = curStart;
maxEnd = str.length();
}
System.out.println("Section start: " + maxStart);
System.out.println("Section end: " + maxEnd);
System.out.println("Section: " + str.substring(maxStart, maxEnd));
}
main()
public static void main(String[] args) {
String source = "#*****~~~~~~**************~~~~~~~~***************************#";
printLongestRepeatedSection(source);
}
Output:
Section start: 34
Section end: 61
Section: ***************************
Upvotes: 2
Reputation: 19545
The solution based on the regular expression may use some features from Stream API to get an array of indexes of the longest sequence of a given character:
Pattern.quote
should be used to safely wrap the input character for the search within a regular expressionStream<MatchResult>
returned by Matcher::results
provides necessary information about the start and end of the matchStream::max
allows to select the longest matching sequenceOptional::map
and Optional::orElseGet
help convert the match into the desired array of indexespublic static int[] getIndexes(char c, String str) {
return Pattern.compile(Pattern.quote(Character.toString(c)) + "+")
.matcher(str)
.results()
.max(Comparator.comparing(mr -> mr.end() - mr.start()))
.map(mr -> new int[]{mr.start(), mr.end()})
.orElseGet(() -> new int[]{-1, -1});
}
// Test:
System.out.println(Arrays.toString(getIndexes('*', "#*****~~~~~~**************~~~~~~~~***************************#")));
// -> [34, 61]
Upvotes: 1
Reputation: 20914
Use methods of class Matcher
.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Solution {
public static void main(String args[]) {
String str = "#*****~~~~~~**************~~~~~~~~***************************#";
Pattern pattern = Pattern.compile("\\*+");
Matcher matcher = pattern.matcher(str);
int max = 0;
while (matcher.find()) {
int length = matcher.end() - matcher.start();
if (length > max) {
max = length;
}
}
System.out.println(max);
}
}
The regular expression searches for occurrences of one or more asterisk (*
) characters.
Method end
returns the index of the first character after the last character matched and method start
returns the index of the first character matched. Hence the length is simply the value returned by method end
minus the value returned by method start
.
Each subsequent call to method find
starts searching from the end of the previous match.
The only thing left is to get the longest string of asterisks.
Upvotes: 2