Reputation: 726
I'm trying to find all occurrences of a substring in a string in Java.
For example: searching "ababsdfasdfhelloasdf" for "asdf" would return [7,16] since there are two occurrences of "asdf", one at position 7 and one at 16 (zero-based). Searching "aaaaaa" for "aa" would return [0,1,2,3,4] because there is an "aa" at positions 0, 1, 2, 3, and 4.
I tried this:
public List<Integer> findSubstrings(String inwords, String inword) {
    String copyOfWords = inwords;
    List<Integer> indicesOfWord = new ArrayList<Integer>();
    int currentStartIndex = inwords.indexOf(inword);
    int indexat = 0;
    System.out.println(currentStartIndex);
    while (currentStartIndex > 0) {
        indicesOfWord.add(currentStartIndex + indexat);
        System.out.println(currentStartIndex);
        System.out.println(indicesOfWord);
        indexat += currentStartIndex;
        copyOfWords = copyOfWords.substring(currentStartIndex);
        System.out.println(copyOfWords);
        currentStartIndex = copyOfWords.indexOf(inword);
    }
    return indicesOfWord;
}
This problem can be solved in Python as follows:
indices = [m.start() for m in re.finditer(word, a.lower())]
where "word" is the word I'm looking for and "a" is the string I'm searching through.
How can I achieve this in Java?
Upvotes: 12
Views: 27256
Reputation: 36043
Using a regex is definitely an overly heavy solution for finding substrings, and it will be a particular problem if your substring contains special regex characters like ".". Here's a solution adapted from this answer:
String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
List<Integer> result = new ArrayList<Integer>();
while (lastIndex != -1) {
    lastIndex = str.indexOf(findStr, lastIndex);
    if (lastIndex != -1) {
        result.add(lastIndex);
        lastIndex += 1;
    }
}
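For reference, the same loop can be wrapped in a reusable method. This is only a minimal sketch: the class name SubstringFinder and the method name findAll are made up for illustration, and returning an empty list for an empty search string is my own assumption, not something the question specifies. Advancing by one character after each hit is what lets overlapping matches be found.

```java
import java.util.ArrayList;
import java.util.List;

public class SubstringFinder {
    // Hypothetical helper: returns the zero-based start index of every
    // occurrence of needle in haystack, including overlapping ones.
    public static List<Integer> findAll(String haystack, String needle) {
        List<Integer> indices = new ArrayList<>();
        if (needle.isEmpty()) {
            return indices; // assumption: an empty needle yields no matches
        }
        int index = haystack.indexOf(needle);
        while (index != -1) {
            indices.add(index);
            // Resume one character after the last hit so overlaps are kept.
            index = haystack.indexOf(needle, index + 1);
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(findAll("aaaaaa", "aa"));               // [0, 1, 2, 3, 4]
        System.out.println(findAll("ababsdfasdfhelloasdf", "asdf")); // [7, 16]
    }
}
```

If you only want non-overlapping matches, advancing by needle.length() instead of 1 should do it.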
Upvotes: 5
Reputation: 627537
You can use capturing inside a positive look-ahead to get all overlapping matches and use Matcher#start to get the indices of the captured substrings.
As for the regex, it will look like
(?=(aa))
In Java code:
String s = "aaaaaa";
Matcher m = Pattern.compile("(?=(aa))").matcher(s);
List<Integer> pos = new ArrayList<Integer>();
while (m.find()) {
    pos.add(m.start());
}
System.out.println(pos);
Result:
[0, 1, 2, 3, 4]
See IDEONE demo
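If the search string is arbitrary text rather than a fixed pattern, it probably needs to be escaped before it goes into the look-ahead, otherwise characters such as "." are treated as regex syntax. A possible sketch, assuming you want literal matching: the class name OverlappingRegexFinder and method name findAll are hypothetical, and the empty-string guard is my own assumption. Pattern.quote is the standard way to turn a string into a literal pattern, and the capturing group is dropped because only m.start() is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OverlappingRegexFinder {
    // Hypothetical helper: finds every (possibly overlapping) occurrence of
    // literal in text by wrapping the quoted literal in a look-ahead.
    public static List<Integer> findAll(String text, String literal) {
        List<Integer> positions = new ArrayList<>();
        if (literal.isEmpty()) {
            return positions; // assumption: an empty literal yields no matches
        }
        // Pattern.quote makes characters such as '.' match literally.
        Pattern pattern = Pattern.compile("(?=" + Pattern.quote(literal) + ")");
        Matcher m = pattern.matcher(text);
        // The look-ahead is zero-width, so find() moves on one character at a time.
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(findAll("aaaaaa", "aa")); // [0, 1, 2, 3, 4]
        System.out.println(findAll("a.b.c.", "."));  // [1, 3, 5]
    }
}
```

Without the quoting, searching "a.b.c." for "." would match at every position, since an unescaped dot matches any character.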
Upvotes: 15