pastelghostiie
pastelghostiie

Reputation: 1

Finding Count of Pattern in a String (Overlap Inclusive)

So I'm trying to write an algorithm that counts the number of occurrences of some pattern, say "aa", within a string, say "aaabca." The number of patterns in that string should return an integer, in this case 2, because the first three characters contain two occurrences of the pattern.

What I have finds the number of patterns under the assumption the existing occurrences of a pattern is NOT overlapping:

  public class Pattern{
   public static void main(String[] args){
   Scanner scan = new Scanner(System.in);
   System.out.println("Enter the string: ");
   String s = scan.nextLine();
   String[] splittedInput = s.split(";");
   String pattern = splittedInput[0];
   String blobs = splittedInput[1];
   Pattern p = new Pattern();
   p.count(pattern, blobs);
}

public static void count(String pattern, String blobs){
    String[] substrings = blobs.split("[|]");
    int numOccurences = 0;
    int[] instances = new int[substrings.length];
    int patternLength = pattern.length();

    for (int i = 0; i < instances.length; i++){
        int length = substrings[i].length();
        String temp = substrings[i];
        temp = temp.replaceAll(pattern, "");
        int postLength = temp.length();
        numOccurences = (length - postLength) / pattern.length();
        instances[i] = numOccurences;
        numOccurences = 0;
    }

    int sum = 0;
    for (int i = 0; i < instances.length; i++){
        System.out.print(instances[i] + "|");
        sum += instances[i];
    }
    System.out.print(sum);

}

}

Any suggestions?

Upvotes: 0

Views: 472

Answers (4)

WJS
WJS

Reputation: 40062

Continually taking substrings and using the startsWith method seems to work pretty well.

      String pat = "ss";
      String str = "kskslsksaaaslsslssskssssllsssss";

      int count = 0;
      while (str.length() >= pat.length()) {
         count += str.startsWith(pat) ? 1 : 0;
         str = str.substring(1);
      }
      System.out.println("count = " + count);

You can also take a similar approach with streams.

      long count = IntStream.range(0, str.length()).mapToObj(
            n -> str.substring(n)).filter(n -> n.startsWith(pat)).count();
      System.out.println("count = " + count);

But in this case I actually prefer the non-stream approach.

Upvotes: 0

Olexander Yushko
Olexander Yushko

Reputation: 2904

If you use Java 8 you can count this value in the following way. Example:

String blobs = "aaabcaaa";
String pattern = "aa";
List<String> strings = Arrays.asList(blobs.split(""));

        long count = IntStream.range(0, strings.size())
                .mapToObj(index -> index < strings.size() - 1 ? strings.get(index) + strings.get(index + 1) : strings.get(index - 1))
                .filter(str -> str.equals(pattern))
                .count();
        System.out.println("Result count: " + count);

Upvotes: 0

Greg Dougherty
Greg Dougherty

Reputation: 3471

For one String, match is the String you're looking for:

int len = theStr.length ();
int start = 0;
int pos;
int count = 0;

while ((start < len) && ((pos = theStr.indexOf (match, start)) >= 0))
{
  ++count;
  start = pos + 1;
}

Upvotes: 0

Nexevis
Nexevis

Reputation: 4667

I would personally compare the pattern as a substring in this case. For example a run of a single String from your array would look like this:

//Initial values
String blobs = "aaaabcaaa";
String pattern = "aab";
String[] substrings = blobs.split("[|]");  

//The code I added that should placed into the loop  
int numOccurences = 0;      
String str = substrings[0];
for (int k = 0; k <= (str.length() - pattern.length()); k++)
{
    if (str.substring(k, k + pattern.length()).equals(pattern))
    {
        numOccurences++;
    }
}
    
    System.out.println(numOccurences);

If you want to run this on each String in your array simply modify String str = substrings[0] to String str = substrings[i] and iterate over the array storing the final numOccurences as you please.

Example Run:

String is aaaabcaaa

Pattern is aa

Output is 5 occurences

Upvotes: 1

Related Questions