Anky

Reputation: 111

Efficient way to check if an input string contains any punctuation from a string of punctuations

I am working on logic to detect whether an input string contains any punctuation character from a string of punctuation characters.

public boolean detectAnyPunctuation(String input, String punctuationArray){}

The function should return true if any punctuation character from the punctuation array is found in the input string. The punctuation array is not fixed; it can change with each function call. The input string cannot exceed 1000 characters.

I am thinking of converting the punctuation array to a char array and then looping over it to check whether each character occurs in the input string. The time complexity would be O(MN) in the worst case, where M is the number of characters in the punctuation array and N is the number of characters in the input string.
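
The loop described above could look roughly like the following. This is only a sketch: the class name `PunctuationScan` and the method name `containsAny` are made up for illustration and are not part of the question's code.

```java
final class PunctuationScan {
    private PunctuationScan() {}

    // Hypothetical helper: true if any character of `punctuation` occurs in `input`.
    static boolean containsAny(String input, String punctuation) {
        for (int i = 0; i < punctuation.length(); i++) {
            // indexOf scans the input linearly, so the worst case is O(M * N).
            if (input.indexOf(punctuation.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }
}
```

With an input capped at 1000 characters and a short punctuation string, this simple scan is likely fast enough in practice; it also needs no escaping of special characters.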

In the end I implemented it using a regex, as below:

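import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static boolean detectPunctuations(String in, String pu) {
    // Backslash-escape every non-alphanumeric character so that ']', '\', '^' and '-'
    // are treated literally inside the character class.
    String puQ = "[" + pu.replaceAll("([^A-Za-z0-9])", "\\\\$1") + "]";
    Pattern pattern = Pattern.compile(puQ);
    Matcher m = pattern.matcher(in);
    return m.find(); // true as soon as one punctuation character is found
}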

EDIT: Now I am trying to check whether the input contains all punctuation characters from the punctuation string. It should return true only if every character of the punctuation string appears in the input string. Any suggestions for this one?

Upvotes: 1

Views: 2057

Answers (2)

Bohemian

Reputation: 425043

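This is O(n + k), where n is the length of the input and k is the length of the punctuation string:

import java.util.Set;
import java.util.stream.Collectors;

// Returns true only if every punctuation character occurs in the input
// (the "all" case from the question's EDIT).
public boolean detectAnyPunctuation(String input, String punctuationArray) {
    // Distinct punctuation characters, stored as boxed code units.
    Set<Integer> set = punctuationArray
      .chars().boxed()
      .collect(Collectors.toSet());
    // Count the distinct input characters that are punctuation and compare with the set size.
    return input.chars().boxed()
      .filter(set::contains)
      .distinct().count() == set.size();
}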

Each set operation takes constant time on average, so the total number of operations is proportional to the sum of the lengths of the punctuation string and the input.
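
For comparison, the same set-based idea can cover both the original "any" question and the "all" case from the EDIT. The sketch below is an illustration only; `PunctuationSets`, `toSet`, `containsAny` and `containsAll` are hypothetical names, and an empty punctuation string is assumed to mean "all" is trivially true.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

final class PunctuationSets {
    private PunctuationSets() {}

    // Collects the distinct characters of the punctuation string into a mutable hash set.
    static Set<Integer> toSet(String punctuation) {
        return punctuation.chars().boxed()
                .collect(Collectors.toCollection(HashSet::new));
    }

    // "Any": short-circuits on the first input character found in the set.
    static boolean containsAny(String input, String punctuation) {
        Set<Integer> set = toSet(punctuation);
        return input.chars().anyMatch(set::contains);
    }

    // "All": removes each seen character from the set and stops once nothing is left.
    static boolean containsAll(String input, String punctuation) {
        Set<Integer> remaining = toSet(punctuation);
        for (int i = 0; i < input.length() && !remaining.isEmpty(); i++) {
            remaining.remove((int) input.charAt(i));
        }
        return remaining.isEmpty();
    }
}
```

Both methods do one pass over each string with average constant-time set operations, so they stay within the O(n + k) bound mentioned above.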

Upvotes: 1

Bohemian

Reputation: 425043

Sure:

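boolean hit = str.matches("(?s).*[" + punctuation + "].*");

The (?s) flag lets . match line terminators, so the check also works on multi-line input. Most punctuation characters are literal inside a character class, but ], \, ^ and - (and, in Java, [ and &&) keep a special meaning there, so escape them with a backslash if the punctuation string can contain them.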

I think you would find the performance to be pretty good. If the punctuation string is a constant, then build the regex pattern once and reuse it.
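
A minimal sketch of building the pattern once and reusing it, assuming the punctuation string is non-empty and contains only characters from the Basic Multilingual Plane. The names `PunctuationPattern`, `compileClass` and `containsAny` are hypothetical; the sketch uses `find()` rather than `matches()`, so no surrounding `.*` is needed.

```java
import java.util.regex.Pattern;

final class PunctuationPattern {
    private PunctuationPattern() {}

    // Hypothetical helper: builds a character class from the punctuation string,
    // backslash-escaping every character that is not a letter or digit.
    // Assumes a non-empty string; an empty class "[]" would fail to compile.
    static Pattern compileClass(String punctuation) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < punctuation.length(); i++) {
            char c = punctuation.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                sb.append('\\'); // escaping a non-alphanumeric character is always allowed
            }
            sb.append(c);
        }
        return Pattern.compile(sb.append(']').toString());
    }

    // Reuses the compiled pattern; find() stops at the first match.
    static boolean containsAny(Pattern punctuationClass, String input) {
        return punctuationClass.matcher(input).find();
    }
}
```

A caller would compile the pattern once, for example `Pattern p = PunctuationPattern.compileClass(",.;]");`, and then call `PunctuationPattern.containsAny(p, input)` for each input string.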

Upvotes: 0
