Amarendra Reddy
Amarendra Reddy

Reputation: 243

String Matching for patterns

I have 2 strings of pattern a.{var1}.{var2} and b.{var1}.{var2}.

Two Strings are matching if var1 in first string is same as var1 in second string, as well as var2 in first string is same as var2 in second string.

The variables can be any order like a.{var1}.{var2} and b.{var2}.{var1}.

How do I match the two strings efficiently?

Example 1:

String pattern1 = "1.{var1}";
String pattern2 = "2.{var1}";

//Match True = (1.111,2.111)
//Match False = (1.121,2.111)

Example 2:

String pattern1 = "1.{var1}.{var2}";
String pattern2 = "2.{var1}.{var2}";

//Match True = (1.11.22,2.11.22)
//Match False = (1.11.22,2.111.22)

Example 3:

String pattern1 = "1.{var1}.{var2}";
String pattern2 = "2.{var2}.{var1}";

//Match True = (1.22.11,2.11.22)
//Match False = (1.11.22,2.111.22)

So whats the best way to match these 2 strings?

I want to match these 2 strings to find if they are related with the pattern mentioned.
Extending this problem to a set of strings i.e Set A strings has to be matched with strings in Set B. Finally pairs of strings have to be formed which satisfy this matching algorithm. The pattern will remain the same when matching for all strings in Set A to Set B.

Upvotes: 2

Views: 314

Answers (6)

Anonymous
Anonymous

Reputation: 86276

Use String.split() and then String.equals() on the resulting array elements, handling your three cases separately.

After splitting, first check that both the resulting arrays have the same length (if not they don’t match). Also use String.equals() for checking that the first element is "1" and "2" if this is required. Then branch on whether the length is 2 or 3. If length is 2, check that the match is as in your example 1; again use String.equals() on the array elements. If length is 3, you need to check both orders of the variable parts in accordance with your two examples 2 and 3.

Remember that the argument to String.split() is a regular expression, and that the dot has a special meaning in regular expressions. So you need to use .split("\\."), not .split(".").

It should run pretty fast too. However, don’t start optimizing until you know you need better performance. Readability is king.

Edit: I present my own solution:

public static boolean match(String s1, String s2) {
    String[] a1 = s1.split("\\.", 4);
    String[] a2 = s2.split("\\.", 4);
    if (a1.length != a2.length) {
        return false;
    }
    if (a1[0].equals("1") && a2[0].equals("2")) {
        if (a1.length == 2) {
            return a1[1].equals(a2[1]);
        } else if (a1.length == 3) {
            return (a1[1].equals(a2[1]) && a1[2].equals(a2[2]))
                    || (a1[1].equals(a2[2]) && a1[2].equals(a2[1]));
        }
    }
    return false;
}

Trying it with the 6 examples from the question:

System.out.println("(1.111,2.111)      " + match("1.111", "2.111"));
System.out.println("(1.121,2.111)      " + match("1.121", "2.111"));
System.out.println("(1.11.22,2.11.22)  " + match("1.11.22", "2.11.22"));
System.out.println("(1.11.22,2.111.22) " + match("1.11.22", "2.111.22"));
System.out.println("(1.22.11,2.11.22)  " + match("1.22.11", "2.11.22"));
System.out.println("(1.11.22,2.111.22) " + match("1.11.22", "2.111.22"));

This prints:

(1.111,2.111)      true
(1.121,2.111)      false
(1.11.22,2.11.22)  true
(1.11.22,2.111.22) false
(1.22.11,2.11.22)  true
(1.11.22,2.111.22) false

Upvotes: 0

FAIZAN
FAIZAN

Reputation: 333

This might not be the most efficient way of doing this, but it does give you the expected output.

01/05: Code updated after an error pointed out by Ole in the comments::

private boolean compareStr(String a, String b) {
    ArrayList<String> aList = new 
    ArrayList<String>(Arrays.asList(a.split("\\.")));
    ArrayList<String> bList = new ArrayList<String>(Arrays.asList(b.split("\\.")));
    bList.remove(0);
    aList.remove(0);

    if(aList.size() != bList.size())
            return false;

    boolean aMatchFlag = false;
    for(int i=0; i< aList.size(); i++){
        if (!bList.contains(aList.get(i))) {
            return false;
        }
    }
    aMatchFlag = true;
    System.out.println("All elements of A are present in B");
    boolean bMatchFlag = false;
    for(int i=0; i< bList.size(); i++){
        if (!aList.contains(bList.get(i))) {
            return false;
        }
    }
    bMatchFlag = true;
    System.out.println("All elements of B are present in A");

    if(aMatchFlag && bMatchFlag)
            return true;
    else
            return false;
}

For those also looking for the performance of the code

Input:1.11.11, 2.11.11.11
Compilation time: 1.45 sec, absolute running time: 0.24 sec, cpu time: 0.26 sec, memory peak: 18 Mb, absolute service time: 1,7 sec

Input:1.11.11, 2.11.22
Compilation time: 1.25 sec, absolute running time: 0.24 sec, cpu time: 0.23 sec, memory peak: 18 Mb, absolute service time: 1,49 sec

Input:1.11.2, 2.11.22
Compilation time: 1.34 sec, absolute running time: 0.24 sec, cpu time: 0.24 sec, memory peak: 18 Mb, absolute service time: 1,58 sec


Input:1.11.2, 2.11.111
Compilation time: 1.65 sec, absolute running time: 0.28 sec, cpu time: 0.32 sec, memory peak: 18 Mb, absolute service time: 1,94 sec

Upvotes: 2

jferard
jferard

Reputation: 8180

This can be done as follows:

  • While we check if the first string and the first pattern match, we extract a map of the values in the string corresponding to the placeholders (var1, var2, ...) in the pattern;
  • While we check if the second string and the second pattern match, we also check the second string against the values of the placeholders.

This is interesting, because the map placeholder - > value is computed once for a couple (first string, first pattern), and can be used to check every couple (second string, second pattern).

Translation in the code: create an object of type PatternMatcher from (first string, first pattern). This object will contain a map valueByPlaceHolder used to check other couples.

Here are the relevant parts of the code.

Check if string and pattern match + creation of the map:

private static Optional<Map<String, String>> extractValueByPlaceHolder(
        String[] sChunks, String[] patternChunks) {
    // string and pattern should have the same length
    if (sChunks.length != patternChunks.length)
        return Optional.empty();

    Map<String, String> valueByPlaceHolder = new HashMap<>(sChunks.length);
    for (int i = 0; i < patternChunks.length; i++) {
        String patternChunk = patternChunks[i];
        String sChunk = sChunks[i];
        if (isAPlaceHolder(patternChunk)) { // first char = {, last char = }
            valueByPlaceHolder.put(patternChunk, sChunk); // just get the value
        } else if (!patternChunk.equals(sChunk)) {
            // if it's not a placeholder, the chunks should be the same in the string
            // and the pattern
            return Optional.empty(); 
        }
    }
    return Optional.of(valueByPlaceHolder);
}

Check if other string and otherpattern match + comparison with first (string, pattern) couple:

public boolean check(String[] otherChunks, String[] otherPatternChunks) {
    // other string and other pattern should have the same length, other string and string too
    if (otherChunks.length != this.chunks_length || otherChunks.length != otherPatternChunks.length)
        return false;

    for (int i = 0; i < otherChunks.length; i++) {
        String otherPatternChunk = otherPatternChunks[i];
        String otherChunk = otherChunks[i];
        // get the value from the first string if a it's placeholder, else keep the pattern chunk 
        String expectedChunk = this.valueByPlaceHolder
                .getOrDefault(otherPatternChunk, otherPatternChunk);

        // the chunk is neither equal to the value of the placeholder, nor to the chunk of the pattern 
        if (!expectedChunk.equals(otherChunk))
                return false;
    }
    return true;
}

Upvotes: 0

Amit
Amit

Reputation: 656

You can use following String class methods:

boolean regionMatches(int toffset, String other, int ooffset, int len)

Tests whether the specified region of this string matches the specified region of the String argument. Region is of length len and begins at the index toffset for this string and ooffset for the other string.

For ignoring case:

boolean regionMatches(boolean ignoreCase, int toffset, String other, int ooffset, int len)

More information : https://docs.oracle.com/javase/tutorial/java/data/comparestrings.html

Or try to create a Regex pattern dynamically from one string and compare with other ...though not an efficient approach

Upvotes: 1

Andrea Ligios
Andrea Ligios

Reputation: 50203

Remove the patterns from the String, extract the vars from the String by splitting around the dot (assuming your vars has no dots inside), put them in a Set (Sets don't retain the order and hence automatically solve the problem you have with ignoring the position), Check the equality of the Sets.

Running demo: https://ideone.com/5MwOHC

Example code:

final static String pattern1head = "blablabla.";
final static String pattern2head = "yada yada.";

private static Set<String> extractVars(String v){
    if      (v.startsWith(pattern1head)) { v = v.replace(pattern1head,""); }
    else if (v.startsWith(pattern2head)) { v = v.replace(pattern2head,""); }
    else                                 { return null; }           

    return new HashSet<String>(Arrays.asList(v.split("\\.")));
}

private static void checkEquality(String v1, String v2) {
    System.out.println("\n"+v1+" == "+v2+" ? " + extractVars(v1).equals(extractVars(v2)));  
} 


public static void main (String[] args) throws java.lang.Exception {
    String v1 = "blablabla.123.456";
    String v2 = "yada yada.123.456";
    String v3 = "yada yada.456.123";
    String v4 = "yada yada.123.456789";

    checkEquality(v1,v2);
    checkEquality(v1,v3);
    checkEquality(v1,v4);
    checkEquality(v2,v3);
    checkEquality(v2,v4);
}

Output:

blablabla.123.456 == yada yada.123.456 ? true

blablabla.123.456 == yada yada.456.123 ? true

blablabla.123.456 == yada yada.123.456789 ? false

yada yada.123.456 == yada yada.456.123 ? true

yada yada.123.456 == yada yada.123.456789 ? false

Upvotes: 0

Min
Min

Reputation: 352

I suppose following:

string[] arr1 = pattern1.split
string[] arr2 = pattern2.split
int hash1 = arr1[0].hashCode() + arr1[1].hashCode();
int hash2 = arr2[0].hashCode() + arr2[1].hashCode();
if(hash1 = hash2)=> pattern1 == pattern2

Upvotes: 0

Related Questions