Reputation: 5244
I recently was asked to design an algorithm that checks if two strings are anagrams of one another. My goal was to minimize space and time complexity, so I came up with this algorithm:
However, the time complexity of this algorithm is O(n) and I cannot come up with an algorithm with lower complexity. Does anybody know of one?
Upvotes: 9
Views: 11384
Reputation: 2122
Let's take a question: Given two strings s and t, write a function to determine if t is an anagram of s.
For example, s = "anagram", t = "nagaram", return true. s = "rat", t = "car", return false.
Method 1(Using HashMap ):
public class Method1 {
public static void main(String[] args) {
String a = "protijayi";
String b = "jayiproti";
System.out.println(isAnagram(a, b ));// output => true
}
private static boolean isAnagram(String a, String b) {
Map<Character ,Integer> map = new HashMap<>();
for( char c : a.toCharArray()) {
map.put(c, map.getOrDefault(c, 0 ) + 1 );
}
for(char c : b.toCharArray()) {
int count = map.getOrDefault(c, 0);
if(count == 0 ) {return false ; }
else {map.put(c, count - 1 ) ; }
}
return true;
}
}
Method 2 :
public class Method2 {
public static void main(String[] args) {
String a = "protijayi";
String b = "jayiproti";
System.out.println(isAnagram(a, b));// output=> true
}
private static boolean isAnagram(String a, String b) {
int[] alphabet = new int[26];
for(int i = 0 ; i < a.length() ;i++) {
alphabet[a.charAt(i) - 'a']++ ;
}
for (int i = 0; i < b.length(); i++) {
alphabet[b.charAt(i) - 'a']-- ;
}
for( int w : alphabet ) {
if(w != 0 ) {return false;}
}
return true;
}
}
Method 3 :
public class Method3 {
public static void main(String[] args) {
String a = "protijayi";
String b = "jayiproti";
System.out.println(isAnagram(a, b ));// output => true
}
private static boolean isAnagram(String a, String b) {
char[] ca = a.toCharArray() ;
char[] cb = b.toCharArray();
Arrays.sort( ca );
Arrays.sort( cb );
return Arrays.equals(ca , cb );
}
}
Method 4 :
public class Method4 {
public static void main(String[] args) {
String a = "protijayi";
String b = "jayiproti";
//String c = "gini";
System.out.println(isAnagram(a, b ));// output => true
}
private static boolean isAnagram(String a, String b) {
Map<Integer, Integer> map = new HashMap<>();
a.codePoints().forEach(code -> map.put(code, map.getOrDefault(code, 0) + 1));
b.codePoints().forEach(code -> map.put(code, map.getOrDefault(code, 0) - 1));
//System.out.println(map.values());
for(int count : map.values()) {
if (count<0) return false;
}
return true;
}
}
Upvotes: 2
Reputation: 23
int anagram (char a[], char b[]) {
char chars[26];
int ana = 0;
int i =0;
for (i=0; i<26;i++)
chars[i] = 0;
if (strlen(a) != strlen(b))
return -1;
i = 0;
while ((a[i] != '\0') || (b[i] != '\0')) {
chars[a[i] - 'a']++;
chars[b[i] - 'a']--;
i++;
}
for (i=0; i<26;i++)
ana += chars[i];
return ana;
}
void main() {
char *a = "chimmy\0";
char *b = "yimmch\0";
printf ("Anagram result is %d.\n", anagram(a,b));
}
Upvotes: -2
Reputation: 35540
You could possibly improve average performance with early exits. While scanning the 2nd string, if count[char] is 0 before you decrement, you don't have an anagram and you can stop scanning.
Also, if the strings are shorter than 26 chars, then in the last step, check only the chars in the first string for zeroes.
This doesn't change the big O, but it can change your average runtime to something less than the 2N+26 o the proposed solution, depending on your data.
Upvotes: 4
Reputation: 372814
Your algorithm is asymptotically optimal. It's not possible to solve this problem in any better than Ω(n) time. To see this, suppose that an algorithm A exists that can solve the problem in o(n) time (note that this is little-o of n here). Then for any 1 > ε > 0, there is some n such that for any input of size at least n, the algorithm must terminate in at most εn steps. Set ε = 1/3 and consider any inputs to the algorithm that are of length at least n for the aforementioned n for this ε. Since the algorithm can look at most 1/3 of the characters in the two strings, then there must be two different inputs to the function, one that is a pair of anagrams and one that isn't, such that the algorithm looks at the same subset of the characters of each input. The function would then have to produce the same output in each case, and thus would be wrong on at least one of the inputs. We've reached a contradiction, so no such algorithm must exist.
Upvotes: 18
Reputation: 2504
To be sure the strings are anagrams you need to compare the whole strings - so how could that be faster than o(n)?
Upvotes: 1