Reputation: 338
I was solving online problem and the task was something like this:
There are two arrays: numbers and prefixes.
- Array
numbers
contains numbers: “+432112345”, “+9990”, “+4450505”- Array
prefixes
contains prefixes: “+4321”, “+43211”, “+7700”, “+4452”, “+4”Find longest prefix for each number. If no prefix found for number, match with empty string.
For example:
- “+432112345” matches with the longest prefix “+43211” (not +4321, cause 43211 is longer).
- “+9990” doesn't match with anything, so empty string "".
- “+4450505” matches with “+4” (“+4452” doesn’t match because of the 2).
I came up with the most straight forward solution where I loop through each number with each prefix. So each time new number, I check prefixes, if some prefix is longer than last one, I will change.
Map<String, String> numsAndPrefixes = new HashMap<>();
for (String number : A) {
for (String prefix : B) {
if (number.contains(prefix)) {
// if map already contains this number, check for prefixes.
// if longer exists, switch longer one
if (numsAndPrefixes.containsKey(number)) {
int prefixLength = prefix.length();
int currentLen = numsAndPrefixes.get(number).length();
if (prefixLength > currentLen) {
numsAndPrefixes.put(number, prefix);
}
} else {
numsAndPrefixes.put(number, prefix);
}
} else if (!number.contains(prefix) && !numsAndPrefixes.containsKey(number)){
numsAndPrefixes.put(number, "");
}
}
}
So it will have two for loops. I see that each time I am doing the same job over and over, e.g checking for prefixes. It works, but it is slow. The problem is that I can’t come up with anything better.
Could someone explain how they would approach to find better algorithm?
And more general, how do you proceed if you have somewhat working solution and trying to find better one? What knowledge am I still missing?
Upvotes: 2
Views: 1515
Reputation: 2316
You are using .contains()
. You should use .startsWith()
. It's a lot faster.
Then in your else if
you are checking what you already checked in the if
.
This is only one approach on how to improve the algorithm:
Sort the prefixes:
+43211 +4321 +4452 +4 +7700
What is good about this? Well, it will always find the longest prefix first. You can exit the loop and don't have to look for longer prefixes.
Arrays.sort(prefixes, new Comparator<String>() {
@Override
public int compare(String o1, String o2) {
return o1.startsWith(o2) ? 1 : o1.compareTo(o2);
}
});
Map<String, String> numsAndPrefixes = new HashMap<>();
for (String number: numbers) {
numsAndPrefixes.put(number, "");
for (String prefix: prefixes) {
if (number.startsWith(prefix, 1)) {
numsAndPrefixes.put(number, prefix);
break;
}
}
}
But if your number starts with +1 and there is no prefix it will continue checking all the prefixes with +2 +3 +4
... which are obviously not matching. (Issue 1)
Also if your number starts with +9
the prefix will be found very late. (Issue 2)
How to fix this? Well you can save the indices where +1 starts, +2 starts, ...:
In our prefix list:
0 1 2 3 4 5 (index)
+1233 +123 +2233 +2 +3 +4
+2 starts at index [2] and +3 starts at index [4]. So when you want to know the prefix for a number starting with +2 you only have to check elements [2] and [3]. This will both fix issue 1 and 2.
It would also be possible to store the indices for more digits (for example where +13 starts).
Upvotes: 0
Reputation: 3890
The data structure you need is trie.
S
in numbers
:
S
:
S
This algorithm works in O(length(prefixes) + length(numbers))
Upvotes: 2
Reputation: 159086
I would implement this using a TreeSet
and the floor(E e)
method.
String[] numbers = { "+432112345", "+9990", "+4450505" };
String[] prefixes = { "+4321", "+43211", "+7700", "+4452", "+4" };
TreeSet<String> prefixSet = new TreeSet<>(Arrays.asList(prefixes));
for (String number : numbers) {
String prefix = prefixSet.floor(number);
while (prefix != null && ! number.startsWith(prefix))
prefix = prefixSet.floor(prefix.substring(0, prefix.length() - 1));
if (prefix == null)
prefix = "";
System.out.println(number + " -> " + prefix);
}
Output
+432112345 -> +43211
+9990 ->
+4450505 -> +4
Upvotes: 3