Reputation: 365
I have a set of Elements which are prefixes. I've to write a Java method in such a way that whenever it gets an input verify in which prefix it belongs to and return true or false if it matches with any prefixes. Elements are
1900
1901
1902
1903
17082
Input will be
1900123445
1901334455
1800777777
...
Which Data Structure I can use so that it'll not hit the performance. Because the input numbers will be up to 50million at once.
Can anyone help me in this? Thanks in advance.
Upvotes: 1
Views: 262
Reputation: 16499
If possible, test each of these methods and see which is the fastest in context. It will probably depend on a variety of factors, and I can't really predict which one will work best in your specific use case.
The easiest option: store your prefixes in a linked list and check against each one using input.startsWith(prefix)
. Boring.
Let k
be the minimum prefix length. Use a HashMap
where the keys are the first k
digits of each prefix and the items are linked lists containing the rest of each prefix.
For instance, say you have prefixes abcd
, abce
, and xyz
. Then you would store the following:
"abc"
-->("d","e")
, where ("d","e")
is a linked list containing the elements "d"
and "e"
"xyz"
-->("")
(where ""
is the empty string).Call this map prefixes
, and use the following code to determine if a prefix is correct:
public boolean correctPrefix(String input){
LinkedList check = prefix.get(input.substring(0,k))
if (check != null){
for (String n : check){
if (input.substring(k).startsWith(check)) return true;
}
return false;
}
I don't know if it will be fast enough for your purposes, though you haven't told us exactly what those are; still, I don't know of anything faster in Java.
Use a switch
statement. Or rather, use multiple switch statements, one for each possible prefix length. For instance, suppose you have the prefixes 1901
, 1902
, and 20050
:
public boolean correctPrefix(String input){
int pVal;
pVal = Integer.parseInt(input.substring(0,4));
switch (pval){
case 1901: return true;
case 1902: return true;
}
pVal = Integer.parseInt(input.substring(0,5));
switch (pval){
case 20050: return true;
}
return false;
}
This will be a lot more code, but I suspect it will be much faster assuming you have enough prefixes with the same length. Note that if a switch
statement doesn't have very many possible cases, it won't actually be compiled as a true switch
statement but as a series of if/else
blocks, which will cause it to be fairly slow. You should do some mucking about with this, though, and see what you get; it might be worthwhile to throw in some bogus case [wrongprefix]: return false;
statements in, because believe it or not they can actually speed things up.
Actually, as of SE7, switch statements can be used with strings. I'm not sure how efficient this is, but it's an option.
Alternatively, if you're using something prior to SE7, you could try....
You can actually pass a radix to parseInt
, meaning that if you have letters in your prefixes but they're case-insensitive, you can use Integer.parseInt(input.substring(0,4),36)
to get a valid integer value, which you can then use with switch
.
Upvotes: 2
Reputation: 1677
Create a HashSet
and add these 4 values in it. and then parse your input values to get first four integers and call HashSet contains()
method, that will return true/false.
Upvotes: 1
Reputation: 29213
If you only have (relatively) small integers, you could use a sparse array of booleans, where your prefixes will act as indexes.
Alternatively, a HashSet
would be a good choice.
Upvotes: 1
Reputation: 876
Are the prefixes the same length?
If so, you could hash all the prefixes.
When you encounter a new input you would extract the first x digits.
You would look for the presence of the first x digits in the hash - an O(1) operation if the hash is designed well.
Upvotes: 1
Reputation: 23409
Your best bet would be to store the prefix in a HashMap and the value should be a linked list. I am assuming the same prefix can get tagged with various numbers like 19011234 190123444.
Upvotes: 1