Reputation: 10866
My requirement is to be able to match two strings that are similar but not an exact match. For example, given the following strings
The output should be FirstName, FName and Last Name, LName as they are a logical match. Are there any libraries that I could use to do this? I am using JAVA to achieve this functionality.
Thanks Raam
Upvotes: 1
Views: 6512
Reputation: 14580
You could use Apache Commons StringUtils...
But it's worth noting that this may not be the best algorithm for the specific use-case in the question - I recommend reading some of the other answers here for more ideas.
Upvotes: 4
Reputation: 1467
StringUtils is simply best for this - this is one of the examples i found on stackOverflow - as @CupawnTae said already
Below is the one of the simple example i came across
public static Object getTheClosestMatch(Collection<?> collection, Object target) {
int distance = Integer.MAX_VALUE;
Object closest = null;
for (Object compareObject : collection) {
int currentDistance = StringUtils.getLevenshteinDistance(compareObject.toString(), target.toString());
if(currentDistance < distance) {
distance = currentDistance;
closest = compareObject;
}
}
return closest;
}
Upvotes: 2
Reputation: 4661
According to the example you gave, you should use a modified Levenshtein distance where the penalty for adding spaces is small and the penalty for mismatched characters is larger. This will handle matching abbreviations to the strings that were abbreviated quite well. However that's assuming that you're mainly dealing with aligning abbreviations to corresponding longer versions of the strings. You should elaborate more exactly what kind of matchings you want to perform (e.g. more examples, or some kind of high-level description) if you want a more detailed and pointed answer about what methods you can/should use.
Upvotes: 2
Reputation: 410
The spell check algorithms use a variant of this algorithm. http://en.wikipedia.org/wiki/Levenshtein_distance. I implemented it in class for a project and it was fairly simple to do so. If you don't want to implement it yourself you can use the name to search for other libraries.
Upvotes: 1
Reputation: 55
An answer to a really similar question to yours can be found here.
Also, wikipedia has an article on Approximate String Matching that can be found here. If the first link isn't what you're looking for, I would suggest reading the wikipedia article and digging through the sources to find what you need.
Sorry I can't personally be of more help to you, but I really hope that these resources can help you find what you're looking for!
Upvotes: 1