Reputation: 7012
Using Swift 4, I would like to sort a string array by closest match to a given searchTerm. Importantly, if the searchTerm occurs in the array as an exact match, the returned array should list it first.
Example: given the array ["Hello world", "Hello Jamaica", "Hello", "Family", "Hel"] and the searchTerm "Hello", the algorithm should return ["Hello", "Hello world", "Hello Jamaica", "Hel", "Family"].
Approach 1: I tried FuzzyMatching, and it partly worked: it sorted the inputArray according to the searchTerm, with good ordering by substring matches and syntactic similarity, but it did not put exact matches first in the returned array.
Approach 2: I then tried my own algorithm (see the code below). But if several strings in the array start with my searchTerm (i.e. have the searchTerm as a prefix), my algorithm does not do a good job.
static func bestMatchFilterdStringArray(inputArray: [String], searchTerm: String) -> [String] {
    let matchingTerms = inputArray
        .filter { $0.range(of: searchTerm, options: .caseInsensitive) != nil }
        .sorted { ($0.hasPrefix(searchTerm) ? 0 : 1) < ($1.hasPrefix(searchTerm) ? 0 : 1) }
    return matchingTerms
}
How is closest-match sorting of a string array done in Swift 4, especially so that exact matches come first in the returned array? Any help appreciated!
Upvotes: 7
Views: 2410
Reputation: 171
You can use a Levenshtein distance score to compare your search term with every string in the array; the string with the highest score comes first in the result, and so on. The result is the array of strings sorted in descending order of score.
The following String extension computes such a score. It divides the edit distance by the length of the longer string and subtracts the result from 1, so the higher the value, the closer the match.
import Foundation  // needed for trimmingCharacters(in:)

extension String {
    /// Returns a similarity score between 0 and 1 derived from the Levenshtein distance.
    func levenshteinDistanceScore(to string: String, ignoreCase: Bool = true, trimWhiteSpacesAndNewLines: Bool = true) -> Double {
        var firstString = self
        var secondString = string
        if ignoreCase {
            firstString = firstString.lowercased()
            secondString = secondString.lowercased()
        }
        if trimWhiteSpacesAndNewLines {
            firstString = firstString.trimmingCharacters(in: .whitespacesAndNewlines)
            secondString = secondString.trimmingCharacters(in: .whitespacesAndNewlines)
        }
        // Two-row dynamic programming: `last` is the previous row, `cur` the current one
        let empty = [Int](repeating: 0, count: secondString.count)
        var last = [Int](0...secondString.count)
        for (i, tLett) in firstString.enumerated() {
            var cur = [i + 1] + empty
            for (j, sLett) in secondString.enumerated() {
                cur[j + 1] = tLett == sLett ? last[j] : Swift.min(last[j], last[j + 1], cur[j]) + 1
            }
            last = cur
        }
        // Length of the longer string, used to normalize the distance.
        // Qualified with `Swift.` so the global function is used, not String's own max() member.
        let lowestScore = Swift.max(firstString.count, secondString.count)
        // Two empty strings are identical; this also avoids dividing by zero
        guard lowestScore > 0 else { return 1.0 }
        if let validDistance = last.last {
            return 1 - (Double(validDistance) / Double(lowestScore))
        }
        return 0.0
    }
}
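Note that sorting by this score alone would not give the order asked for in the question: "Hel" scores 0.6 against "Hello", while "Hello world" scores only about 0.45, so "Hel" would come second. One possible way around this, as a rough sketch rather than a definitive solution, is to rank by tier first (exact match, then prefix match, then everything else) and use the score only to order strings within a tier. The names editDistance, similarity and rankedByClosestMatch are made up for this illustration, and the block carries its own small Levenshtein function so that it runs on its own.

```swift
// Plain Levenshtein edit distance, computed with two rows of the DP table.
func editDistance(_ a: String, _ b: String) -> Int {
    let lhs = Array(a)
    let rhs = Array(b)
    var previous = Array(0...rhs.count)
    for (i, lc) in lhs.enumerated() {
        var current = [i + 1] + [Int](repeating: 0, count: rhs.count)
        for (j, rc) in rhs.enumerated() {
            current[j + 1] = lc == rc
                ? previous[j]
                : min(previous[j], previous[j + 1], current[j]) + 1
        }
        previous = current
    }
    return previous[rhs.count]
}

// Case-insensitive similarity in 0...1 (1.0 means identical).
func similarity(_ a: String, _ b: String) -> Double {
    let lhs = a.lowercased()
    let rhs = b.lowercased()
    let longest = max(lhs.count, rhs.count)
    guard longest > 0 else { return 1.0 }  // both strings empty
    return 1.0 - Double(editDistance(lhs, rhs)) / Double(longest)
}

// Hypothetical helper: exact matches first, then prefix matches,
// then the rest; ties within a tier are broken by similarity.
func rankedByClosestMatch(_ candidates: [String], to searchTerm: String) -> [String] {
    let term = searchTerm.lowercased()
    func tierOf(_ s: String) -> Int {
        let lower = s.lowercased()
        if lower == term { return 0 }          // exact match
        if lower.hasPrefix(term) { return 1 }  // starts with the search term
        return 2                               // everything else
    }
    let keyed = candidates.map {
        (value: $0, tier: tierOf($0), score: similarity($0, searchTerm))
    }
    return keyed
        .sorted { $0.tier != $1.tier ? $0.tier < $1.tier : $0.score > $1.score }
        .map { $0.value }
}

let names = ["Hello world", "Hello Jamaica", "Hello", "Family", "Hel"]
print(rankedByClosestMatch(names, to: "Hello"))
// ["Hello", "Hello world", "Hello Jamaica", "Hel", "Family"]
```

Unlike the filter in the question, this keeps non-matching strings such as "Family" and simply ranks them last. Whether a prefix match should always outrank a short near-match like "Hel" is a design choice; swapping the tier rules changes that.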
Upvotes: 10