Peter71
Peter71

Reputation: 2300

SWIFT: Quick cumulation of a lot of strings

I'm saving the begining of a lot of words:

var n = 2
var a = [String]()
for w in words {
     let part = prefix(w,n)
     if !contains(a, part) {
          a.append(part)
     }
}

For some milion words, this takes a lot of time. Is there any quick build-in solution of String/Array/Set ... to do this faster?

EDIT: Thanks to your hints, I could combine these ideas to a much better performance:

let words2 = words.map{prefix($0, n)}
var a = [String]()
var last = ""
for w in words2 {
    if w != last {
        a.append(w)
    }
    last = w
}

First it creates a 2-Character-Array. And in the loop I can simply compare the item to the previous one without any further string operations! Seems to be great. Any more advanced swift improvements?

Upvotes: 1

Views: 58

Answers (1)

Luca Angeletti
Luca Angeletti

Reputation: 59506

If you have no guarantee that the words array is sorted, I would implement the following solution:

let prefixesSet = Set(words.map { prefix($0, length) })

Analysis

Given n: words.count

The invocation of map requires time O(n) I don't know how the initializer of Set is implemented. I can imagine a couple of scenarios:

  1. The initializer does sort the input array in time O(n log n) and then add each element to the Set discarding duplicates O(n) (that after the sort are contiguous). Total time of this scenario: O(n log n)
  2. Each word is added to the Set and indexed using an hash map where the key is the hashValue of the word itself. Here the efficiency is inversely proportional to the collisions. With collision I mean 2 distinct words that happen to have the same hashValue. However I except Cocoa to have a pretty good implementation of hashValue for strings so I think in this case the time could be close to O(n).

Conclusion

As I previously said, this apply if you have no guarantee that words is sorted. On the other hand, if words is sorted I think your second block of code represents the best approach. It only requires time O(n).

Upvotes: 1

Related Questions