user3234550

Reputation: 21

How should I refactor this to get better performance?

I am using the function below to get scores for keys that are present in a hash-map. The hash-map has around 80,000 entries, and the keys are all vectors of strings. Each call to the score function passes about 22 such vectors as keys, and the function is called 64,000 times in a single iteration of the calling function. This takes around 1282.044314 msecs per iteration.

This is painfully slow, since the calling algorithm (a global linear model used for tagging) needs to run millions of iterations. How can I fix this bottleneck?

(defn score [map keys]
  (loop [acc 0
         keys (seq keys)]
    (if keys
      (recur (+ (map (first keys) 0) acc)
             (next keys))
      acc)))


(defn yy1
  "Where 'model' is a hash-map keyed by vectors
     e.g. ['BIGRAM:POS-1:WORD:POS' 'DT' 'Man' 'NNP'],
   'sent' is a vector of strings e.g. ['The' 'man' 'came'],
   'i' is an integer,
   't' is a string e.g. 'NNP',
   't1' is also a string e.g. 'VBD',
   'null' is the string 'NULL',
   and the 'f's, e.g. 'f9', are strings like 'BIGRAM:POS-2:WORD:POS'."
  [model sent i t t1]
  (let [word-1   (get sent (- i 1) null)
        word     (sent i)
        word+1   (get sent (+ i 1) null)
        word+2   (get sent (+ i 2) null)
        word-2   (get sent (- i 2) null)
        features [[f9  t1     word   t]
                  [f10 t1     t]
                  [f11 word-1 t1     word]
                  [f12 word-1 t1     word   t]
                  [f13 word-2 t1     word   t]
                  [f14 word-2 t1     t]
                  [f15 word   t]
                  [f16 word-1 word   t]
                  [f17 word-1 t]
                  [f18 word-2 word]
                  [f19 t      word+1]
                  [f20 word   t      word+1]
                  [f21 word-2 word-1 t]
                  [f22 word-2 word-1 word   t]
                  [f23 word   t      word+1 word+2]]]
    (score model features)))


(defn yy1y2 [model sent i t t1 t2]
  (let [word-1   (get sent (- i 1) null)
        word-2   (get sent (- i 2) null)
        word     (sent i)
        features [[f1 t2 word   t]
                  [f2 t2 t]
                  [f3 t2 t1     t]
                  [f4 t2 t1     word   t]
                  [f5 t2 t1     word-1 word t]
                  [f6 t2 word-2 t1     word-1 word t]
                  [f7 t2 word-1 word   t]
                  [f8 t2 word-1 t]]]
    (score model features)))

(defn viterbi
  "'model' is a hash-map e.g. {['UNIGRAM:WORD:POS' 'John' 'NNP'] 1},
   'tags' is a hash-map with keys :U :V :T e.g. {:U ['NNP' 'NN']},
   'sent' is a vector of strings e.g. ['John' 'is' 'tall']"
  [model tags sent]
  (let [pi (atom {[-1 * *] 1})
        {:keys [U V T]} tags
        N (range (count sent))]
    (doall
      (for [k N, u U, v V]
        (let [const     (yy1 model sent k v u)
              g         #(yy1y2 model sent k v u %)
              k-1       (- k 1)
              [score t] (apply max-key first
                               (map (fn [t] [(+ (@pi [k-1 t u] ep) const (g t)) t])
                                    T))]
          (swap! pi assoc (with-meta [k u v] {:t t}) score))))
    @pi))

Upvotes: 0

Views: 106

Answers (2)

tanzoniteblack

Reputation: 671

The real problem here isn't that your addition code is slow; it's that using a vector as a key in a hash-map is slow.
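The crit alias in the benchmarks below is presumably criterium's criterium.core namespace, loaded at the REPL with something like:

(require '[criterium.core :as crit])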

test.core> (let [test-k ["a" "b"]
                 test-m {test-k 5}]
             (crit/quick-bench (= (test-m test-k) 5)))
WARNING: Final GC required 47.01152815855813 % of runtime
Evaluation count : 4410912 in 6 samples of 735152 calls.
             Execution time mean : 138.320973 ns
    Execution time std-deviation : 4.615398 ns
   Execution time lower quantile : 132.946422 ns ( 2.5%)
   Execution time upper quantile : 144.346280 ns (97.5%)
                   Overhead used : 1.781294 ns
nil
test.core> (let [test-k "ab"
                 test-m {test-k 5}]
             (crit/quick-bench (= (test-m test-k) 5)))
WARNING: Final GC required 47.87422755471501 % of runtime
Evaluation count : 35007324 in 6 samples of 5834554 calls.
             Execution time mean : 15.469454 ns
    Execution time std-deviation : 0.187772 ns
   Execution time lower quantile : 15.237863 ns ( 2.5%)
   Execution time upper quantile : 15.748780 ns (97.5%)
                   Overhead used : 1.781294 ns

Found 2 outliers in 6 samples (33.3333 %)
    low-severe   1 (16.6667 %)
    low-mild     1 (16.6667 %)
 Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
nil

You can see from the above that using a string as a key to a hash-map is approximately an order of magnitude faster than using a vector. If you can refactor your code to use something with a faster hashing/equality function instead of a vector as the key, you'll see a similar boost in performance.

Unfortunately, without knowing how you're creating your data, I can't give a good idea of what might be a more performant key that makes sense with the data structures involved.
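Purely as an illustration, and assuming the feature parts never contain the chosen delimiter, one hypothetical direction is to join each feature vector into a single delimited string and build the model once up front with those same string keys; the helper names below are made up:

(require '[clojure.string :as str])

;; Hypothetical: collapse one feature vector into a single string key.
;; The "|" delimiter is an assumption; it must not occur in the feature parts.
(defn feature-key [feature]
  (str/join "|" feature))

;; Assumes the model was rebuilt with the same joined keys, e.g.
;; {"BIGRAM:POS-1:WORD:POS|DT|Man|NNP" 1, ...}
(defn score-str [string-model features]
  (reduce (fn [acc feature]
            (+ acc (string-model (feature-key feature) 0)))
          0
          features))

Whether this wins overall depends on the cost of building the joined string on every lookup, so it is worth benchmarking against the original before committing to it.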

Upvotes: 1

noisesmith

Reputation: 20194

I was able to shave off a few cycles with type hints.

link to the input I used: input

(defn new-score [map keys]
  (loop [acc 0 keys (seq keys)]
    (if keys
      (recur
        ^long (+ ^long (map (first keys) 0)
                 acc)
        (next keys))
      acc)))

criterium shows this result:

user=> (def used (take 100 (shuffle (keys input))))
#'user/used
user=> (score input used)
113821306187
user=> (new-score input used)
113821306187
user=> (crit/bench (score input used))
Evaluation count : 14114040 in 60 samples of 235234 calls.
             Execution time mean : 4.347602 µs
    Execution time std-deviation : 72.243110 ns
   Execution time lower quantile : 4.255827 µs ( 2.5%)
   Execution time upper quantile : 4.503442 µs (97.5%)
                   Overhead used : 1.841861 ns

Found 2 outliers in 60 samples (3.3333 %)
    low-severe   2 (3.3333 %)
 Variance from outliers : 6.2475 % Variance is slightly inflated by outliers
nil
user=> (crit/bench (new-score input used))
Evaluation count : 17347620 in 60 samples of 289127 calls.
             Execution time mean : 3.482177 µs
    Execution time std-deviation : 125.513670 ns
   Execution time lower quantile : 3.405872 µs ( 2.5%)
   Execution time upper quantile : 3.690737 µs (97.5%)
                   Overhead used : 1.841861 ns

Found 6 outliers in 60 samples (10.0000 %)
    low-severe   4 (6.6667 %)
    low-mild     2 (3.3333 %)
 Variance from outliers : 22.2420 % Variance is moderately inflated by outliers
nil
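
If you want the compiler to flag any arithmetic that still falls back to boxed Numbers while experimenting with hints like these, one option at the REPL (assuming Clojure 1.7 or later) is:

;; Warn wherever compiled arithmetic cannot use primitive (unboxed) math
(set! *unchecked-math* :warn-on-boxed)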

Upvotes: 1
