Reputation: 484
I am trying to make a function more efficient but I have made it worst and I could not understand why. Could someone see why and explain to me please?
Original function:
substringsSB s = substringsSB' Set.empty s
substringsSB' m s = substrings' m s
where
substrings' m s = {-# SCC "substrings'" #-}if (Set.member s m) then m else foldl' insertInits m (init . B.tails $ s)
insertInits m s = {-# SCC "insertInits" #-}if (Set.member s m) then m else foldl' doInsert m (tail . B.inits $ s)
doInsert m k = {-# SCC "doInsert" #-}Set.insert k m
profiling result:
total time = 3.14 secs (157 ticks @ 20 ms)
total alloc = 1,642,067,360 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
doInsert Main 95.5 92.1
insertInits Main 2.5 7.8
substringsSB' Main 1.9 0.0
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 1 0 0.0 0.0 100.0 100.0
main Main 280 1 0.0 0.0 100.0 100.0
substringsSB Main 281 1 0.0 0.0 100.0 100.0
substringsSB' Main 282 1 1.9 0.0 100.0 100.0
doInsert Main 285 1233232 95.5 92.1 95.5 92.1
insertInits Main 284 1570 2.5 7.8 2.5 7.8
substrings' Main 283 1 0.0 0.0 0.0 0.0
CAF GHC.IO.Handle.FD 211 3 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding.Iconv 169 2 0.0 0.0 0.0 0.0
CAF GHC.Conc.Signal 166 1 0.0 0.0 0.0 0.0
As far as I know, we cannot have early-exit in a foldfoldl
, so the function could be spending a lot of time just calling Set.member s m
and return m
in substrings'
. So, I converted the function to use recursion:
substringsSB s = substringsSB' Set.empty s
substringsSB' m str = substrings' m (init . B.tails $ str)
where
substrings' m [] = m
substrings' m (s:ss) | Set.member s m = m
| otherwise = {-# SCC "substrings'" #-}substrings' insertTail ss
where insertTail = insertInits m $ reverse $ (tail . B.inits $ s)
insertInits m [] = m
insertInits m (s:ss) | Set.member s m = m
| otherwise = {-# SCC "insertInits" #-}insertInits (doInsert s m) ss
doInsert k m = {-# SCC "doInsert" #-}Set.insert k m
profiling result:
total time = 5.16 secs (258 ticks @ 20 ms)
total alloc = 1,662,535,200 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
doInsert Main 54.7 90.5
substringsSB' Main 43.8 9.5
insertInits Main 1.6 0.0
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 1 0 0.0 0.0 100.0 100.0
main Main 280 1 0.0 0.0 100.0 100.0
substringsSB Main 281 1 0.0 0.0 100.0 100.0
substringsSB' Main 282 1 43.8 9.5 100.0 100.0
doInsert Main 285 1225600 54.7 90.5 54.7 90.5
insertInits Main 284 1225600 1.6 0.0 1.6 0.0
substrings' Main 283 1568 0.0 0.0 0.0 0.0
CAF GHC.IO.Handle.FD 211 3 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding.Iconv 169 2 0.0 0.0 0.0 0.0
CAF GHC.Conc.Signal 166 1 0.0 0.0 0.0 0.0
But this take more time than the original version.
Why it's spending so much time in substringsSB'
?
It's only doing init . B.tails $ str
which the original version also call...
Or have I made a mistake and these two functions are not logically equivalent?
main = do
s <- getLine
let m = substringsSB $ B.pack s
print $ Set.size m
return ()
with input:
asjasdfkjasdfjkasdjlflaasdfjklajsdflkjasvdadufhsaodifkljaiduhfjknhdfasjlkdfndbhfisjglkasnjjfgklsadmsjnhsjdflkmsnajjkdlsmfnjsdkfljasd;fjlkasdjfklasjdfnasdfjjnsadfjsadfhasjdfjlaksdfjlkasdfjljkasdflasidfjlaisjdflaisdjflaisjdfliasjdgfouqhagdfsia;klsjdfnklajsdfkhkasfhjdasdfhaskdflhjaklsdfh;kjlasdfh;jlaksdflkhajsdfkjahsdfkjhasdfkkasdfkjlkasfdkljasdfkhljkasdkflkjasdfasdlfkajsdlfkjaslkdfjjaksdjgujhgjhghjbjnbghjghhgfghfghvfgfgjhgjhdfjfjhgfjgvjhgvjhgvjhgvjhgvjhgvjhasdkfjkasdjfklajsdfklkahsdfjklhjklhghjhkhgfvcghjkjhghjkjhhvjkl/ljklkjlkjlkjlkjaslkdfjasd;lkfjas;dlfkjas;dflkjas;dflkjas;dflkjas;dflkja;slkdfja;sdlkjfa;sdlkfja;lsdfkjas;ldkfja;sdlkfja;skldfja;slkdjfa;slkdfja;sdklfjas;dlkfjas;dklfjas;dlkfjas;dfkljas;dflkjas;lkdfja;sldkfj;aslkdfja;sldkfja;slkdfj;alksdjf;alsdkfj;alsdkfja;sdflkja;sdflkja;sdlfkja;sdlfkja;sldkfja;sdlkfja;sldfkj;asldkfja;sldkfja;lsdkfja;sldfkja;sdlfjka;sdlfjkas;dlkfjas;ldkfjas;dlfkjasfd;lkjasd;fljkads;flkjasdf;lkjasdf;lkajsdf;lkajsdf;aksljdf;alksjdfa;slkdjfa;slkdjfa;slkdfja;sdflkjas;dflkjasd;flkjasd;flkjasdf;lkjasdf;ljkasdf;lkajdsf;laksjf;asldfkja;sdfljkads;flkjasd;fljkasdf;lkjasdf;ljkadfs;fljkadfs;ljkasdf;lajksdf;lkajsdf;lajsfd;laksdfgvjhgvjhgvjhcfjhgcjfgvjkgvjjgfjghfhgkhkjhbkjhbkjhbkybkkugtkydfktyufctkyckxckghfvkuygjkhbykutgtvkyckjhbliuhgktuyfkvuyjbjkjygvkuykjdjflaksdjflkajsdlkfjalskdjflkasjdflkjasdlkfjalksdjfklajsdflkjasdlkjfalksdjflkasjdflkjasdlfkjaslkdjflaksjdflkajsdlfkjasdlkfjalsdjflkasjdflkasjdflajsdfjsfuhaduvasdyhaweuisfnaysdfiuhasfdnhaksjdfahsdfiujknsadfhbaiuhdfjknahbdshfjksnashdfkjnsadfiukjfnhsdfkjnasdfikjansdfhnaksdjfaisdfkn
Upvotes: 3
Views: 193
Reputation: 183978
The sad truth is that Set.member
is expensive too.
In the first version, you check for each tail if it has been seen before and if so, ignore it, otherwise insert all its nonempty inits. If the input is sufficiently irregular, that's O(n) membership tests and O(n^2) inserts, altogether O(n^2*log n) (assuming O(1) average cost for the comparisons). If the input is periodic with shortest (positive) period p, only the first p tails lead to inserts, so that's O(n) tests and O(p*n) inserts, O(p*n*log n) overall (that's a bit cheated, the average cost for comparisons could be up to O(p) if p > 1 and O(n) if p == 1, but if the period itself is irregular, O(1) for the comparisons is okay).
In the second,
substringsSB s = substringsSB' Set.empty s
substringsSB' m str = substrings' m (init . B.tails $ str)
where
substrings' m [] = m
substrings' m (s:ss) | Set.member s m = m
| otherwise = substrings' insertTail ss
where
insertTail = insertInits m $ reverse $ (tail . B.inits $ s)
you check for each tail if it has been seen before, if so stop. That's good, but doesn't gain much over the first In the first, if a tail has been seen before, all further tails have also been seen before, so you only skip at most O(n) membership tests, O(n*log n) operations. For normally irregular input, only a few of the shortest tails have been seen before, so only few tests are skipped - very little gain.
insertInits m [] = m
insertInits m (s:ss) | Set.member s m = m
| otherwise = insertInits (doInsert s m) ss
doInsert k m = {-# SCC "doInsert" #-}Set.insert k m
If the tail hasn't been seen yet (normal), you start inserting its inits - from longest to shortest - breaking if any has been seen before (because then all shorter inits have also been seen before). That's great if many long inits occur multiple times, but if not, all you have is O(n^2) additional membership tests.
For ordinary irregular input no long substrings occur multiple times, but a number of short ones do, and the few inserts saved do not compensate for the additional membership tests, rendering the second method slower by a constant factor. (Membership testing is cheaper than insertion, so the factor should be less than 2.)
For periodic input, the first method also avoids unnecessary inserts, the second saves O(n) tests in the outer loop, but adds O(p*n) tests in the inner loop, making it slightly worse than in the irregular case.
But for some inputs, the second method can be dramatically better. Try both for
main = do
let x = substringsSB $ B.pack $ replicate 9999 97 ++ [98]
print (Set.size x)
You can improve the second version by replacing the expensive member
before the insert with a cheap size
comparison after the insert,
substringsSB str = go 0 Set.empty (init $ B.tails str)
where
go sz m (s:ss)
| Set.member s m = m
| otherwise = go nsz nm ss
where
(nsz,nm) = insInits sz m (reverse . tail $ B.inits s)
go _ m [] = m
insInits sz m (s:ss)
| sz1 == sz = (sz,m)
| otherwise = insInits sz1 nm ss
where
nm = Set.insert s m
sz1 = Set.size nm
insInits sz m [] = (sz,m)
That brings it close to the first version in the generic case, makes it slightly better (here) than the first version for concat $ replicate n "abcde"
and much better for the evil example above.
Upvotes: 1