What are performance (time to process) affecting factors within a list comprehension?

Question

While learning an example from learn-you-a-haskell "which right triangle that has integers for all sides and all sides equal to or smaller than 10 has a perimeter of 24?"

rightTrianglesOriginal = [ (a,b,c) | c <- [1..10], b <- [1..10], a <- [1..10], a^2 + b^2 == c^2, a+b+c == 24]

I changed the parts of original example and wanted to understand the process (in extreme conditions) underneath.

Do order of the predicates effect the performance?
Do adding predicates (which are otherwise implied by other predicates) affect performance ? (eg. a > b, c > a, c > b?)?
Making a list of tuples on the basis of predicates (1) a > b and (2) c > a and then further apply a^2 + b^2 = c^2 will enhance overall performance or not?
Can there be impact on performance if we change the parameter positions e.g. (a,b,c) or (c, b, a)?
What is the advisable strategy in real life application if such heavy permutation and combination is needed - should we store precalculated answers (as far as possible) for the next use in order to enhance the performance or any other?

rightTriangles = [ (a,b,c) | c <- [1..10], b <- [1..10], a <- [1..10], a^2 + b^2 == c^2]

Gives result almost within no time.

rightTriangles10 = [ (a,b,c) | c <- [1..10], b <- [1..10], a <- [1..10], a^2 + b^2 == c^2, a > b , c > a]

Gives result almost within no time.

rightTriangles100 = [ (a,b,c) | c <- [1..100], b <- [1..100], a <- [1..100], a^2 + b^2 == c^2, a > b , c > a]

Gives result in few seconds.

rightTriangles1000 = [ (a,b,c) | c <- [1..1000], b <- [1..1000], a <- [1..1000], a^2 + b^2 == c^2, a > b , c > a]

I stopped process after 30 minutes. Results were not yet complete.

Please note that, being a beginner, I lack the knowledge to check the exact time taken to process of an individual function.

Daniel Fischer · Accepted Answer

rightTrianglesOriginal = [ (a,b,c) | c <- [1..10], b <- [1..10], a <- [1..10], a^2 + b^2 == c^2, a+b+c == 24]

Does order of the predicates affect the performance?

That depends. In this case, changing the order of predicates doesn't change anything substantial, so the difference - if any - would be very small. Since a^2 + b^2 == c^2 is a bit more expensive to check than a + b + c == 24, and both tests filter out many values, I'd expect a small speedup by swapping the two tests

rightTrianglesSwapped = [ (a,b,c) | c <- [1..10], b <- [1..10], a <- [1..10], a+b+c == 24, a^2 + b^2 == c^2]

but the entire computation is so small that it'd be very hard to measure reliably. In general, you can get big differences by reordering tests and generators, in particular interleaving tests and generators to short-cut dead ends can have a huge impact. Here, you could add a b < c test between the b and a generators to shortcut. Of course changing the generator to b <- [1 .. c-1] would be still more efficient.

Does adding predicates (which are otherwise implied by other predicates) affect performance ? (eg. a > b, c > a, c > b?)?

Yes, but generally very little unless the predicate is unusually expensive to evaluate. In the above, if the predicates hold, you would have an unnecessary evaluation of the implied third predicate. Here the predicate is cheap to compute for the standard number types, and it is not evaluated very often (most candidate triples fail earlier), so the impact would hardly be measurable. But it is additional work to do - the compiler is not smart enough to eliminate it - so it costs additional time.

Making a list of tuples on the basis of predicates (1) a > b and (2) c > a and then further apply a^2 + b^2 = c^2 will enhance overall performance or not?

That depends. If you put a predicate in a place where it can short-cut, that will enhance performance. With these predicates that would require reordering the generators (get a before b, so you can short-cut on c > a). A comparison is also a bit cheaper to compute than a^2 + b^2 == c^2, so even if the overall number of tests increases (the latter condition weeds out more triples than the former), it can still improve performance to do the cheaper tests first (but doing the most discriminating tests first can also be the better strategy, even if they are more expensive, it depends on the relation between cost and power).

Can there be impact on performance if we change the parameter positions e.g. (a,b,c) or (c, b, a)?

Basically, that can't have any measurable impact.

What is the advisable strategy in real life application if such heavy permutation and combination is needed - should we store precalculated answers (as far as possible) for the next use in order to enhance the performance or anyother?

That depends. If a computation is complicated and has a small result, it's better to store the result for reuse. If the computation is cheap and the result large, it's better to recompute. In this case, the number of Pythagorean triples is small and the computation not extremely cheap, so storing for reuse is probably beneficial.

rightTriangles10 = [ (a,b,c) | c <- [1..10], b <- [1..10], a <- [1..10], a^2 + b^2 == c^2, a > b , c > a]

Gives result almost within no time.

rightTriangles100 = [ (a,b,c) | c <- [1..100], b <- [1..100], a <- [1..100], a^2 + b^2 == c^2, a > b , c > a]

Gives result in few minutes.

rightTriangles1000 = [ (a,b,c) | c <- [1..1000], b <- [1..1000], a <- [1..1000], a^2 + b^2 == c^2, a > b , c > a]

Well, the number of triples to check is cubic in the limit, so increasing the limit by a factor of 10 increases the number of triples to check by a factor of 1000, the factor for the running time is about the same, it may be slightly larger due to the larger memory requirements. So with not even compiled, let alone optimised code,

ghci> [ (a,b,c) | c <- [1..100], b <- [1..100], a <- [1..100], a^2 + b^2 == c^2, a > b , c > a]
[(4,3,5),(8,6,10),(12,5,13),(12,9,15),(15,8,17),(16,12,20),(24,7,25),(20,15,25),(24,10,26)
,(21,20,29),(24,18,30),(30,16,34),(28,21,35),(35,12,37),(36,15,39),(32,24,40),(40,9,41)
,(36,27,45),(48,14,50),(40,30,50),(45,24,51),(48,20,52),(45,28,53),(44,33,55),(42,40,58)
,(48,36,60),(60,11,61),(63,16,65),(60,25,65),(56,33,65),(52,39,65),(60,32,68),(56,42,70)
,(55,48,73),(70,24,74),(72,21,75),(60,45,75),(72,30,78),(64,48,80),(80,18,82),(84,13,85)
,(77,36,85),(75,40,85),(68,51,85),(63,60,87),(80,39,89),(72,54,90),(84,35,91),(76,57,95)
,(72,65,97),(96,28,100),(80,60,100)]
(2.64 secs, 2012018624 bytes)

the expected time for the limit 1000 is about 45 minutes. Using a few constraints on the data, we can do it much faster:

ghci> length [(a,b,c) | c <- [2 .. 1000], b <- [1 .. c-1], a <- [c-b+1 .. b], a*a + b*b == c*c]
881
(87.28 secs, 26144152480 bytes)

What are performance (time to process) affecting factors within a list comprehension?

Answers (1)

Related Questions