Reputation: 9031
I tried to find a list of perfect number in this way:
(defn classify [num]
(let [factors (->> (range 1 (inc num))
(filter #(zero? (rem num %))))
sum (reduce + factors)
aliquot-sum (- sum num)]
(cond
(= aliquot-sum num) :perfect
(> aliquot-sum num) :abundant
(< aliquot-sum num) :deficient)))
(defn is-p [n]
(= :perfect (classify n)))
(defn list-perfect [n]
(filter is-p (range 1 (inc n))))
Question:
How to build a lazy sequence of perfect numbers, so that I can use (take n ...)
to easily get a list.
Is this code idiomatic and efficient? Any improvement?
Thanks in advance.
Upvotes: 1
Views: 347
Reputation: 9886
Your algorithm is very inefficient, it's O(n)
.
For a quick win, you can immediately reduce the range by a half, as you won't ever have factors that are greater than the number you're testing divided by 2.
So change it to:
(defn classify [num]
(let [factors (->> (range 1 (max 2 (inc (quot num 2))))
;; ...
However... you can change it to O(sqrt n)
which is magnitudes faster. See my timings below.
The real efficiency is noticing that factors are in pairs of [x (quot num x)]
and then only check the first (sqrt num)
(or slightly over):
(defn perfect? [n]
(let [r (range 2 (+ 2 (int (Math/sqrt n))))
s (set (mapcat #(if (zero? (rem n %))
[% (quot n %)])
r))
t (reduce + 1 s)]
(= n t)))
I've split it into separate calculations so you can verify each stage.
The range can be reduced from 2..((sqrt n) + 2), and initialise the reduction with 1 (which is always a factor).
This changes the problem from an O(n) to O(sqrt n), so if you're checking large numbers, makes a vast difference.
As an illustration here are some times on larger values for n on my MBP:
n "n/2" "sqrt n"
33550336 1,172.5ms 2.85ms
8589869056 274,346.6ms 16.76ms
137438691328 didn't time 44.27ms
so using root version was 16,369 times faster for the 6th perfect number. See http://en.wikipedia.org/wiki/List_of_perfect_numbers for more details.
EDIT:
Why (int (root n)) + 2? And why `[x (quot n x)]?
When you work out the factors of a number n, then if you find one (say, a
), then n/a
is also a factor (call it b
) because n = a * b
e.g. looking at 28, the first relevant factor is 2, and clearly 28/2 = 14 is also a factor. So you don't need to check 14, you already know it's a factor from the fact that 2 is.
as we're incrementally checking numbers from 2 upwards, we're incidentally finding the higher numbers coming down:
2 is a factor, 28 / 2 = 14 -> [a, b] = [2, 14]
4 is a factor, 28 / 4 = 7 -> [a, b] = [4, 7]
7 is a factor, 28 / 7 = 4 -> [a, b] = [7, 4] - wait a minute...
The [a,b]
pairs here are the [% (quot n %)]
in the mapcat function, e.g. when the range is currently iterating the value 2
, then %
is 2
inside the fuction, and so (quot n %)
is (quot 28 2)
which is 14, thus [% (quot n %)]
is simply the vector [2 14]
, which then gets added to the set after being flattened to 2
and 14
as values. Later, when the range value is 4
, then [% (quot n %)]
is [4 (quot 28 4)]
which is [4 7]
, and again is flattened by mapcat as the numbers 4
and 7
.
So we add each pair of numbers (flattened via mapcat) to our set, include the number 1, and end up with #{1 2 14 4 7}
, which are the factors of 28. (Actually, I don't put 1 in the set as I don't need to, instead I start the summing reduction at 1, which is same effect).
But at what point do they turn around? i.e. when do we know that [7,4]
will already have been included in the set as [4,7]
?
clearly it's when a
> b
, because in finding the lowest numbers we always find the highest number with it, so we can finish checking at this point.
but what is this point? it's simple, if a perfect number were a square number, then a and b would be equal, i.e. a*a = n, so a = sqrt(n).
Thus the highest value of a
we need to check is the whole number that is larger than the root of n
.
e.g. for 28, sqrt(28) = 5.292, so we have to check 6 to be sure that we've included the lowest number possible that could be a factor that has a paired factor.
so we need (int (sqrt n)) + 1.
I always do this in case the root calculation is 1.9999999999... and rounds wrong way, so adding 1 more ensures you eliminate any rounding errors.
but in a range, if you want to include that number you have to add 1 more to it (range drops the high number, (range 6) = (0 1 2 3 4 5)), hence why it adds 2 to the value: 1 for the range, 1 to ensure it's above the rounded down root.
Although, after saying this, I've tested perfect numbers up to 2305843008139952128 and it works with +1 instead of +2, not that it's a massive saving. Probably because non of the perfect numbers are close to perfect squares in ones I checked, so there's no rounding error in (int (sqrt n))
.
If you're interested in perfect numbers, I'd recommend reading http://britton.disted.camosun.bc.ca/perfect_number/lesson1.html
Upvotes: 2
Reputation: 3504
list-perfect
is already lazy due to your usage of filter:
(filter pred coll)
Returns a lazy sequence of the items in coll for which (pred item) returns true. pred must be free of side-effects.
Whether code is idiomatic or not might be a matter of opinion (and hence off-topic), but it looks good enough from my perspective.
Upvotes: 1