SCdF
SCdF

Reputation: 59428

How would you combined sort-by and filter in Clojure?

Let's say I have a collection of strings, and I want to return all strings over 4 characters long, sorted by shortest string first.

You could solve that with something along the lines of:

(def strings ["this" "is" "super" "cool" "everybody" "isn't" "clojure" "great!?!?"])
(sort-by count < (filter #(> (count %) 4) strings))
;; > ("super" "isn't" "clojure" "everybody" "great!?!?")

Notice we're using count twice. This is probably fine here, but what if count wasn't count? What if instead of count we called super-expensive-function that we'd really rather not run more than absolutely necessary?

So:

Is there an existing function that does this, or do I need to build my own?

Upvotes: 4

Views: 566

Answers (3)

A. Webb
A. Webb

Reputation: 26466

You can pair-up using group-by and do the aggregation and filtering with list comprehension.

(for [[c sv] (sort-by first (group-by count strings)) :when (> c 4) s sv] s)

Upvotes: 0

claj
claj

Reputation: 5412

Maybe the JIT-compiler can figure out that this expensive, intermediate result is cacheable between the two operations? It's worth a try to rule this possibility out given the increased complexity in manually caching the result. I would run performance test a few times with for various solutions with time measurement like this:

(time (dotimes [_ 1e5] ...))

Upvotes: 1

Michał Marczyk
Michał Marczyk

Reputation: 84379

The simplest solution would be to pair up each item together with its expensive-to-compute property, then filter and sort, then discard the parameter:

(->> strings
     (map (juxt identity count))
     (filter (fn [[_ c]] (> c 4)))
     (sort-by peek)
     (map first))

If computing the property in question is really all that expensive, the overhead of allocating the vectors should pretty much vanish.

Upvotes: 8

Related Questions