Rohan Almeida

Reputation: 183

Improve this Clojure code to make it more idiomatic

Is there any way to make the following code more idiomatic Clojure? I know I'm missing something regarding destructuring. At least I can say that I understand the current code. My first temptation was to use doseq and populate a hash-map, but I was sure that map was the right solution.

The code reads a CSV file:

Name,Department,Score
Rohan,IT,8
Bob,Sales,6
Tom,IT,9
Jane,Accounting,3
Mary,Sales,9
Harry,IT,8
Frodo,Marketing,8
Bilbo,Accounting,10

and outputs the rows sorted by score, highest first. Simple!

(def file "scores.csv")

(defn list-of-vecs []
  (let [file-str (slurp file)]
    (let [lines (clojure.string/split-lines file-str)]
      (next (map #(clojure.string/split % #",") lines)))))

(defn list-of-maps []
    (map (fn [n] {:name (n 0), :department (n 1), :score (Integer/parseInt (n 2))})
        (list-of-vecs)))

(defn sorted-list []
  (reverse (sort-by :score (list-of-maps))))

(defn print-high-scores []
  (prn "Name","Department","Score")
  (map (fn [m] (prn (m :name) (m :department) (m :score))) (sorted-list)))

Any feedback would be appreciated, including on indentation.

Also I was very surprised by the performance (running inside lein).

CSV file of 8 lines

parse-csv.core=> (time print-high-scores)
"Elapsed time: 0.026059 msecs"

CSV file of 25k lines

parse-csv.core=> (time print-high-scores)
"Elapsed time: 0.025636 msecs"

Upvotes: 0

Views: 96

Answers (1)

birdspider

Reputation: 3074

Does your (time print-high-scores) actually print anything?

Or am I using time incorrectly:

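I think you are using time correctly but measuring the wrong thing. (time print-high-scores) evaluates only the symbol, so it measures looking up the function rather than calling it; that is why 8 lines and 25k lines take the same time.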
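To make the point about measuring the wrong thing concrete, here is a minimal sketch. The names `calls` and `squares` are made up for the illustration; `squares` stands in for any function that, like print-high-scores, returns a lazy seq built by map. The counter shows how much work each timed form really did.

```clojure
;; Counts how many elements were actually computed (for illustration only).
(def calls (atom 0))

;; Hypothetical stand-in for print-high-scores: returns a lazy seq built by map.
(defn squares []
  (map (fn [n] (swap! calls inc) (* n n)) (range 10)))

;; 1. Evaluates the symbol only; the function is never called.
(time squares)

;; 2. Calls the function, but map is lazy: nothing is computed yet.
(def unrealized (time (squares)))
(println "computed after lazy call:" @calls)

;; 3. doall forces the whole seq inside the timed form.
(def realized (time (doall (squares))))
(println "computed after doall:" @calls)
```

Only the third form times the real work. For side effects such as printing, doseq is usually the better fit because it is eager by design.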

My approach:

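; make sure the namespaces used below are loaded
(require 'clojure.java.io 'clojure.string)

; read file - drop header line
(def input
  (rest (line-seq (clojure.java.io/reader "inputfilename"))))

; top ten - split each line into fields, sort by descending score
(def top-ten
  (take 10 (time (sort-by
                   #(- (Integer/parseInt (nth % 2))) ; negate so highest first
                   (map (fn [line]
                          (clojure.string/split line #",")) input)))))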
; 10 lines  "Elapsed time:  0.469539 msecs"
; 25k lines "Elapsed time: 68.157863 msecs"
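As an alternative to negating the key, sort-by also accepts a comparator as its second argument, so `>` can express "highest first" directly. A small sketch, assuming a few rows from the question's CSV inlined as strings so it runs without a file (the names `lines`, `score` and `by-score-desc` are hypothetical):

```clojure
(require '[clojure.string :as str])

;; Sample rows inlined so the sketch runs without a file (taken from the question's CSV).
(def lines ["Rohan,IT,8" "Bob,Sales,6" "Tom,IT,9" "Bilbo,Accounting,10"])

;; Key function: the third field parsed as an integer.
(defn score [row]
  (Integer/parseInt (nth row 2)))

;; (sort-by keyfn comparator coll): > as the comparator sorts in descending order.
(def by-score-desc
  (sort-by score > (map #(str/split % #",") lines)))

(doseq [row by-score-desc]
  (println row))
```

This avoids both the negation trick and the reverse call from the question.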

; print - side effect
(time (doseq [e (doall top-ten)]
        (print e "\n")))

"Elapsed time: 0.02804 msecs"
[Bilbo Accounting 10]
[Tom IT 9]
[Mary Sales 9]
[Rohan IT 8]
[Harry IT 8]
[Frodo Marketing 8]
[Bob Sales 6]
[Jane Accounting 3]
nil
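On the destructuring part of the question, one possible shape is sketched below. It is only a sketch under assumptions: the CSV text is inlined as `csv-text` instead of being read with slurp, and the helper names `parse-row` and `high-scores` are invented for the example. Vector destructuring replaces the (n 0), (n 1), (n 2) lookups, and :keys destructuring replaces the (m :name) style calls.

```clojure
(require '[clojure.string :as str])

;; Assumption: CSV text inlined here; in the question it would come from (slurp "scores.csv").
(def csv-text
  "Name,Department,Score\nRohan,IT,8\nBob,Sales,6\nTom,IT,9\nJane,Accounting,3")

(defn parse-row
  "Destructures one CSV line into a map; the score is parsed as an integer."
  [line]
  (let [[name department score] (str/split line #",")]
    {:name name :department department :score (Integer/parseInt score)}))

(defn high-scores
  "Returns the data rows of text as maps, highest score first."
  [text]
  (->> (str/split-lines text)
       rest                 ; drop the header line
       (map parse-row)
       (sort-by :score >)))

(defn print-high-scores [text]
  (println "Name" "Department" "Score")
  ;; doseq is eager, so the rows are printed when the function is called
  (doseq [{:keys [name department score]} (high-scores text)]
    (println name department score)))

;; Timing the call (note the extra parentheses) now includes parsing, sorting and printing.
(time (print-high-scores csv-text))
```

Passing the text in as an argument, rather than reading a global file name inside each function, also makes the pieces easy to try at the REPL.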

Upvotes: 1
