Reputation: 15
I'm new to Clojure and, after trying multiple methods, I'm completely stuck. I know how to do this in an imperative language, but not in Clojure.
I have a JSON file, https://data.nasa.gov/resource/y77d-th95.json, containing meteor fall data; each fall includes a mass and a year.
I'm trying to find which year had the greatest collective total mass of falls.
Here's what I have so far:
(def jsondata
  (json/read-str (slurp "https://data.nasa.gov/resource/y77d-th95.json") :key-fn keyword))
;Get the unique years
(def years (distinct (map :year jsondata)))
;Create map of unique years with a number to hold the total mass
(def yearcount (zipmap years (repeat (count years) 0)))
My idea was to use for to iterate through jsondata and update the yearcount map, adding the mass of each fall to the total stored under its year (incrementing it, as with += in C).
I tried this although I knew it probably wouldn't work:
(for [x jsondata]
(update yearcount (get x :year) (+ (get yearcount (get x :year)) (Integer/parseInt (get x :mass)))))
The idea of course being that the yearcount map would hold the totals for each year, on which I could then use frequencies, sort-by, and last to get the year with the highest mass.
I also defined this function to update the values in a map with a function, although I'm not sure whether I could actually use it:
(defn map-kv [m f]
  (reduce-kv #(assoc %1 %2 (f %3)) {} m))
I've tried a few different methods, had lots of issues and just can't get anywhere.
Upvotes: 0
Views: 974
Reputation: 1681
Here is my solution. I think you'll like it because its parts are decoupled rather than joined into a single threading macro, so you can change and test any part of it when something goes wrong.
Fetch the data:
(def jsondata
  (json/parse-string
    (slurp "https://data.nasa.gov/resource/y77d-th95.json")
    true))
Note that you can simply pass true as the second argument: the flag indicates that the keys should be keywords rather than strings.
Declare a helper function that takes into account a case when the first argument is missing (is nil):
(defn add [a b]
  (+ (or a 0) b))
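To see why the nil case matters: update passes nil to the function when the key is not yet in the map, and adding a number to nil throws. Below is a minimal sketch on made-up values (the year key and the numbers are invented for illustration). It also shows fnil from clojure.core, which is one built-in way to get the same defaulting without a helper.

```clojure
;; Same helper as above: treat a missing (nil) running total as 0.
(defn add [a b]
  (+ (or a 0) b))

;; The key is absent, so update calls (add nil 5.0).
(def first-step (update {} "1947" add 5.0))

;; The key now exists, so update calls (add 5.0 2.5).
(def second-step (update first-step "1947" add 2.5))

;; fnil wraps + so that a nil first argument is replaced by 0.
(def with-fnil (update {} "1947" (fnil + 0) 5.0))
```

Either form works as the function passed to update; the named helper is arguably easier to test in isolation.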
Declare a reducing function that takes the accumulated result and an item from the collection of meteor data. It updates the result map using the add function we created above. Note that some items lack either the mass or the year key, so we should check that both exist before operating on them:
(defn process [acc {:keys [year mass]}]
  (if (and year mass)
    (update acc year add (Double/parseDouble mass))
    acc))
The final step is to run the reduction:
(reduce process {} jsondata)
The result is (output truncated):
{"1963-01-01T00:00:00.000" 58946.1,
"1871-01-01T00:00:00.000" 21133.0,
"1877-01-01T00:00:00.000" 89810.0,
"1926-01-01T00:00:00.000" 16437.0,
"1866-01-01T00:00:00.000" 559772.0,
"1863-01-01T00:00:00.000" 33710.0,
"1882-01-01T00:00:00.000" 314462.0,
"1949-01-01T00:00:00.000" 215078.0,
I think that such a step-by-step solution is much clearer and more maintainable than a single huge ->> thread.
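The totals map still has to be narrowed down to a single year to answer the original question. One possible way, sketched here on a small hand-picked map (three entries copied from the output above, not the full result), is to apply max-key to the map entries. The names totals, heaviest and heaviest-year are introduced only for this example.

```clojure
;; A few entries copied from the result above; the real map has many more.
(def totals
  {"1963-01-01T00:00:00.000" 58946.1
   "1871-01-01T00:00:00.000" 21133.0
   "1866-01-01T00:00:00.000" 559772.0})

;; Each map entry is a [year total] pair; val extracts the total.
;; Note: this throws on an empty map, because max-key needs at least one item.
(def heaviest (apply max-key val totals))

;; Keep only the four-digit year from the timestamp string.
(def heaviest-year (subs (key heaviest) 0 4))
```

On the real data this would be called as (apply max-key val (reduce process {} jsondata)).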
Upvotes: 1
Reputation: 10865
Here's an alternate version, just to show an approach with a slightly different style. Especially if you're new to Clojure, it may be easier to see the stepwise thinking that led to the solution.
The tricky part might be the for expression, which is another nice way to build up a new collection by (in this case) applying functions to each key and value in an existing map.
(defn max-meteor-year [f]
  (let [rdr (io/reader f)
        all-data (json/read rdr :key-fn keyword)
        clean-data (filter #(and (:year %) (:mass %)) all-data)
        grouped-data (group-by #(:year %) clean-data)
        reduced-data
        (for [[k v] grouped-data]
          [(subs k 0 4) (reduce + (map #(Double/parseDouble (:mass %)) v))])]
    (apply max-key second reduced-data)))
clj.meteor> (max-meteor-year "meteor.json")
["1947" 2.303023E7]
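As a sketch of what the for expression is doing, here it is on a tiny hand-made map in the shape that group-by produces. The records and the names grouped-sample, sample-totals and sample-heaviest are invented for illustration. Each [k v] binding destructures one map entry, and the body builds one [year total] vector per group.

```clojure
;; Invented sample in the shape group-by returns: year -> vector of records.
(def grouped-sample
  {"1947-01-01T00:00:00.000" [{:mass "10.5"} {:mass "4.5"}]
   "1866-01-01T00:00:00.000" [{:mass "3"}]})

;; for yields one [short-year total] vector per map entry.
(def sample-totals
  (for [[k v] grouped-sample]
    [(subs k 0 4)
     (reduce + (map #(Double/parseDouble (:mass %)) v))]))

;; max-key compares the pairs by their second element (the total).
(def sample-heaviest (apply max-key second sample-totals))
```

The result of for is a lazy sequence of pairs rather than a map, which is why second is used to reach the total.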
Upvotes: 1
Reputation: 16194
Update: sorry, I misunderstood the question. I think this will work for you:
(->> (group-by :year jsondata)
     (reduce-kv (fn [acc year recs]
                  (let [sum-mass (->> (keep :mass recs)
                                      (map #(Double/parseDouble %))
                                      (reduce +))]
                    (assoc acc year sum-mass)))
                {})
     (sort-by second)
     (last))
=> ["1947-01-01T00:00:00.000" 2.303023E7]
The reduce-kv call here starts out with an initial empty map, and its input is the output of group-by, which is a map from years to their corresponding records.
At each step, the reducing function receives the acc map we're building up, the current year key, and the corresponding collection of recs for that year. We first get all the :mass values from recs (using keep instead of map because apparently not all recs have a mass value). We then map Double/parseDouble over them to parse the mass strings into numbers, and reduce over the result to sum the masses of all the recs. Finally we assoc the year key into acc with the sum-mass. This produces a map from years to their mass sums.
Then we sort those key/value pairs by their value (second returns the value) and take the last item, which is the one with the highest value.
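The same pipeline can be followed step by step on a few invented records, which makes the intermediate values easy to inspect at the REPL. The data and the names sample, by-year, sums and heaviest-entry are assumptions made for this sketch only.

```clojure
;; Invented records; one has no :mass, as happens in the real data.
(def sample
  [{:year "1947" :mass "10.5"}
   {:year "1947" :mass "4.5"}
   {:year "1866" :mass "3"}
   {:year "1866"}])

;; Step 1: group-by builds a map from each year to its records.
(def by-year (group-by :year sample))

;; Step 2: reduce-kv visits each year/recs pair and stores the summed mass.
;; keep drops the nil produced by the record without :mass.
(def sums
  (reduce-kv (fn [acc year recs]
               (assoc acc year (->> (keep :mass recs)
                                    (map #(Double/parseDouble %))
                                    (reduce +))))
             {}
             by-year))

;; Step 3: sort the entries by total (ascending) and take the last one.
(def heaviest-entry (last (sort-by second sums)))
```

One thing to be aware of: group-by places records that have no :year under the key nil, so on real data such a group would appear in the sums map as well.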
Upvotes: 0