Alex

Reputation: 15

Clojure Iterate through JSON and update individual keys in map

I'm new to Clojure; after trying multiple methods I'm completely stuck. I know how to achieve this in other imperative languages, but not in Clojure.

I have a JSON file https://data.nasa.gov/resource/y77d-th95.json containing meteor fall data, each fall includes a mass and year.

I'm trying to find which year had the greatest collective total mass of falls.

Here's what I have so far:

(require '[clojure.data.json :as json])

(def jsondata
  (json/read-str (slurp "https://data.nasa.gov/resource/y77d-th95.json") :key-fn keyword))

;Get the unique years
(def years (distinct (map :year jsondata)))
;Create map of unique years with a number to hold the total mass
(def yearcount (zipmap years (repeat (count years) 0)))

My idea was to use a for expression to iterate through jsondata and update the yearcount map, incrementing the entry for each fall's year by that fall's mass (as in += in C).

I tried this although I knew it probably wouldn't work:

(for [x jsondata]
    (update yearcount (get x :year) (+ (get yearcount (get x :year)) (Integer/parseInt (get x :mass)))))

The idea of course being that the yearcount map would hold the totals for each year, on which I could then use frequencies, sort-by, and last to get the year with the highest mass.

I also defined this function to update the values in a map with a function, although I'm not sure whether I can actually use it here:

(defn map-kv [m f]
  (reduce-kv #(assoc %1 %2 (f %3)) {} m))
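
For example, map-kv applies a function to every value in a map (keys are left untouched):

```clojure
(map-kv {"1947" 10, "1963" 20} inc)
;; => {"1947" 11, "1963" 21}
```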

I've tried a few different methods, had lots of issues and just can't get anywhere.

Upvotes: 0

Views: 974

Answers (3)

Ivan Grishaev

Reputation: 1681

Here is my solution. I think you'll like it because its parts are decoupled rather than joined into a single threading macro, so you can change and test any part of it when something goes wrong.

Fetch the data:

(require '[cheshire.core :as json])

(def jsondata
  (json/parse-string
   (slurp "https://data.nasa.gov/resource/y77d-th95.json")
   true))

Note that you can just pass a true flag to indicate that the keys should be keywords rather than strings.

Declare a helper function that handles the case where the first argument is missing (nil):

(defn add [a b]
  (+ (or a 0) b))
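
For example, when a year has not been seen yet, the accumulated value is nil and add treats it as zero:

```clojure
(add nil 5) ;; => 5
(add 2 3)   ;; => 5
```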

Declare a reducing function that takes an accumulated result and an item from the collection of meteor data. It updates the result map using the add function we created before. Note that some items lack either the mass or the year key, so we should check that both exist before operating on them:

(defn process [acc {:keys [year mass]}]
  (if (and year mass)
    (update acc year add (Double/parseDouble mass))
    acc))

The final step is to run the reduction:

(reduce process {} jsondata)

The result is:

{"1963-01-01T00:00:00.000" 58946.1,
 "1871-01-01T00:00:00.000" 21133.0,
 "1877-01-01T00:00:00.000" 89810.0,
 "1926-01-01T00:00:00.000" 16437.0,
 "1866-01-01T00:00:00.000" 559772.0,
 "1863-01-01T00:00:00.000" 33710.0,
 "1882-01-01T00:00:00.000" 314462.0,
 "1949-01-01T00:00:00.000" 215078.0,
 ...}
I think such a step-by-step solution is much clearer and more maintainable than a single huge ->> thread.
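
To answer the original question (which year had the greatest collective mass), the resulting map can then be fed to max-key; here val returns the mass total of each map entry:

```clojure
(apply max-key val (reduce process {} jsondata))
;; => ["1947-01-01T00:00:00.000" 2.303023E7]
```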

Upvotes: 1

jas

Reputation: 10865

Here's an alternate version, just to show an approach with a slightly different style. Especially if you're new to Clojure, it may be easier to see the stepwise thinking that led to the solution.

The tricky part might be the for expression, which is another nice way to build up a new collection by (in this case) applying functions to each key and value in an existing map.
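
As a tiny illustration of the pattern, for can destructure each [key value] map entry, yielding one result per entry:

```clojure
(for [[k v] {:a 1 :b 2}]
  [k (* v 10)])
;; => ([:a 10] [:b 20])
```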

(require '[clojure.java.io :as io]
         '[clojure.data.json :as json])

(defn max-meteor-year [f]
  (let [rdr (io/reader f)
        all-data (json/read rdr :key-fn keyword)
        clean-data (filter #(and (:year %) (:mass %)) all-data)
        grouped-data (group-by #(:year %) clean-data)
        reduced-data
        (for [[k v] grouped-data]
          [(subs k 0 4) (reduce + (map #(Double/parseDouble (:mass %)) v))])]
    (apply max-key second reduced-data)))

clj.meteor> (max-meteor-year "meteor.json")
["1947" 2.303023E7]

Upvotes: 1

Taylor Wood

Reputation: 16194

Update: sorry, I misunderstood the question. I think this will work for you:

(->> (group-by :year jsondata)
     (reduce-kv (fn [acc year recs]
                  (let [sum-mass (->> (keep :mass recs)
                                      (map #(Double/parseDouble %))
                                      (reduce +))]
                    (assoc acc year sum-mass)))
                {})
     (sort-by second)
     (last))
=> ["1947-01-01T00:00:00.000" 2.303023E7]

The reduce function here starts with an initial empty map, and its input is the output of group-by, which is a map from years to their corresponding records.

For each step of reduce, the reducing function is receiving the acc map we're building up, the current year key, and the corresponding collection of recs for that year. Then we get all the :mass values from recs (using keep instead of map because not all recs have a mass value apparently). Then we map over that with Double/parseDouble to parse the mass strings into numbers. Then we reduce over that to sum all the masses for all the recs. Finally we assoc the year key to acc with the sum-mass. This outputs a map from years to their mass sums.
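
The difference between keep and map matters here because keep drops the nils produced when a record has no :mass:

```clojure
(map :mass [{:mass "21.0"} {:year "1947"}])
;; => ("21.0" nil)

(keep :mass [{:mass "21.0"} {:year "1947"}])
;; => ("21.0")
```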

Then we can sort those map key/value pairs by their value (second returns the value), then we take the last item with the highest value.

Upvotes: 0
