Reputation: 620
I'm writing a function in Clojure to estimate the in-memory size of parsed JSON data, something like:
(defn object-size
  [object]
  (cond
    (sequential? object)
    (reduce + (map object-size object))
    (map? object)
    (reduce
      (fn [total [k v]]
        (+ total (keyword-size k) (object-size v)))
      0
      object)
    :else
    (case (type object)
      java.lang.Long 8
      java.lang.Double 8
      java.lang.String (* 2 (count object))
      ;; other data types
      )))
Obviously I'll need to add in overheads for clojure.lang.PersistentVector, java.lang.String, etc.
However, I'm not sure how to find the in-memory size of a clojure.lang.Keyword, the keyword-size function in the above example. How does Clojure store keywords? Are they constant size similar to a C++ enum, or are they a special case of java.lang.String that are dependent on length?
Upvotes: 3
Views: 501
Reputation: 92117
Answering this question from within Clojure is basically impossible. Your first-draft function works okay for the very simplest data structures, although even this simplest attempt has several errors already.
But more than that, it is just an ill-framed question. What is the size of xs in this snippet?
(def xs (let [forever (promise)]
          (deliver forever
                   (lazy-seq (cons 1 @forever)))
          @forever))
user=> (take 5 xs)
(1 1 1 1 1)
xs is an infinitely long sequence (so your reduce will never complete, but if it could it would surely return "this is infinite"). But it actually takes a small, fixed amount of memory, because it is circular.
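To make the circularity concrete, here is a minimal sketch you can paste into a REPL. It repeats the definition of xs so it stands alone, then checks that the tail of the sequence is the very same object as its head. Nothing here measures memory; it only illustrates why a recursive walk over xs has no natural stopping point.

```clojure
;; Same definition as in the snippet above, repeated so this block is
;; self-contained.
(def xs (let [forever (promise)]
          (deliver forever
                   (lazy-seq (cons 1 @forever)))
          @forever))

;; The rest of xs is xs itself: one lazy-seq and one cons cell, with the
;; cell's tail pointing back at the lazy-seq.
(identical? xs (rest xs)) ;; => true

;; A bounded walk is fine...
(take 3 xs) ;; => (1 1 1)

;; ...but anything that consumes the whole sequence, such as
;; (reduce + (map object-size xs)), would never return.
```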
You may say, well gee this is a dumb object, I don't mind if my function fails for objects like that. But in a garbage-collected language with pervasive laziness, cases with similar characteristics are commonplace. If you rule them out, you rule out everything interesting.
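A more everyday case of the same problem is plain sharing, with no laziness or cycles involved. The sketch below uses hypothetical names (big, pair, bigger) purely for illustration; a function that sums the sizes of children would count the shared vector twice, even though it exists only once on the heap.

```clojure
;; One 1000-element vector, referenced twice from an outer vector.
(def big (vec (range 1000)))
(def pair [big big])

;; Both slots point at the same object, so adding up "the size of each
;; element" counts big's storage twice.
(identical? (pair 0) (pair 1)) ;; => true

;; conj does not copy the original either: the new vector reuses most of
;; the old one's internal nodes, so the sizes of big and bigger are not
;; independent quantities that can simply be added together.
(def bigger (conj big :extra))
(count bigger) ;; => 1001
```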
Upvotes: 3