Reputation: 186
I opened this issue on the GitHub project prevayler-clj:
https://github.com/klauswuestefeld/prevayler-clj/issues/1
because 1M short vectors like [:a1 1], forming the state of the prevayler, result in a 1 GB file when serialized one by one with Java's writeObject.
Is that plausible? About 1 kB for each PersistentVector? Further investigation showed that the same number of vectors can be serialized into an 80 MB file. So what is going wrong in Prevayler's serialization? Am I doing something wrong in these tests? Please refer to the GitHub issue for excerpts of my test code.
Upvotes: 1
Views: 169
Reputation: 200236
Prevayler apparently starts a fresh ObjectOutputStream for each serialized element, preventing any reuse of class data between them. Your test code, on the other hand, is written the "natural" way, allowing reuse. What forces Prevayler to restart every time is not clear to me, but I would hesitate to call it a "feature", given the negative impact it has; "workaround" is the more likely designation.
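To illustrate the effect, here is a minimal sketch that serializes the same elements both ways and compares the resulting byte counts. It is not Prevayler's actual code: the class name StreamReuseDemo and its methods are hypothetical, and a two-element ArrayList stands in for a Clojure vector such as [:a1 1] so that the example runs on a plain JDK. With a fresh stream per element, the stream header and the class descriptors are written again every time; with one shared stream, they are written once and later elements only refer back to them.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;

public class StreamReuseDemo {

    // Assumption: a small ArrayList stands in for a short Clojure vector like [:a1 1].
    static ArrayList<Object> element(int i) {
        ArrayList<Object> v = new ArrayList<>();
        v.add("a" + i);
        v.add(i);
        return v;
    }

    // One new ObjectOutputStream per element: header and class descriptors repeat.
    static long sizeWithFreshStreams(int n) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            for (int i = 0; i < n; i++) {
                ObjectOutputStream out = new ObjectOutputStream(bytes);
                out.writeObject(element(i));
                out.flush(); // push buffered data into the byte array
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One ObjectOutputStream for all elements: class descriptors are written once.
    static long sizeWithSharedStream(int n) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                for (int i = 0; i < n; i++) {
                    out.writeObject(element(i));
                }
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int n = 100_000;
        System.out.println("fresh stream per element: " + sizeWithFreshStreams(n) + " bytes");
        System.out.println("single shared stream:     " + sizeWithSharedStream(n) + " bytes");
    }
}
```

The per-element overhead for a real PersistentVector should be larger than for this stand-in, since more classes are involved in its serialized form. Note that a long-lived shared stream keeps a reference to every object it has written until reset() is called, which may be one reason to restart the stream periodically.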
Upvotes: 1
Reputation: 17771
There's nothing wrong with Prevayler per se. It's just that Java's writeObject method is not tuned for writing Clojure data; it's intended to store the internal structure of any serializable Java object. Since Clojure vectors are reasonably complex Java objects under the hood, I'm not very surprised that a small vector may be written out as roughly 1 kB of data.
I'd guess that pretty much any Clojure-specific serialization method would result in smaller files. From experience, standard clojure.core/pr + clojure.core/read gives a good balance between file size and speed and handles data structures of nearly any size.
See these pages for some insight into the internals of Clojure vectors:
Upvotes: 1