Reputation: 1331
I have a 700 MB XML file that I process from a records tree into an EDN file.
After all the processing, I end up with a lazy sequence of hash maps that are not particularly big (at most 10 values each).
To finish, I want to write it to a file with:
    (defn write-catalog [catalog-edn]
      (with-open [wrtr (io/writer "catalog-fr.edn")]
        (doseq [x catalog-edn]
          (.write wrtr (prn-str x)))))
I do not understand the problem, because doseq is not supposed to retain the head of the sequence in memory.
My final output, catalog, is of type clojure.lang.LazySeq.
I then call:
    (write-catalog catalog)
Memory usage then climbs steadily, and I get a GC overhead error after about 80 MB of the file has been written, with -Xmx set to 3g.
I also tried doseq with spit and no prn-str, and the same thing happens.
Is this normal behaviour?
Thanks
Upvotes: 2
Views: 87
Reputation: 17859
The memory leak is possibly due to the realization of the catalog values (google "head retention"). As write-catalog realizes the items one by one, they all stay in memory, because something still holds a reference to the head of the sequence (presumably you're def'ing catalog somewhere). To fix this, try to avoid keeping your catalog in a var and pass it to write-catalog directly instead. For example, if you parse it from somewhere (which I guess is the case, considering your previous question), you would want to do:
    (write-catalog (transform-catalog (get-catalog "mycatalog.xml")))
so that huge intermediate sequences won't eat all your memory.
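Here is a minimal sketch of the difference. It assumes a made-up fake-catalog function standing in for your real parsing pipeline, and a write-catalog that takes the target as a second argument. Both are there only to keep the example self-contained, and demo-file is likewise just an illustration.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical stand-in for the parsed catalog: a lazy seq of small maps.
(defn fake-catalog [n]
  (map (fn [i] {:id i :name (str "item-" i)}) (range n)))

;; Same shape as the function in the question, except that the target
;; is passed in (an assumption made for this example).
(defn write-catalog [catalog-edn target]
  (with-open [wrtr (io/writer target)]
    (doseq [x catalog-edn]
      (.write wrtr (prn-str x)))))

;; Temporary file so the sketch does not touch any real data.
(def demo-file (java.io.File/createTempFile "catalog-demo" ".edn"))

;; Head retained: the var `catalog` keeps pointing at the first cell, so
;; every element realized by doseq stays reachable until the var is rebound.
;; (def catalog (fake-catalog 100000000))
;; (write-catalog catalog demo-file)

;; Head not retained: nothing outside the call refers to the sequence,
;; so elements that have already been written can be garbage collected.
(write-catalog (fake-catalog 1000) demo-file)
```

If you need the catalog in more than one place, one option is to wrap its construction in a function (for example a zero-argument make-catalog) and call it at each use site, rather than binding the realized sequence with def.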
Hope it helps.
Upvotes: 2