mac

Reputation: 10075

Decoding base64 encoded file back to original format using Clojure

How does one convert a file that has been base64 encoded back to its original format and write it to disk? For instance, I have a PDF file that has been base64 (MIME) encoded. The file starts with:

data:application/pdf;base64,JVBER

I would like to write this out to disk in the proper format. I have tried several libraries (e.g. ring.util.codec) that decode the string into a byte-array, but if I write the resulting byte-array out to a file (using spit) the file appears corrupted.

UPDATE:

The PHP function base64_decode appears to be doing what I am looking for, as it returns a string. What is the equivalent in Java?

Upvotes: 2

Views: 7503

Answers (2)

nha

Reputation: 18005

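In Clojure, there is data.codec (formerly in clojure-contrib). Java interoperability is another option.

These are the helper functions I used for images with data.codec. Note that the regular expressions only match data:image/ headers, so they would need adapting for a data:application/pdf header: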

(require '[clojure.data.codec.base64 :as b64-codec]
         '[clojure.java.io :as io])

;; Helpers are defined before write-img! so the file loads top to bottom.

(defn in?
  "true if the seq coll contains the element el"
  [coll el]
  (some #(= el %) coll))

;; Extracts the image type (e.g. "png") from the data-URI header.
(defn b64-ext [s]
  (if-let [ext (second (re-find #"data:image/(.*);base64.*" s))]
    (if (in? ["png" "jpeg"] ext)
      ext
      (throw (Exception. (str "Unsupported extension found for image " ext))))
    (throw (Exception. (str "No extension found for image " s)))))

;; Returns only the base64 payload, i.e. everything after "data:image/...;base64,".
(defn chop-header [s]
  (nth (re-find #"(data:image/.*;base64,)(.*)" s) 2))

;; Decodes a base64 string into a byte array.
(defn decode-str [s]
  (b64-codec/decode (.getBytes s)))

;; Writes the decoded bytes (not a string) to a file named after id.
(defn write-img! [id b64]
  (io/copy
   (decode-str (chop-header b64))
   (java.io.File. (str "/Users/nha/tmp/" id "." (b64-ext b64)))))

Upvotes: 3

Nicolas Modrzyk

Reputation: 14187

Any Java library should work, for example the one from Apache Commons, or the one written entirely in Clojure from clojure-contrib.

I suspect the content is being modified along the way: the bytes may be converted to a string using one encoding, and that string is then read back into bytes using a different encoding.
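To illustrate how a trip through a string can alter binary data, here is a minimal sketch. The sample bytes and the name same-bytes? are made up for the example; the point is only that bytes which are not valid in the chosen charset do not survive the conversion.

```clojure
;; Hypothetical sample: "%P" followed by two bytes that are not valid UTF-8.
(def original
  (byte-array [(byte 0x25) (byte 0x50) (unchecked-byte 0xFF) (unchecked-byte 0xFE)]))

;; bytes -> String -> bytes; invalid sequences get replaced while decoding.
(def round-tripped
  (.getBytes (String. ^bytes original "UTF-8") "UTF-8"))

(defn same-bytes?
  "true if both byte arrays have identical contents"
  [^bytes a ^bytes b]
  (java.util.Arrays/equals a b))

(println "before:" (count original) "bytes, after:" (count round-tripped) "bytes")
(println "identical?" (same-bytes? original round-tripped))
```

If the two arrays differ for your data, some step in your pipeline is treating the bytes as text.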

The first step may be to check you have the exact same number of bytes in the file on the server side, and the file you are trying to read. Also, try to confirm the checksum (MD5) is the same.
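A rough sketch of that check using only the JDK is below. The function names file-bytes and md5-hex are my own, and the file paths in the usage note are placeholders.

```clojure
(require '[clojure.java.io :as io])
(import '(java.security MessageDigest)
        '(java.nio.file Files))

(defn file-bytes
  "Reads the whole file at path into a byte array."
  [path]
  (Files/readAllBytes (.toPath (io/file path))))

(defn md5-hex
  "Returns the MD5 digest of the byte array as a lowercase hex string."
  [^bytes bs]
  (let [digest (.digest (MessageDigest/getInstance "MD5") bs)]
    ;; bit-and turns each signed byte into 0..255 before formatting.
    (apply str (map #(format "%02x" (bit-and % 0xff)) digest))))
```

You would then compare (count (file-bytes "original.pdf")) and (md5-hex (file-bytes "original.pdf")) against the same values for the decoded file. If either differs, the bytes were changed somewhere between encoding and writing.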

In any case, a PDF file is a binary file, so you should NOT convert it to a string anywhere; keep it as raw bytes throughout.
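For the PDF case in the question, a bytes-only version could look like the sketch below. It assumes Java 8 or later for java.util.Base64; the names data-uri->bytes and write-bytes! are hypothetical, and the header is dropped by cutting at the first comma rather than by matching a specific MIME type. Note that spit writes (str content), so handing it a byte array writes the array's printed form, not the bytes, which would explain the corrupted file.

```clojure
(require '[clojure.java.io :as io])
(import '(java.util Base64))

(defn data-uri->bytes
  "Drops everything up to and including the first comma, then
  base64-decodes the remainder into a byte array."
  [^String data-uri]
  (let [payload (subs data-uri (inc (.indexOf data-uri ",")))]
    ;; The MIME decoder tolerates line breaks inside the payload.
    (.decode (Base64/getMimeDecoder) ^String payload)))

(defn write-bytes!
  "Writes the byte array to path through an output stream (no string step)."
  [^bytes bs path]
  (with-open [^java.io.OutputStream out (io/output-stream path)]
    (.write out bs)))
```

Usage would be along the lines of (write-bytes! (data-uri->bytes encoded) "out.pdf"), where encoded holds the full data:application/pdf;base64,... string and "out.pdf" is a placeholder path.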

Upvotes: 3
