Abhishek Pal

Reputation: 141

Converting bytes to string in Clojure

I am converting a byte array into a string in Clojure and I am hitting a strange problem whose cause I don't understand.

The byte array is defined as

(def a (byte-array  (byte 72) (byte 105)))

When I convert it into a vector, it shows

(vec a)

output: (105, 105, 105, ......68 times ...., 105)

The output is a sequence of 72 elements, each of them 105. Why is that? I am trying to convert a byte array into a string, and using

(String. (byte-array (byte 72) (byte 105)))

yields 72 i's. On the other hand, when I do

(map char [(byte 72) (byte 105)])

I get the output H and i. That is why I am trying to convert the byte array into a vector. If there is an alternative way of doing this, please let me know.

Upvotes: 3

Views: 1620

Answers (2)

Alan Thompson

Reputation: 29958

@cfrick has already answered how to properly construct a byte array in Clojure.

As a convenience, there are many string and character functions available in the Tupelo library that may be useful. These work in both Clojure and ClojureScript, which have different notions of what a "String" is.

The following code shows some of the ways you can manipulate values of type java.lang.String, java.lang.Character, Byte, Long, and Java byte array:

(ns tst.demo.core
  (:use tupelo.test)
  (:require
    [cambium.core :as log]
    [clojure.string :as str]
    [tupelo.core :as t] ))

(dotest
  (let [chars-vec (vec "Hi") ; a vector of Character vals
        byte-vec  (mapv byte chars-vec) ; a vector of Byte vals
        long-vec  (mapv long chars-vec) ; a vector of Long vals

        ; Any sequence of numeric values is acceptable to the `byte-array` function. 
        ; The function `tupelo.core/char->codepoint` works in both CLJ and CLJS
        ba-nums   (byte-array (mapv t/char->codepoint chars-vec))
        ba-longs  (byte-array long-vec)

        ; a sequence of Characters can be made into a String in 2 ways
        str-0     (apply str chars-vec)
        str-1     (str/join chars-vec)

        ; Whether we have a vector or a byte-array, the values must be converted into
        ; a sequence of Characters before using `(apply str ...)` or `(str/join ...)`
        ; to construct a String object.
        str-2     (str/join (mapv char byte-vec))
        str-3     (str/join (mapv t/codepoint->char long-vec))
        str-4     (str/join (mapv char ba-nums))
        str-5     (str/join (mapv t/codepoint->char ba-longs))]

Print the results:

    (disp-types chars-vec)
    (disp-types byte-vec)
    (disp-types long-vec)
    (disp-types ba-nums)
    (disp-types ba-longs)

    (println "result type: " (type str-0))

All of the above produce the same result, "Hi":

    (is= "Hi"
      str-0
      str-1
      str-2
      str-3
      str-4
      str-5)))

with result:

-------------------------------
   Clojure 1.10.1    Java 13
-------------------------------

Testing tst.demo.core
chars-vec   type:   clojure.lang.PersistentVector   value:   [H i]      content types:  [java.lang.Character java.lang.Character]
byte-vec    type:   clojure.lang.PersistentVector   value:   [72 105]   content types:  [java.lang.Byte java.lang.Byte]
long-vec    type:   clojure.lang.PersistentVector   value:   [72 105]   content types:  [java.lang.Long java.lang.Long]
ba-nums     type:   [B   value:   #object[[B 0x24a2bb25 [B@24a2bb25]    content types:  [java.lang.Byte java.lang.Byte]
ba-longs    type:   [B   value:   #object[[B 0x2434f548 [B@2434f548]    content types:  [java.lang.Byte java.lang.Byte]

result type:  java.lang.String


Ran 2 tests containing 1 assertions.
0 failures, 0 errors.

And all of the results are of type java.lang.String.
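
If you would rather not pull in Tupelo, the same idea can be sketched with clojure.core and Java interop alone. This is a minimal sketch, not part of the code above: the var names `ba`, `via-chars`, `via-decode`, and `round-trip` are made up for illustration, and the choice of UTF-8 is an assumption about how the bytes were encoded. Mapping `char` over the bytes is only safe for ASCII values (0 to 127), since `char` rejects negative bytes; decoding with an explicit charset also handles multi-byte characters.

```clojure
(require '[clojure.string :as str])
(import 'java.nio.charset.StandardCharsets)

;; Illustrative byte array holding the ASCII codes for "Hi".
(def ba (byte-array [72 105]))

;; Character by character: fine for ASCII, throws on negative byte values.
(def via-chars (str/join (map char ba)))

;; Decode the whole array at once (UTF-8 is an assumption here).
(def via-decode (String. ba StandardCharsets/UTF_8))

;; Going the other way: a String back to a vector of byte values.
(def round-trip (vec (.getBytes "Hi" StandardCharsets/UTF_8)))

(println via-chars via-decode round-trip)
```

Both `via-chars` and `via-decode` should come out as "Hi" for this input, and `round-trip` as [72 105].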


For completeness, here is the display code:

(defn disp-types-impl
  [item]
  `(do
    (println '~item "  type:  " (type ~item) "  value:  " ~item
      "  content types: " (mapv type ~item))))

(defmacro disp-types
  [item]
  (disp-types-impl item))

Upvotes: 1

cfrick

Reputation: 37008

You are calling the two-arity version, so your first argument sets the size of the array to be created, and your second argument, which is not a sequence, is treated as the init-val. See:

user=> (doc byte-array)
-------------------------
clojure.core/byte-array
([size-or-seq] [size init-val-or-seq])
  Creates an array of bytes

The initial values can also be taken from a sequence (as the argument name suggests), so you can do:

user=> (String. (byte-array [(byte 72) (byte 105)]))
"Hi"
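
To make the difference between the two arities concrete, here is a small sketch that puts both calls side by side. The var names `filled` and `hi` are only for illustration, and passing UTF-8 explicitly is an assumption; the plain `(String. ...)` call above uses the platform default charset instead.

```clojure
(import 'java.nio.charset.StandardCharsets)

;; Two-arity call from the question: size 72, every slot initialised to 105.
(def filled (byte-array (byte 72) (byte 105)))
(println (count filled))          ; 72
(println (distinct (vec filled))) ; (105)

;; One-arity call with a sequence: the elements become the array contents.
(def hi (byte-array [(byte 72) (byte 105)]))
(println (vec hi))                           ; [72 105]
(println (String. hi StandardCharsets/UTF_8)) ; Hi
```

Wrapping the values in a vector is the only change needed; the `vec` calls are just there to make the array contents visible.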

Upvotes: 8
