Vinod

Reputation: 4352

Which is the fastest way to convert an integer to a byte array in Julia

Question 1: Which is the fastest way to convert an integer to a byte array?

a = 1026
aHexStr = string(a,base = 16,pad = 4) #2 bytes, 4 chars
b = zeros(UInt8,2)
k = 1
for i in 1:2:4
  b[k] = parse(UInt8,aHexStr[i:i+1],base = 16)
  k += 1
end

Is this method the fastest?

Related Question 2: Which is the fastest way to convert a hexadecimal string to a byte array?

I have a string of hexadecimal numbers

a = "ABCDEF12345678"

How can I convert this hex string to a byte array?

b = zeros(UInt8,7)
k = 1
for i in 1:2:14
  b[k] = parse(UInt8,a[i:i+1],base = 16)
  k += 1
end

Is this method the fastest?

Upvotes: 2

Views: 1293

Answers (2)

Sundar R

Reputation: 14695

For the first question, if you are fine with the extra zero bytes, you can reinterpret the integer's memory directly: reinterpret(UInt8, [a]). This is about 5-10% faster than the code in Bogumił Kamiński's answer, but that amounts to only a few nanoseconds. So if the extra zeros are a bother, it might not be worth it.
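A minimal sketch of what the reinterpret call returns, assuming a 64-bit build of Julia (so a is an Int64) and little-endian hardware. The names raw and bytes are only illustrative:

```julia
a = 1026                          # Int64 on a 64-bit build of Julia (assumption)

raw = reinterpret(UInt8, [a])     # views the memory of the 1-element vector as bytes
bytes = collect(raw)              # copy into a plain Vector{UInt8}

println(length(bytes))            # one byte per sizeof(a), zero bytes included
println(bytes[1:2])               # the two low-order bytes come first on little-endian hardware
```

The byte order follows the machine's memory layout, which is why the non-zero bytes appear at the front here rather than at the end.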

Edit: You can drop the extra zeros by doing it as:

julia> bytesfromint(i::Int64) =
         @inbounds @view reinterpret(UInt8, [i])[1:8-leading_zeros(i)>>3]

This also seems to be faster than the method in Bogumił Kamiński's answer, by 20% or so.

@inbounds @view reinterpret(UInt8, [ai])[(8-leading_zeros(ai)>>3):-1:1] (where ai is the integer to convert) will give you the most significant byte first (big-endian order), or @inbounds @view reinterpret(UInt8, [ai])[2:-1:1] if you know your data will only take up 2 bytes.

(@view tells Julia to not make a copy of the part of the array we've asked for, instead indexing into the original array itself - hence avoiding the copying overhead. @inbounds assures Julia that our indices are within the bounds of the array - hence avoiding the bounds-checking overhead.)
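As a rough sketch of the two byte orders, assuming Int64 input and little-endian hardware: the helper names nbytes, bytes_le and bytes_be are hypothetical, and the function form view(...) is used in place of the @inbounds @view macros, so bounds checks stay on.

```julia
# Number of bytes up to the highest set bit (hypothetical helper name).
# Gives 0 for i == 0 and 8 for negative i, whose sign bit is set.
nbytes(i::Int64) = sizeof(i) - (leading_zeros(i) >> 3)

# Least significant byte first: a view, so the selected bytes are not copied.
bytes_le(i::Int64) = view(reinterpret(UInt8, [i]), 1:nbytes(i))

# Most significant byte first: the same view with a reversed range.
bytes_be(i::Int64) = view(reinterpret(UInt8, [i]), nbytes(i):-1:1)

x = Int64(1026)
println(collect(bytes_le(x)))     # 0x02 then 0x04 on little-endian hardware
println(collect(bytes_be(x)))     # 0x04 then 0x02 on little-endian hardware
println(length(bytes_le(Int64(0))))   # zero has no set bits, so the view is empty
println(length(bytes_le(Int64(-1))))  # negative values keep all 8 bytes
```

Note that the trimming is only meaningful for non-negative values, and that 0 yields an empty result rather than a single zero byte.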

Upvotes: 5

Bogumił Kamiński

Reputation: 69829

For the first operation I assume that you want to keep only the bytes up to the most significant non-zero one, so you could do:

julia> a = 1026
1026

julia> [(a>>((i-1)<<3))%UInt8 for i in 1:sizeof(a)-leading_zeros(a)>>3]
2-element Vector{UInt8}:
 0x02
 0x04

Explanation:

  • leading_zeros(a) returns the number of zero bits that a starts with
  • leading_zeros(a)>>3 computes the number of leading bytes that are entirely zero (>>3 shifts the number right by 3 bits, which in this case is floor division by 8)
  • sizeof(a)-leading_zeros(a)>>3 computes the number of bytes to convert (>> binds tighter than -, so this is sizeof(a) minus the previous value)
  • (i-1)<<3 computes the number of bits we need to shift a by (in this case i-1 times 8)
  • (a>>((i-1)<<3))%UInt8 extracts the i-th byte of a, counting from the least significant one
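The steps in the list above can be traced one at a time for a = 1026. This is only an unrolled version of the comprehension; the intermediate names lz, empty, nbytes, shift and byte are introduced here for illustration:

```julia
a = 1026                          # 0x0402

lz     = leading_zeros(a)         # zero bits before the highest set bit
empty  = lz >> 3                  # whole bytes that are entirely zero
nbytes = sizeof(a) - empty        # number of bytes worth keeping

for i in 1:nbytes
    shift = (i - 1) << 3          # (i - 1) * 8 bits
    byte  = (a >> shift) % UInt8  # shift the wanted byte down, then keep the low 8 bits
    println("i = ", i, ", shift = ", shift, ", byte = ", byte)
end
```

The % UInt8 conversion truncates rather than checks the value, which is what makes it suitable for picking out a single byte.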

For the second operation I assume that if the string has an odd number of characters, the remaining half of the last byte is filled with 0 bits, and that we do not need to check whether the passed data is valid (the code below only handles the digits 0-9 and uppercase A-F):

julia> a = "ABCDEF12345678"
"ABCDEF12345678"

julia> function s2b(a::String)
           b = zeros(UInt8, (sizeof(a) + 1) >> 1)
           for (i, c) in enumerate(codeunits(a))
               b[(i+1)>>1] |= (c - (c < 0x40 ? 0x30 : 0x37))<<(isodd(i)<<2)
           end
           return b
       end
s2b (generic function with 1 method)

julia> s2b(a)
7-element Vector{UInt8}:
 0xab
 0xcd
 0xef
 0x12
 0x34
 0x56
 0x78
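The two assumptions stated before the function can be checked with a small sketch. It repeats s2b unchanged so that it runs on its own; the inputs "ABC" and "1a" are arbitrary examples chosen here:

```julia
# s2b exactly as defined above, repeated so this snippet is self-contained.
function s2b(a::String)
    b = zeros(UInt8, (sizeof(a) + 1) >> 1)
    for (i, c) in enumerate(codeunits(a))
        # Digits '0'-'9' are offset by 0x30, uppercase 'A'-'F' by 0x37;
        # characters at odd positions fill the high nibble of their byte.
        b[(i+1)>>1] |= (c - (c < 0x40 ? 0x30 : 0x37))<<(isodd(i)<<2)
    end
    return b
end

odd = s2b("ABC")                          # odd length: the last low nibble stays zero
println(odd)

# No validation is done, so lowercase digits silently give a different
# result than Base's hex2bytes, which accepts both cases.
println(s2b("1a") == hex2bytes("1a"))
```

If the input may contain lowercase digits, it would need to be uppercased first or handled with an extra branch.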

Both methods should be fast, but it is hard to guarantee that they are the fastest possible.


EDIT

Benchmarks:

julia> function f1(a)
           aHexStr = string(a,base = 16,pad = 4) #2 bytes, 4 chars
           b = zeros(UInt8,2)
           k = 1
           for i in 1:2:4
               b[k] = parse(UInt8,aHexStr[i:i+1],base = 16)
               k += 1
           end
           return b
       end
f1 (generic function with 1 method)

julia> f2(a) = [(a>>((i-1)<<3))%UInt8 for i in 1:sizeof(a)-leading_zeros(a)>>3]
f2 (generic function with 1 method)

julia> using BenchmarkTools

julia> a = 1026
1026

julia> @btime f1($a)
  141.795 ns (5 allocations: 224 bytes)
2-element Vector{UInt8}:
 0x04
 0x02

julia> @btime f2($a)
  29.317 ns (1 allocation: 64 bytes)
2-element Vector{UInt8}:
 0x02
 0x04

julia> function s2b(a::String)
           b = zeros(UInt8, (sizeof(a) + 1) >> 1)
           for (i, c) in enumerate(codeunits(a))
               b[(i+1)>>1] |= (c - (c < 0x40 ? 0x30 : 0x37))<<(isodd(i)<<2)
           end
           return b
       end
s2b (generic function with 1 method)

julia> a = "ABCDEF12345678"
"ABCDEF12345678"

julia> @btime hex2bytes($a)
  50.000 ns (1 allocation: 64 bytes)
7-element Vector{UInt8}:
 0xab
 0xcd
 0xef
 0x12
 0x34
 0x56
 0x78

julia> @btime s2b($a)
  48.830 ns (1 allocation: 64 bytes)
7-element Vector{UInt8}:
 0xab
 0xcd
 0xef
 0x12
 0x34
 0x56
 0x78

As @SundarR commented, in the latter case the built-in hex2bytes should be used. I had forgotten that it exists.
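For reference, a short sketch of hex2bytes together with two related Base functions, hex2bytes! and bytes2hex. The buffer name buf is illustrative, and whether reusing a buffer pays off depends on the workload:

```julia
a = "ABCDEF12345678"

b = hex2bytes(a)                              # allocates a new Vector{UInt8}

buf = Vector{UInt8}(undef, sizeof(a) >> 1)    # half as many bytes as hex digits
hex2bytes!(buf, a)                            # fills the preallocated buffer instead

s = bytes2hex(b)                              # inverse direction, yields lowercase digits
println(s)
```

Unlike s2b, hex2bytes rejects strings with an odd number of digits instead of padding them.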

Upvotes: 3
