Reputation: 4672
How can you convert a byte array to a hexadecimal string and vice versa?
Upvotes: 1662
Views: 1232497
Reputation: 3344
You can use the BitConverter.ToString method:
byte[] bytes = {0, 1, 2, 4, 8, 16, 32, 64, 128, 255};
Console.WriteLine( BitConverter.ToString(bytes));
Output:
00-01-02-04-08-10-20-40-80-FF
More information: BitConverter.ToString Method (Byte[])
Upvotes: 81
Reputation: 108840
When writing crypto code it's common to avoid data dependent branches and table lookups to ensure the runtime doesn't depend on the data, since data dependent timing can lead to side-channel attacks.
It's also pretty fast.
static string ByteToHexBitFiddle(byte[] bytes)
{
char[] c = new char[bytes.Length * 2];
int b;
for (int i = 0; i < bytes.Length; i++) {
b = bytes[i] >> 4;
c[i * 2] = (char)(55 + b + (((b-10)>>31)&-7));
b = bytes[i] & 0xF;
c[i * 2 + 1] = (char)(55 + b + (((b-10)>>31)&-7));
}
return new string(c);
}
Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn
Abandon all hope, ye who enter here
An explanation of the weird bit fiddling:
bytes[i] >> 4 extracts the high nibble of a byte; bytes[i] & 0xF extracts the low nibble of a byte.
b - 10 is < 0 for values b < 10, which will become a decimal digit, and >= 0 for values b >= 10, which will become a letter from A to F.
i >> 31 on a signed 32 bit integer extracts the sign, thanks to sign extension. It will be -1 for i < 0 and 0 for i >= 0.
Combining the two previous points, (b-10)>>31 will be 0 for letters and -1 for digits.
For letters, the last summand becomes 0, and b is in the range 10 to 15. We want to map it to A (65) to F (70), which implies adding 55 ('A'-10).
For digits, the last summand has to map b from the range 0 to 9 to the range 0 (48) to 9 (57). This means it needs to become -7 ('0' - 55).
Since -1 is represented by all bits being 1, this can be done with & -7, since (0 & -7) == 0 and (-1 & -7) == -7.
Some further considerations:
There is no second loop variable to index into c, since measurement shows that calculating it from i is cheaper.
Using i < bytes.Length as upper bound of the loop allows the JITter to eliminate bounds checks on bytes[i], so I chose that variant.
Making b an int avoids unnecessary conversions from and to byte.
The same thing can be implemented using the new string.Create function, which avoids having to allocate a separate char[] array. In the version below:
AggressiveInlining should allow the ConvertNibble helper to disappear from the JIT output.
An adjust value of 32 is used to get a lower-case result.
It takes Memory<byte> instead of an array, which allows a wider range of memory buffers (including arrays).
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static string ByteToHexBitFiddle(Memory<byte> bytes, bool lowercase = false) =>
lowercase
? string.Create(bytes.Length * 2, bytes, LowercaseFillHex)
: string.Create(bytes.Length * 2, bytes, UppercaseFillHex);
static void UppercaseFillHex(Span<char> span, Memory<byte> mem)
{
var bytes = mem.Span;
for (int i = 0; i < bytes.Length; i++)
{
span[i * 2] = ConvertNibble(bytes[i] >> 4, 0);
span[i * 2 + 1] = ConvertNibble(bytes[i] & 0xF, 0);
}
}
static void LowercaseFillHex(Span<char> span, Memory<byte> mem)
{
var bytes = mem.Span;
for (int i = 0; i < bytes.Length; i++)
{
span[i * 2] = ConvertNibble(bytes[i] >> 4, 32);
span[i * 2 + 1] = ConvertNibble(bytes[i] & 0xF, 32);
}
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static char ConvertNibble(int nibble, int adjust) =>
(char)(55 + adjust + nibble + (((nibble - 10) >> 31) & (-7 - adjust)));
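The mapping explained above can be checked for all sixteen nibble values. The following is a minimal sketch, written as a C# 9 top-level program; the helper name NibbleToHex is an assumption made for this illustration and does not appear in the answer.

```csharp
using System;

// Hypothetical helper: the same branch-free expression used in ByteToHexBitFiddle.
char NibbleToHex(int b) => (char)(55 + b + (((b - 10) >> 31) & -7));

for (int b = 0; b < 16; b++)
{
    int mask = (b - 10) >> 31;   // -1 for digits (b < 10), 0 for letters (b >= 10)
    int correction = mask & -7;  // -7 for digits, 0 for letters
    Console.WriteLine($"{b,2}: mask={mask,2} correction={correction,2} -> '{NibbleToHex(b)}'");
}
```

Digits end up at 55 + b - 7 = 48 + b, and letters at 55 + b, with no branch and no table lookup.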
Upvotes: 165
Reputation: 1056
Updated on: 2022-04-17
using System;
string result = Convert.ToHexString(bytesToConvert);
The comparison from Thymine seems to be outdated and incomplete, especially after .NET 5 with its Convert.ToHexString, so I decided to ~~fall into the bytes to hex string rabbit hole~~ create a new, updated comparison with more methods from answers to both of these two questions.
I went with BenchmarkDotNet instead of a custom-made benchmarking script, which will, hopefully, make the results more accurate.
Remember that micro-benchmarks never fully represent a real workload, and you should run your own tests.
I ran these benchmarks on Linux (kernel 5.15.32) on an AMD Ryzen 5800H with 2x8 GB DDR4 @ 2133 MHz.
Be aware that the whole benchmark might take a long time to complete: around 40 minutes on my machine.
All methods mentioned (unless stated otherwise) focus on UPPERCASE output only. That means the output will look like B33F69, not b33f69.
The output from Convert.ToHexString is always uppercase. Still, thankfully there isn't any significant performance drop when paired with ToLower(), although both unsafe methods will be faster if that's your concern.
Making the string lowercase efficiently might be a challenge in some methods (especially the ones with bit-operator magic), but in most, it's enough to change the format specifier X2 to x2 or to change the letters from uppercase to lowercase in a mapping.
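The two ways of getting lowercase output mentioned above can be sketched as follows. This is a minimal illustration assuming .NET 5 or later, written as a top-level program; the variable names are made up for the example.

```csharp
using System;

byte[] data = { 0xB3, 0x3F, 0x69 };

// Convert.ToHexString always produces uppercase digits.
string upper = Convert.ToHexString(data);

// Lowercase by post-processing the uppercase result.
string lower = Convert.ToHexString(data).ToLowerInvariant();

// Lowercase by switching the per-byte format specifier from "X2" to "x2".
string viaFormat = string.Concat(Array.ConvertAll(data, b => b.ToString("x2")));

Console.WriteLine($"{upper} {lower} {viaFormat}"); // B33F69 b33f69 b33f69
```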
The table is sorted by Mean N=100, slowest first. The reference point (ratio 1.00) is the StringBuilderForEachByte method.
Method (means are in nanoseconds) | Mean N=10 | Ratio N=10 | Mean N=100 | Ratio N=100 | Mean N=500 | Ratio N=500 | Mean N=1k | Ratio N=1k | Mean N=10k | Ratio N=10k | Mean N=100k | Ratio N=100k |
---|---|---|---|---|---|---|---|---|---|---|---|---|
StringBuilderAggregateBytesAppendFormat | 364.92 | 1.48 | 3,680.00 | 1.74 | 18,928.33 | 1.86 | 38,362.94 | 1.87 | 380,994.74 | 1.72 | 42,618,861.57 | 1.62 |
StringBuilderForEachAppendFormat | 309.59 | 1.26 | 3,203.11 | 1.52 | 20,775.07 | 2.04 | 41,398.07 | 2.02 | 426,839.96 | 1.93 | 37,220,750.15 | 1.41 |
StringJoinSelect | 310.84 | 1.26 | 2,765.91 | 1.31 | 13,549.12 | 1.33 | 28,691.16 | 1.40 | 304,163.97 | 1.38 | 63,541,601.12 | 2.41 |
StringConcatSelect | 301.34 | 1.22 | 2,733.64 | 1.29 | 14,449.53 | 1.42 | 29,174.83 | 1.42 | 307,196.94 | 1.39 | 32,877,994.95 | 1.25 |
StringJoinArrayConvertAll | 279.21 | 1.13 | 2,608.71 | 1.23 | 13,305.96 | 1.30 | 27,207.12 | 1.32 | 295,589.61 | 1.34 | 62,950,871.38 | 2.39 |
StringBuilderAggregateBytesAppend | 276.18 | 1.12 | 2,599.62 | 1.23 | 12,788.11 | 1.25 | 26,043.54 | 1.27 | 255,389.06 | 1.16 | 27,664,344.41 | 1.05 |
StringConcatArrayConvertAll | 244.81 | 0.99 | 2,361.08 | 1.12 | 11,881.18 | 1.16 | 23,709.21 | 1.15 | 265,197.33 | 1.20 | 56,044,744.44 | 2.12 |
StringBuilderForEachByte | 246.09 | 1.00 | 2,112.77 | 1.00 | 10,200.36 | 1.00 | 20,540.77 | 1.00 | 220,993.95 | 1.00 | 26,387,941.13 | 1.00 |
StringBuilderForEachBytePreAllocated | 213.85 | 0.87 | 1,897.19 | 0.90 | 9,340.66 | 0.92 | 19,142.27 | 0.93 | 204,968.88 | 0.93 | 24,902,075.81 | 0.94 |
BitConverterReplace | 140.09 | 0.57 | 1,207.74 | 0.57 | 6,170.46 | 0.60 | 12,438.23 | 0.61 | 145,022.35 | 0.66 | 17,719,082.72 | 0.67 |
LookupPerNibble | 63.78 | 0.26 | 421.75 | 0.20 | 1,978.22 | 0.19 | 3,957.58 | 0.19 | 35,358.21 | 0.16 | 4,993,649.91 | 0.19 |
LookupAndShift | 53.22 | 0.22 | 311.56 | 0.15 | 1,461.15 | 0.14 | 2,924.11 | 0.14 | 26,180.11 | 0.12 | 3,771,827.62 | 0.14 |
WhilePropertyLookup | 41.83 | 0.17 | 308.59 | 0.15 | 1,473.10 | 0.14 | 2,925.66 | 0.14 | 28,440.28 | 0.13 | 5,060,341.10 | 0.19 |
LookupAndShiftAlphabetArray | 37.06 | 0.15 | 290.96 | 0.14 | 1,387.01 | 0.14 | 3,087.86 | 0.15 | 29,883.54 | 0.14 | 5,136,607.61 | 0.19 |
ByteManipulationDecimal | 35.29 | 0.14 | 251.69 | 0.12 | 1,180.38 | 0.12 | 2,347.56 | 0.11 | 22,731.55 | 0.10 | 4,645,593.05 | 0.18 |
ByteManipulationHexMultiply | 35.45 | 0.14 | 235.22 | 0.11 | 1,342.50 | 0.13 | 2,661.25 | 0.13 | 25,810.54 | 0.12 | 7,833,116.68 | 0.30 |
ByteManipulationHexIncrement | 36.43 | 0.15 | 234.31 | 0.11 | 1,345.38 | 0.13 | 2,737.89 | 0.13 | 26,413.92 | 0.12 | 7,820,224.57 | 0.30 |
WhileLocalLookup | 42.03 | 0.17 | 223.59 | 0.11 | 1,016.93 | 0.10 | 1,979.24 | 0.10 | 19,360.07 | 0.09 | 4,150,234.71 | 0.16 |
LookupAndShiftAlphabetSpan | 30.00 | 0.12 | 216.51 | 0.10 | 1,020.65 | 0.10 | 2,316.99 | 0.11 | 22,357.13 | 0.10 | 4,580,277.95 | 0.17 |
LookupAndShiftAlphabetSpanMultiply | 29.04 | 0.12 | 207.38 | 0.10 | 985.94 | 0.10 | 2,259.29 | 0.11 | 22,287.12 | 0.10 | 4,563,518.13 | 0.17 |
LookupPerByte | 32.45 | 0.13 | 205.84 | 0.10 | 951.30 | 0.09 | 1,906.27 | 0.09 | 18,311.03 | 0.08 | 3,908,692.66 | 0.15 |
LookupSpanPerByteSpan | 25.69 | 0.10 | 184.29 | 0.09 | 863.79 | 0.08 | 2,035.55 | 0.10 | 19,448.30 | 0.09 | 4,086,961.29 | 0.15 |
LookupPerByteSpan | 27.03 | 0.11 | 184.26 | 0.09 | 866.03 | 0.08 | 2,005.34 | 0.10 | 19,760.55 | 0.09 | 4,192,457.14 | 0.16 |
Lookup32SpanUnsafeDirect | 16.90 | 0.07 | 99.20 | 0.05 | 436.66 | 0.04 | 895.23 | 0.04 | 8,266.69 | 0.04 | 1,506,058.05 | 0.06 |
Lookup32UnsafeDirect | 16.51 | 0.07 | 98.64 | 0.05 | 436.49 | 0.04 | 878.28 | 0.04 | 8,278.18 | 0.04 | 1,753,655.67 | 0.07 |
ConvertToHexString | 19.27 | 0.08 | 64.83 | 0.03 | 295.15 | 0.03 | 585.86 | 0.03 | 5,445.73 | 0.02 | 1,478,363.32 | 0.06 |
ConvertToHexString.ToLower() | 45.66 | - | 175.16 | - | 787.86 | - | 1,516.65 | - | 13,939.71 | - | 2,620,046.76 | - |
The method ConvertToHexString is the fastest in every column except N=10, and in my view, it should always be used if you have the option: it's swift and clean.
using System;
string result = Convert.ToHexString(bytesToConvert);
If not, I decided to highlight two other methods I consider worthy below.
I decided not to highlight the unsafe methods, since such code is not only, well, unsafe, but most projects I've worked with also don't allow it.
The first one is LookupPerByteSpan.
The code is almost identical to the code in LookupPerByte by CodesInChaos from this answer. This one is the fastest non-unsafe method benchmarked. The difference between the original and this one is the use of stack allocation for shorter inputs (up to 512 bytes). This makes this method around 10 % faster on these inputs but around 5 % slower on larger ones. Since most of the data I work with is short rather than long, I opted for this one. LookupSpanPerByteSpan is also very fast, but the code size of its ReadOnlySpan<byte> mapping is too large compared to all other methods.
private static readonly uint[] Lookup32 = Enumerable.Range(0, 256).Select(i =>
{
string s = i.ToString("X2");
return s[0] + ((uint)s[1] << 16);
}).ToArray();
public string ToHexString(byte[] bytes)
{
var result = bytes.Length * 2 <= 1024
? stackalloc char[bytes.Length * 2]
: new char[bytes.Length * 2];
for (int i = 0; i < bytes.Length; i++)
{
var val = Lookup32[bytes[i]];
result[2 * i] = (char)val;
result[2 * i + 1] = (char)(val >> 16);
}
return new string(result);
}
The second one is LookupAndShiftAlphabetSpanMultiply.
First, I would like to mention that this one is my creation. However, I believe this method is not only pretty fast but also simple to understand.
The speed comes from a change that happened in C# 7.3, where ReadOnlySpan<byte> properties returning a constant array initialization - new byte[] {1, 2, 3, ...} - are compiled into the program's static data, therefore avoiding a redundant memory allocation. [source]
private static ReadOnlySpan<byte> HexAlphabetSpan => new[]
{
(byte)'0', (byte)'1', (byte)'2', (byte)'3',
(byte)'4', (byte)'5', (byte)'6', (byte)'7',
(byte)'8', (byte)'9', (byte)'A', (byte)'B',
(byte)'C', (byte)'D', (byte)'E', (byte)'F'
};
public static string ToHexString(byte[] bytes)
{
var res = bytes.Length * 2 <= 1024 ? stackalloc char[bytes.Length * 2] : new char[bytes.Length * 2];
for (var i = 0; i < bytes.Length; ++i)
{
var j = i * 2;
res[j] = (char)HexAlphabetSpan[bytes[i] >> 4];
res[j + 1] = (char)HexAlphabetSpan[bytes[i] & 0xF];
}
return new string(res);
}
The source code for all methods, the benchmark, and this answer can be found here as a Gist on my GitHub.
Upvotes: 39
Reputation: 1133
Expanding on the BigInteger approach (Gregory Morse mentioned it above): I can't comment on efficiency, and it uses LINQ's Reverse(), but it's small and built in.
// To hex
byte[] bytes = System.Text.Encoding.UTF8.GetBytes("Test String!£");
string hexString = new System.Numerics.BigInteger(bytes.Reverse().ToArray()).ToString("x2");
// From hex
byte[] fromHexBytes = System.Numerics.BigInteger.Parse(hexString, System.Globalization.NumberStyles.HexNumber).ToByteArray().Reverse().ToArray();
// Unit test
CollectionAssert.AreEqual(bytes, fromHexBytes);
Upvotes: 0
Reputation: 4299
Here's my purely binary solution (in Java), which needs no library lookup and also supports upper/lower case:
public static String encode(byte[] bytes, boolean uppercase) {
char[] result = new char[2 * bytes.length];
for (int i = 0; i < bytes.length; i++) {
byte word = bytes[i];
byte left = (byte) ((0XF0 & word) >>> 4);
byte right = (byte) ((byte) 0X0F & word);
int resultIndex = i * 2;
result[resultIndex] = encode(left, uppercase);
result[resultIndex + 1] = encode(right, uppercase);
}
return new String(result);
}
public static char encode(byte value, boolean uppercase) {
int characterCase = uppercase ? 0 : 32;
if (value > 15 || value < 0) {
return '0';
}
if (value > 9) {
return (char) (value + 0x37 | characterCase);
}
return (char) (value + 0x30);
}
Upvotes: 0
Reputation: 338326
You can use Convert.ToHexString starting with .NET 5.
There's also a method for the reverse operation: Convert.FromHexString.
For older versions of .NET you can either use:
public static string ByteArrayToString(byte[] ba)
{
StringBuilder hex = new StringBuilder(ba.Length * 2);
foreach (byte b in ba)
hex.AppendFormat("{0:x2}", b);
return hex.ToString();
}
or:
public static string ByteArrayToString(byte[] ba)
{
return BitConverter.ToString(ba).Replace("-","");
}
There are even more variants of doing it, for example here.
The reverse conversion would go like this:
public static byte[] StringToByteArray(String hex)
{
int NumberChars = hex.Length;
byte[] bytes = new byte[NumberChars / 2];
for (int i = 0; i < NumberChars; i += 2)
bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
return bytes;
}
Using Substring is the best option in combination with Convert.ToByte. See this answer for more information. If you need better performance, you must avoid Convert.ToByte before you can drop Substring.
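The Substring-based loop above does not check for odd-length input up front. The following is a minimal sketch of a reverse conversion that avoids both Substring and Convert.ToByte and rejects malformed input; HexToBytes and HexDigitValue are hypothetical names chosen for this illustration, and the choice of FormatException is an assumption.

```csharp
using System;

// Hypothetical helper: parses two hex digits per byte without allocating substrings.
byte[] HexToBytes(string hex)
{
    if (hex is null) throw new ArgumentNullException(nameof(hex));
    if (hex.Length % 2 != 0)
        throw new FormatException("Hex string must have an even number of characters.");

    var bytes = new byte[hex.Length / 2];
    for (int i = 0; i < bytes.Length; i++)
        bytes[i] = (byte)((HexDigitValue(hex[2 * i]) << 4) | HexDigitValue(hex[2 * i + 1]));
    return bytes;
}

// Hypothetical helper: value of one hex digit, accepting both cases.
int HexDigitValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    throw new FormatException($"Invalid hex character '{c}'.");
}

Console.WriteLine(BitConverter.ToString(HexToBytes("00ff7F"))); // 00-FF-7F
```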
Upvotes: 1735
Reputation: 777
To convert byte[] (byte array) to a hexadecimal string, use System.Convert.ToHexString:
var myBytes = new byte[100];
var myString = System.Convert.ToHexString(myBytes);
To convert a hexadecimal string to byte[], use System.Convert.FromHexString:
var myString = "E10B116E8530A340BCC7B3EAC208487B";
var myBytes = System.Convert.FromHexString(myString);
Upvotes: 26
Reputation: 26
I noticed that most of the tests were performed on functions that convert a byte array to a hex string. So, in this post I will focus on the other direction: functions that convert a hex string to a byte array. If you are interested in the results only, you can skip down to the Summary section. The test code file is supplied at the end of the post.
I will call the function from the accepted answer (by Tomalak) StringToByteArrayV1, or V1 for short. The rest of the functions are named in the same way: V2, V3, V4, and so on.
I tested correctness by passing all 256 possible values of one byte and checking that the output is correct. Result:
note: V5_3 solves this issue (of V5_1 and V5_2)
I ran the performance tests using the Stopwatch class.
input length: 10,000,000 bytes
runs: 100
average elapsed time per run:
V1 = 136.4ms
V2 = 104.5ms
V3 = 22.0ms
V4 = 9.9ms
V5_1 = 10.2ms
V5_2 = 9.0ms
V5_3 = 9.3ms
V6 = 18.3ms
V7 = 9.8ms
V8 = 8.8ms
V9 = 10.2ms
V10 = 19.0ms
V11 = 12.2ms
V12 = 27.4ms
V13 = 21.8ms
V14 = 12.0ms
V15 = 14.9ms
V16 = 15.3ms
V17 = 9.5ms
V18 got excluded from this test, because it was very slow when using very long string
V19 = 222.8ms
V20 = 66.0ms
V21 = 15.4ms
V1 average ticks per run: 1363529.4
V2 is more fast than V1 by: 1.3 times (ticks ratio)
V3 is more fast than V1 by: 6.2 times (ticks ratio)
V4 is more fast than V1 by: 13.8 times (ticks ratio)
V5_1 is more fast than V1 by: 13.3 times (ticks ratio)
V5_2 is more fast than V1 by: 15.2 times (ticks ratio)
V5_3 is more fast than V1 by: 14.8 times (ticks ratio)
V6 is more fast than V1 by: 7.4 times (ticks ratio)
V7 is more fast than V1 by: 13.9 times (ticks ratio)
V8 is more fast than V1 by: 15.4 times (ticks ratio)
V9 is more fast than V1 by: 13.4 times (ticks ratio)
V10 is more fast than V1 by: 7.2 times (ticks ratio)
V11 is more fast than V1 by: 11.1 times (ticks ratio)
V12 is more fast than V1 by: 5.0 times (ticks ratio)
V13 is more fast than V1 by: 6.3 times (ticks ratio)
V14 is more fast than V1 by: 11.4 times (ticks ratio)
V15 is more fast than V1 by: 9.2 times (ticks ratio)
V16 is more fast than V1 by: 8.9 times (ticks ratio)
V17 is more fast than V1 by: 14.4 times (ticks ratio)
V19 is more SLOW than V1 by: 1.6 times (ticks ratio)
V20 is more fast than V1 by: 2.1 times (ticks ratio)
V21 is more fast than V1 by: 8.9 times (ticks ratio)
V18 took long time at the previous test,
so let's decrease length for it:
input length: 1,000,000 bytes
runs: 100
average elapsed time per run: V1 = 14.1ms , V18 = 146.7ms
V1 average ticks per run: 140630.3
V18 is more SLOW than V1 by: 10.4 times (ticks ratio)
input length: 100 byte
runs: 1,000,000
V1 average ticks per run: 14.6
V2 is more fast than V1 by: 1.4 times (ticks ratio)
V3 is more fast than V1 by: 5.9 times (ticks ratio)
V4 is more fast than V1 by: 15.7 times (ticks ratio)
V5_1 is more fast than V1 by: 15.1 times (ticks ratio)
V5_2 is more fast than V1 by: 18.4 times (ticks ratio)
V5_3 is more fast than V1 by: 16.3 times (ticks ratio)
V6 is more fast than V1 by: 5.3 times (ticks ratio)
V7 is more fast than V1 by: 15.7 times (ticks ratio)
V8 is more fast than V1 by: 18.0 times (ticks ratio)
V9 is more fast than V1 by: 15.5 times (ticks ratio)
V10 is more fast than V1 by: 7.8 times (ticks ratio)
V11 is more fast than V1 by: 12.4 times (ticks ratio)
V12 is more fast than V1 by: 5.3 times (ticks ratio)
V13 is more fast than V1 by: 5.2 times (ticks ratio)
V14 is more fast than V1 by: 13.4 times (ticks ratio)
V15 is more fast than V1 by: 9.9 times (ticks ratio)
V16 is more fast than V1 by: 9.2 times (ticks ratio)
V17 is more fast than V1 by: 16.2 times (ticks ratio)
V18 is more fast than V1 by: 1.1 times (ticks ratio)
V19 is more SLOW than V1 by: 1.6 times (ticks ratio)
V20 is more fast than V1 by: 1.9 times (ticks ratio)
V21 is more fast than V1 by: 11.4 times (ticks ratio)
It is a good idea to read the Disclaimer section further down in this post before using any of the following code: https://github.com/Ghosticollis/performance-tests/blob/main/MTestPerformance.cs
I recommend using one of the following functions because of their good performance and support for both upper and lower case:
Here is the final shape of V5_3:
static byte[] HexStringToByteArrayV5_3(string hexString) {
int hexStringLength = hexString.Length;
byte[] b = new byte[hexStringLength / 2];
for (int i = 0; i < hexStringLength; i += 2) {
int topChar = hexString[i];
topChar = (topChar > 0x40 ? (topChar & ~0x20) - 0x37 : topChar - 0x30) << 4;
int bottomChar = hexString[i + 1];
bottomChar = bottomChar > 0x40 ? (bottomChar & ~0x20) - 0x37 : bottomChar - 0x30;
b[i / 2] = (byte)(topChar + bottomChar);
}
return b;
}
WARNING: I don't have proper knowledge of testing. The main purpose of these primitive tests is to give a quick overview of which of the posted functions might be good. If you need accurate results, please use proper testing tools.
Finally, I would like to say that I am new to being active on Stack Overflow; sorry if my post is lacking. Comments to improve this post would be appreciated.
Upvotes: 6
Reputation: 1539
.NET 5 has added the Convert.ToHexString method.
For those using an older version of .NET
internal static class ByteArrayExtensions
{
public static string ToHexString(this byte[] bytes, Casing casing = Casing.Upper)
{
Span<char> result = stackalloc char[0];
if (bytes.Length > 16)
{
var array = new char[bytes.Length * 2];
result = array.AsSpan();
}
else
{
result = stackalloc char[bytes.Length * 2];
}
int pos = 0;
foreach (byte b in bytes)
{
ToCharsBuffer(b, result, pos, casing);
pos += 2;
}
return result.ToString();
}
private static void ToCharsBuffer(byte value, Span<char> buffer, int startingIndex = 0, Casing casing = Casing.Upper)
{
uint difference = (((uint)value & 0xF0U) << 4) + ((uint)value & 0x0FU) - 0x8989U;
uint packedResult = ((((uint)(-(int)difference) & 0x7070U) >> 4) + difference + 0xB9B9U) | (uint)casing;
buffer[startingIndex + 1] = (char)(packedResult & 0xFF);
buffer[startingIndex] = (char)(packedResult >> 8);
}
}
public enum Casing : uint
{
// Output [ '0' .. '9' ] and [ 'A' .. 'F' ].
Upper = 0,
// Output [ '0' .. '9' ] and [ 'a' .. 'f' ].
Lower = 0x2020U,
}
Adapted from the .NET repository:
https://github.com/dotnet/runtime/blob/v5.0.3/src/libraries/System.Private.CoreLib/src/System/Convert.cs
https://github.com/dotnet/runtime/blob/v5.0.3/src/libraries/Common/src/System/HexConverter.cs
Upvotes: 7
Reputation:
Fastest method for old school people... miss you pointers
static public byte[] HexStrToByteArray(string str)
{
byte[] res = new byte[(str.Length % 2 != 0 ? 0 : str.Length / 2)]; //check and allocate memory
for (int i = 0, j = 0; j < res.Length; i += 2, j++) //convert loop
res[j] = (byte)((str[i] % 32 + 9) % 25 * 16 + (str[i + 1] % 32 + 9) % 25);
return res;
}
Upvotes: 8
Reputation: 663
As of .NET 5 RC2 you can use:
Convert.ToHexString(byte[] inArray), which returns a string, and
Convert.FromHexString(string s), which returns a byte[].
Overloads are available that take span parameters.
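As a minimal sketch of the span overloads mentioned above (assuming .NET 5 or later, written as a top-level program; the sample data is made up), slices of an array or of a larger string can be converted without copying them out first:

```csharp
using System;

byte[] packet = { 0xDE, 0xAD, 0xBE, 0xEF, 0x00, 0x01 };

// Encode only the first four bytes via the ReadOnlySpan<byte> overload.
string hex = Convert.ToHexString(packet.AsSpan(0, 4));

// Decode the hex part of a larger string via the ReadOnlySpan<char> overload.
string line = "id=DEADBEEF";
byte[] decoded = Convert.FromHexString(line.AsSpan(3));

Console.WriteLine($"{hex} {decoded.Length}"); // DEADBEEF 4
```

Convert.FromHexString throws a FormatException for odd-length input or non-hex characters, so untrusted input should be handled accordingly.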
Upvotes: 43
Reputation: 1069
Combined a few answers into a class for my later copy and paste convenience:
/// <summary>
/// Extension methods to quickly convert byte array to string and back.
/// </summary>
public static class HexConverter
{
/// <summary>
/// Map values to hex digits
/// </summary>
private static readonly char[] HexDigits =
{
'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
};
/// <summary>
/// Map the 55 characters in ['0', 'f'] to their hex values, and set invalid characters
/// such that they will overflow byte to fail conversion.
/// </summary>
private static readonly ushort[] HexValues =
{
0x0000, 0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x0007, 0x0008, 0x0009, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100,
0x000A, 0x000B, 0x000C, 0x000D, 0x000E, 0x000F, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100,
0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x0100, 0x000A, 0x000B,
0x000C, 0x000D, 0x000E, 0x000F
};
/// <summary>
/// Empty byte array
/// </summary>
private static readonly byte[] Empty = new byte[0];
/// <summary>
/// Convert a byte array to a hexadecimal string.
/// </summary>
/// <param name="bytes">
/// The input byte array.
/// </param>
/// <returns>
/// A string of hexadecimal digits.
/// </returns>
public static string ToHexString(this byte[] bytes)
{
var c = new char[bytes.Length * 2];
for (int i = 0, j = 0; i < bytes.Length; i++)
{
c[j++] = HexDigits[bytes[i] >> 4];
c[j++] = HexDigits[bytes[i] & 0x0F];
}
return new string(c);
}
/// <summary>
/// Parse a string of hexadecimal digits into a byte array.
/// </summary>
/// <param name="hexadecimalString">
/// The hexadecimal string.
/// </param>
/// <returns>
/// The parsed <see cref="byte[]"/> array.
/// </returns>
/// <exception cref="ArgumentException">
/// The input string either contained invalid characters, or was of an odd length.
/// </exception>
public static byte[] ToByteArray(string hexadecimalString)
{
if (!TryParse(hexadecimalString, out var value))
{
throw new ArgumentException("Invalid hexadecimal string", nameof(hexadecimalString));
}
return value;
}
/// <summary>
/// Parse a hexadecimal string to bytes
/// </summary>
/// <param name="hexadecimalString">
/// The hexadecimal string, which must be an even number of characters.
/// </param>
/// <param name="value">
/// The parsed value if successful.
/// </param>
/// <returns>
/// True if successful.
/// </returns>
public static bool TryParse(string hexadecimalString, out byte[] value)
{
if (hexadecimalString.Length == 0)
{
value = Empty;
return true;
}
if (hexadecimalString.Length % 2 != 0)
{
value = Empty;
return false;
}
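try
{
value = new byte[hexadecimalString.Length / 2];
for (int i = 0, j = 0; j < hexadecimalString.Length; i++)
{
// checked is required: by default the byte cast truncates silently, so the
// 0x0100 marker for invalid characters would never raise OverflowException.
value[i] = checked((byte)((HexValues[hexadecimalString[j++] - '0'] << 4)
| HexValues[hexadecimalString[j++] - '0']));
}
return true;
}
catch (Exception e) when (e is OverflowException || e is IndexOutOfRangeException)
{
// OverflowException: invalid character inside the table range.
// IndexOutOfRangeException: character outside ['0', 'f'].
value = Empty;
return false;
}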
}
}
Upvotes: 1
Reputation: 9633
With Java 8, we can use Byte.toUnsignedInt:
public static String convertBytesToHex(byte[] bytes) {
StringBuilder result = new StringBuilder();
for (byte byt : bytes) {
int decimal = Byte.toUnsignedInt(byt);
String hex = String.format("%02x", decimal); // zero-padded: 0x0A becomes "0a", not "a"
result.append(hex);
}
return result.toString();
}
Upvotes: -3
Reputation: 445
There is a simple one-liner solution not yet mentioned that will convert hex strings into byte arrays (we don't care about negative interpretation here as it does not matter):
BigInteger.Parse(str, System.Globalization.NumberStyles.HexNumber).ToByteArray().Reverse().ToArray();
Upvotes: 2
Reputation: 9428
From Microsoft's developers, a nice, simple conversion:
public static string ByteArrayToString(byte[] ba)
{
// Concatenate the bytes into one long string
return ba.Aggregate(new StringBuilder(32),
(sb, b) => sb.Append(b.ToString("X2"))
).ToString();
}
While the above is clean and compact, performance junkies will scream about it using enumerators. You can get peak performance with an improved version of Tomalak's original answer:
public static string ByteArrayToString(byte[] ba)
{
StringBuilder hex = new StringBuilder(ba.Length * 2);
for(int i=0; i < ba.Length; i++) // <-- Use for loop is faster than foreach
hex.Append(ba[i].ToString("X2")); // <-- ToString is faster than AppendFormat
return hex.ToString();
}
This is the fastest of all the routines I've seen posted here so far. Don't just take my word for it... performance test each routine and inspect its CIL code for yourself.
Upvotes: 9
Reputation: 5383
The shortest way, also supported on .NET Core:
public static string BytesToString(byte[] ba) =>
ba.Aggregate(new StringBuilder(32), (sb, b) => sb.Append(b.ToString("X2"))).ToString();
Upvotes: 3
Reputation: 725
// a safe version of the lookup solution:
public static string ByteArrayToHexViaLookup32Safe(byte[] bytes, bool withZeroX)
{
if (bytes.Length == 0)
{
return withZeroX ? "0x" : "";
}
int length = bytes.Length * 2 + (withZeroX ? 2 : 0);
StateSmall stateToPass = new StateSmall(bytes, withZeroX);
return string.Create(length, stateToPass, (chars, state) =>
{
int offset0x = 0;
if (state.WithZeroX)
{
chars[0] = '0';
chars[1] = 'x';
offset0x += 2;
}
Span<uint> charsAsInts = MemoryMarshal.Cast<char, uint>(chars.Slice(offset0x));
int targetLength = state.Bytes.Length;
for (int i = 0; i < targetLength; i += 1)
{
uint val = Lookup32[state.Bytes[i]];
charsAsInts[i] = val;
}
});
}
private struct StateSmall
{
public StateSmall(byte[] bytes, bool withZeroX)
{
Bytes = bytes;
WithZeroX = withZeroX;
}
public byte[] Bytes;
public bool WithZeroX;
}
Upvotes: 1
Reputation: 5573
I came up with different code that is tolerant of extra characters (whitespace, dashes...). It is mostly inspired by some acceptably fast answers here. It allows parsing of the following "file":
00-aa-84-fb
12 32 FF CD
12 00
12_32_FF_CD
1200d5e68a
/// <summary>Reads a hex string into bytes</summary>
public static IEnumerable<byte> HexadecimalStringToBytes(string hex) {
if (hex == null)
throw new ArgumentNullException(nameof(hex));
char c, c1 = default(char);
bool hasc1 = false;
unchecked {
for (int i = 0; i < hex.Length; i++) {
c = hex[i];
bool isValid = 'A' <= c && c <= 'F' || 'a' <= c && c <= 'f' || '0' <= c && c <= '9';
if (!hasc1) {
if (isValid) {
hasc1 = true;
}
} else {
hasc1 = false;
if (isValid) {
yield return (byte)((GetHexVal(c1) << 4) + GetHexVal(c));
}
}
c1 = c;
}
}
}
/// <summary>Reads a hex string into a byte array</summary>
public static byte[] HexadecimalStringToByteArray(string hex)
{
if (hex == null)
throw new ArgumentNullException(nameof(hex));
var bytes = new List<byte>(hex.Length / 2);
foreach (var item in HexadecimalStringToBytes(hex)) {
bytes.Add(item);
}
return bytes.ToArray();
}
private static byte GetHexVal(char val)
{
return (byte)(val - (val < 0x3A ? 0x30 : val < 0x5B ? 0x37 : 0x57));
// ^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^ ^^^^
// digits 0-9 upper char A-Z a-z
}
Please refer to full code when copying. Unit tests included.
Some might say it is too tolerant of extra characters. So don't rely on this code to perform validation (or change it).
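For comparison, here is a minimal sketch of a simpler tolerant parser. It is not the code above but a different variant: it drops every non-hex character and then pairs up the remaining digits, so "1-2" yields the byte 0x12, whereas the code above only pairs adjacent characters. ParseLooseHex is a hypothetical name; Uri.IsHexDigit and Uri.FromHex are existing .NET helpers.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant: ignores all separators, then combines digits two at a time.
byte[] ParseLooseHex(string text)
{
    var bytes = new List<byte>(text.Length / 2);
    int pending = -1;                      // first nibble of the current pair, or -1
    foreach (char c in text)
    {
        if (!Uri.IsHexDigit(c)) continue;  // skip whitespace, dashes, underscores, ...
        int value = Uri.FromHex(c);        // 0..15
        if (pending < 0)
        {
            pending = value;
        }
        else
        {
            bytes.Add((byte)((pending << 4) | value));
            pending = -1;
        }
    }
    return bytes.ToArray();                // a trailing unpaired digit is ignored
}

Console.WriteLine(BitConverter.ToString(ParseLooseHex("00-aa-84-fb\n12_32 FF CD")));
// 00-AA-84-FB-12-32-FF-CD
```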
Upvotes: 1
Reputation: 26565
Note: new leader as of 2015-08-20.
I ran each of the various conversion methods through some crude Stopwatch performance testing, a run with a random sentence (n=61, 1000 iterations) and a run with a Project Gutenberg text (n=1,238,957, 150 iterations). Here are the results, roughly from fastest to slowest. All measurements are in ticks (10,000 ticks = 1 ms) and all relative notes are compared to the [slowest] StringBuilder implementation. For the code used, see below or the test framework repo where I now maintain the code for running this.
WARNING: Do not rely on these stats for anything concrete; they are simply a sample run of sample data. If you really need top-notch performance, please test these methods in an environment representative of your production needs with data representative of what you will use.
unsafe (via CodesInChaos) (added to test repo by airbreather)
BitConverter (via Tomalak)
{SoapHexBinary}.ToString (via Mykroft)
{byte}.ToString("X2") (using foreach) (derived from Will Dean's answer)
{byte}.ToString("X2") (using {IEnumerable}.Aggregate, requires System.Linq) (via Mark)
Array.ConvertAll (using string.Join) (via Will Dean)
Array.ConvertAll (using string.Concat, requires .NET 4.0) (via Will Dean)
{StringBuilder}.AppendFormat (using foreach) (via Tomalak)
{StringBuilder}.AppendFormat (using {IEnumerable}.Aggregate, requires System.Linq) (derived from Tomalak's answer)
Lookup tables have taken the lead over byte manipulation. Basically, there is some form of precomputing what any given nibble or byte will be in hex. Then, as you rip through the data, you simply look up the next portion to see what hex string it would be. That value is then added to the resulting string output in some fashion. For a long time byte manipulation, potentially harder to read by some developers, was the top-performing approach.
Your best bet is still going to be finding some representative data and trying it out in a production-like environment. If you have different memory constraints, you may prefer a method with fewer allocations to one that would be faster but consume more memory.
Feel free to play with the testing code I used. A version is included here but feel free to clone the repo and add your own methods. Please submit a pull request if you find anything interesting or want to help improve the testing framework it uses.
Add the new static method (Func<byte[], string>) to /Tests/ConvertByteArrayToHexString/Test.cs.
Add that method's name to the TestCandidates return value in that same class.
Select the test input you want in GenerateTestInput in that same class.
static string ByteArrayToHexStringViaStringJoinArrayConvertAll(byte[] bytes) {
return string.Join(string.Empty, Array.ConvertAll(bytes, b => b.ToString("X2")));
}
static string ByteArrayToHexStringViaStringConcatArrayConvertAll(byte[] bytes) {
return string.Concat(Array.ConvertAll(bytes, b => b.ToString("X2")));
}
static string ByteArrayToHexStringViaBitConverter(byte[] bytes) {
string hex = BitConverter.ToString(bytes);
return hex.Replace("-", "");
}
static string ByteArrayToHexStringViaStringBuilderAggregateByteToString(byte[] bytes) {
return bytes.Aggregate(new StringBuilder(bytes.Length * 2), (sb, b) => sb.Append(b.ToString("X2"))).ToString();
}
static string ByteArrayToHexStringViaStringBuilderForEachByteToString(byte[] bytes) {
StringBuilder hex = new StringBuilder(bytes.Length * 2);
foreach (byte b in bytes)
hex.Append(b.ToString("X2"));
return hex.ToString();
}
static string ByteArrayToHexStringViaStringBuilderAggregateAppendFormat(byte[] bytes) {
return bytes.Aggregate(new StringBuilder(bytes.Length * 2), (sb, b) => sb.AppendFormat("{0:X2}", b)).ToString();
}
static string ByteArrayToHexStringViaStringBuilderForEachAppendFormat(byte[] bytes) {
StringBuilder hex = new StringBuilder(bytes.Length * 2);
foreach (byte b in bytes)
hex.AppendFormat("{0:X2}", b);
return hex.ToString();
}
static string ByteArrayToHexViaByteManipulation(byte[] bytes) {
char[] c = new char[bytes.Length * 2];
byte b;
for (int i = 0; i < bytes.Length; i++) {
b = ((byte)(bytes[i] >> 4));
c[i * 2] = (char)(b > 9 ? b + 0x37 : b + 0x30);
b = ((byte)(bytes[i] & 0xF));
c[i * 2 + 1] = (char)(b > 9 ? b + 0x37 : b + 0x30);
}
return new string(c);
}
static string ByteArrayToHexViaByteManipulation2(byte[] bytes) {
char[] c = new char[bytes.Length * 2];
int b;
for (int i = 0; i < bytes.Length; i++) {
b = bytes[i] >> 4;
c[i * 2] = (char)(55 + b + (((b - 10) >> 31) & -7));
b = bytes[i] & 0xF;
c[i * 2 + 1] = (char)(55 + b + (((b - 10) >> 31) & -7));
}
return new string(c);
}
static string ByteArrayToHexViaSoapHexBinary(byte[] bytes) {
SoapHexBinary soapHexBinary = new SoapHexBinary(bytes);
return soapHexBinary.ToString();
}
static string ByteArrayToHexViaLookupAndShift(byte[] bytes) {
StringBuilder result = new StringBuilder(bytes.Length * 2);
string hexAlphabet = "0123456789ABCDEF";
foreach (byte b in bytes) {
result.Append(hexAlphabet[(int)(b >> 4)]);
result.Append(hexAlphabet[(int)(b & 0xF)]);
}
return result.ToString();
}
// Pointer types need an unsafe context (compile with /unsafe).
static readonly unsafe uint* _lookup32UnsafeP = (uint*)GCHandle.Alloc(_Lookup32, GCHandleType.Pinned).AddrOfPinnedObject();
static unsafe string ByteArrayToHexViaLookup32UnsafeDirect(byte[] bytes) {
var lookupP = _lookup32UnsafeP;
var result = new string((char)0, bytes.Length * 2);
fixed (byte* bytesP = bytes)
fixed (char* resultP = result) {
uint* resultP2 = (uint*)resultP;
for (int i = 0; i < bytes.Length; i++) {
resultP2[i] = lookupP[bytesP[i]];
}
}
return result;
}
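// One entry per byte value (0..255): Enumerable.Range takes (start, count), so the count must be 256.
// Declare this before _lookup32UnsafeP, because static field initializers run in textual order.
static uint[] _Lookup32 = Enumerable.Range(0, 256).Select(i => {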
string s = i.ToString("X2");
return ((uint)s[0]) + ((uint)s[1] << 16);
}).ToArray();
static string ByteArrayToHexViaLookupPerByte(byte[] bytes) {
var result = new char[bytes.Length * 2];
for (int i = 0; i < bytes.Length; i++)
{
var val = _Lookup32[bytes[i]];
result[2*i] = (char)val;
result[2*i + 1] = (char) (val >> 16);
}
return new string(result);
}
static string ByteArrayToHexViaLookup(byte[] bytes) {
string[] hexStringTable = new string[] {
"00", "01", "02", "03", "04", "05", "06", "07", "08", "09", "0A", "0B", "0C", "0D", "0E", "0F",
"10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "1A", "1B", "1C", "1D", "1E", "1F",
"20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "2A", "2B", "2C", "2D", "2E", "2F",
"30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "3A", "3B", "3C", "3D", "3E", "3F",
"40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "4A", "4B", "4C", "4D", "4E", "4F",
"50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "5A", "5B", "5C", "5D", "5E", "5F",
"60", "61", "62", "63", "64", "65", "66", "67", "68", "69", "6A", "6B", "6C", "6D", "6E", "6F",
"70", "71", "72", "73", "74", "75", "76", "77", "78", "79", "7A", "7B", "7C", "7D", "7E", "7F",
"80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "8A", "8B", "8C", "8D", "8E", "8F",
"90", "91", "92", "93", "94", "95", "96", "97", "98", "99", "9A", "9B", "9C", "9D", "9E", "9F",
"A0", "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "AA", "AB", "AC", "AD", "AE", "AF",
"B0", "B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9", "BA", "BB", "BC", "BD", "BE", "BF",
"C0", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "CA", "CB", "CC", "CD", "CE", "CF",
"D0", "D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8", "D9", "DA", "DB", "DC", "DD", "DE", "DF",
"E0", "E1", "E2", "E3", "E4", "E5", "E6", "E7", "E8", "E9", "EA", "EB", "EC", "ED", "EE", "EF",
"F0", "F1", "F2", "F3", "F4", "F5", "F6", "F7", "F8", "F9", "FA", "FB", "FC", "FD", "FE", "FF",
};
StringBuilder result = new StringBuilder(bytes.Length * 2);
foreach (byte b in bytes) {
result.Append(hexStringTable[b]);
}
return result.ToString();
}
Added Waleed's answer to analysis. Quite fast.
Added string.Concat Array.ConvertAll variant for completeness (requires .NET 4.0). On par with string.Join version.
Test repo includes more variants such as StringBuilder.Append(b.ToString("X2")). None upset the results any. foreach is faster than {IEnumerable}.Aggregate, for instance, but BitConverter still wins.
Added Mykroft's SoapHexBinary answer to analysis, which took over third place.
Added CodesInChaos's byte manipulation answer, which took over first place (by a large margin on large blocks of text).
Added Nathan Moinvaziri's lookup answer and the variant from Brian Lambert's blog. Both rather fast, but not taking the lead on the test machine I used (AMD Phenom 9750).
Added @CodesInChaos's new byte-based lookup answer. It appears to have taken the lead on both the sentence tests and the full-text tests.
Added airbreather's optimizations and unsafe variant to this answer's repo. If you want to play in the unsafe game, you can get some huge performance gains over any of the prior top winners on both short strings and large texts.
Upvotes: 558
Reputation: 773
There is also XmlWriter.WriteBinHex (see the MSDN page). This is very useful if you need to put the hexadecimal string into an XML stream.
Here is a standalone method to see how it works:
public static string ToBinHex(byte[] bytes)
{
XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
xmlWriterSettings.ConformanceLevel = ConformanceLevel.Fragment;
xmlWriterSettings.CheckCharacters = false;
xmlWriterSettings.Encoding = ASCIIEncoding.ASCII;
MemoryStream memoryStream = new MemoryStream();
using (XmlWriter xmlWriter = XmlWriter.Create(memoryStream, xmlWriterSettings))
{
xmlWriter.WriteBinHex(bytes, 0, bytes.Length);
}
return Encoding.ASCII.GetString(memoryStream.ToArray());
}
Upvotes: 1
Reputation: 7261
This is an answer to revision 4 of Tomalak's highly popular answer (and subsequent edits).
I'll make the case that this edit is wrong, and explain why it could be reverted. Along the way, you might learn a thing or two about some internals, and see yet another example of what premature optimization really is and how it can bite you.
tl;dr: Just use Convert.ToByte and String.Substring if you're in a hurry ("Original code" below); it's the best combination if you don't want to re-implement Convert.ToByte. Use something more advanced (see other answers) that doesn't use Convert.ToByte if you need performance. Do not use anything other than String.Substring in combination with Convert.ToByte, unless someone has something interesting to say about this in the comments of this answer.
warning: This answer may become obsolete if a Convert.ToByte(char[], Int32) overload is implemented in the framework. This is unlikely to happen soon.
As a general rule, I don't much like to say "don't optimize prematurely", because nobody knows when "premature" is. The only thing you must consider when deciding whether to optimize or not is: "Do I have the time and resources to investigate optimization approaches properly?". If you don't, then it's too soon, wait until your project is more mature or until you need the performance (if there is a real need, then you will make the time). In the meantime, do the simplest thing that could possibly work instead.
Original code:
public static byte[] HexadecimalStringToByteArray_Original(string input)
{
var outputLength = input.Length / 2;
var output = new byte[outputLength];
for (var i = 0; i < outputLength; i++)
output[i] = Convert.ToByte(input.Substring(i * 2, 2), 16);
return output;
}
Revision 4:
public static byte[] HexadecimalStringToByteArray_Rev4(string input)
{
var outputLength = input.Length / 2;
var output = new byte[outputLength];
using (var sr = new StringReader(input))
{
for (var i = 0; i < outputLength; i++)
output[i] = Convert.ToByte(new string(new char[2] { (char)sr.Read(), (char)sr.Read() }), 16);
}
return output;
}
The revision avoids String.Substring and uses a StringReader instead. The given reason is:
Edit: you can improve performance for long strings by using a single pass parser, like so:
Well, looking at the reference code for String.Substring, it's clearly "single-pass" already; and why shouldn't it be? It operates at byte-level, not on surrogate pairs.
It does allocate a new string however, but then you need to allocate one to pass to Convert.ToByte anyway. Furthermore, the solution provided in the revision allocates yet another object on every iteration (the two-char array); you can safely put that allocation outside the loop and reuse the array to avoid that.
public static byte[] HexadecimalStringToByteArray(string input)
{
var outputLength = input.Length / 2;
var output = new byte[outputLength];
var numeral = new char[2];
using (var sr = new StringReader(input))
{
for (var i = 0; i < outputLength; i++)
{
numeral[0] = (char)sr.Read();
numeral[1] = (char)sr.Read();
output[i] = Convert.ToByte(new string(numeral), 16);
}
}
return output;
}
Each hexadecimal numeral represents a single octet using two digits (symbols).
But then, why call StringReader.Read twice? Just call its second overload and ask it to read two characters into the two-char array at once, halving the number of calls.
public static byte[] HexadecimalStringToByteArray(string input)
{
var outputLength = input.Length / 2;
var output = new byte[outputLength];
var numeral = new char[2];
using (var sr = new StringReader(input))
{
for (var i = 0; i < outputLength; i++)
{
var read = sr.Read(numeral, 0, 2);
Debug.Assert(read == 2);
output[i] = Convert.ToByte(new string(numeral), 16);
}
}
return output;
}
What you're left with is a string reader whose only added "value" is a parallel index (internal _pos) which you could have declared yourself (as j for example), a redundant length variable (internal _length), and a redundant reference to the input string (internal _s). In other words, it's useless.
If you wonder how Read "reads", just look at the code, all it does is call String.CopyTo on the input string. The rest is just book-keeping overhead to maintain values we don't need.
So, remove the string reader already, and call CopyTo yourself; it's simpler, clearer, and more efficient.
public static byte[] HexadecimalStringToByteArray(string input)
{
var outputLength = input.Length / 2;
var output = new byte[outputLength];
var numeral = new char[2];
for (int i = 0, j = 0; i < outputLength; i++, j += 2)
{
input.CopyTo(j, numeral, 0, 2);
output[i] = Convert.ToByte(new string(numeral), 16);
}
return output;
}
Do you really need a j index that increments in steps of two parallel to i? Of course not, just multiply i by two (which the compiler should be able to optimize to an addition).
public static byte[] HexadecimalStringToByteArray_BestEffort(string input)
{
var outputLength = input.Length / 2;
var output = new byte[outputLength];
var numeral = new char[2];
for (int i = 0; i < outputLength; i++)
{
input.CopyTo(i * 2, numeral, 0, 2);
output[i] = Convert.ToByte(new string(numeral), 16);
}
return output;
}
What does the solution look like now? Exactly like it was at the beginning, only instead of using String.Substring to allocate the string and copy the data to it, you're using an intermediary array to which you copy the hexadecimal numerals, then allocate the string yourself and copy the data again from the array into the string (when you pass it to the string constructor). The second copy might be optimized-out if the string is already in the intern pool, but then String.Substring will also be able to avoid it in these cases.
In fact, if you look at String.Substring again, you see that it uses some low-level internal knowledge of how strings are constructed to allocate the string faster than you could normally do it, and it inlines the same code used by CopyTo directly in there to avoid the call overhead.
String.Substring
Manual method
Conclusion? If you want to use Convert.ToByte(String, Int32) (because you don't want to re-implement that functionality yourself), there doesn't seem to be a way to beat String.Substring; all you do is run in circles, re-inventing the wheel (only with sub-optimal materials).
Note that using Convert.ToByte and String.Substring is a perfectly valid choice if you don't need extreme performance. Remember: only opt for an alternative if you have the time and resources to investigate how it works properly.
If there was a Convert.ToByte(char[], Int32), things would be different of course (it would be possible to do what I described above and completely avoid String).
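For what it's worth, newer runtimes come close to that missing overload. Assuming .NET Core 2.1 or later, byte.Parse accepts a ReadOnlySpan<char>, so a two-character slice of the input can be parsed without allocating a string. A minimal sketch (the class and method names are made up for illustration, and whether this actually beats String.Substring on your data would need to be measured):

```csharp
using System;
using System.Globalization;

static class HexSpanParsing
{
    // Assumption: .NET Core 2.1 or later, where byte.Parse has a
    // ReadOnlySpan<char> overload and string.AsSpan(start, length) exists.
    public static byte[] HexadecimalStringToByteArray_Span(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        for (var i = 0; i < outputLength; i++)
        {
            // AsSpan slices the original string; no intermediate string is created.
            output[i] = byte.Parse(
                input.AsSpan(i * 2, 2),
                NumberStyles.AllowHexSpecifier,
                CultureInfo.InvariantCulture);
        }
        return output;
    }
}
```

Like the original code, this ignores a trailing odd character and throws a FormatException on non-hex input.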
I suspect that people who report better performance by "avoiding String.Substring" also avoid Convert.ToByte(String, Int32), which you should really be doing if you need the performance anyway. Look at the countless other answers to discover all the different approaches to do that.
Disclaimer: I haven't decompiled the latest version of the framework to verify that the reference source is up-to-date, I assume it is.
Now, it all sounds good and logical, hopefully even obvious if you've managed to get so far. But is it true?
Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
Cores: 8
Current Clock Speed: 2600
Max Clock Speed: 2600
--------------------
Parsing hexadecimal string into an array of bytes
--------------------
HexadecimalStringToByteArray_Original: 7,777.09 average ticks (over 10000 runs), 1.2X
HexadecimalStringToByteArray_BestEffort: 8,550.82 average ticks (over 10000 runs), 1.1X
HexadecimalStringToByteArray_Rev4: 9,218.03 average ticks (over 10000 runs), 1.0X
Yes!
Props to Partridge for the bench framework, it's easy to hack. The input used is the following SHA-1 hash repeated 5000 times, which makes a string that encodes 100,000 bytes.
209113288F93A9AB8E474EA78D899AFDBB874355
Have fun! (But optimize with moderation.)
Upvotes: 26
Reputation: 13435
There's a class called SoapHexBinary that does exactly what you want.
using System.Runtime.Remoting.Metadata.W3cXsd2001;
public static byte[] GetStringToBytes(string value)
{
SoapHexBinary shb = SoapHexBinary.Parse(value);
return shb.Value;
}
public static string GetBytesToString(byte[] value)
{
SoapHexBinary shb = new SoapHexBinary(value);
return shb.ToString();
}
Upvotes: 263
Reputation: 5137
Basic Solution With Extension Support
public static class Utils
{
public static byte[] ToBin(this string hex)
{
int NumberChars = hex.Length;
byte[] bytes = new byte[NumberChars / 2];
for (int i = 0; i < NumberChars; i += 2)
bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
return bytes;
}
public static string ToHex(this byte[] ba)
{
return BitConverter.ToString(ba).Replace("-", "");
}
}
And use this class like below
byte[] arr1 = new byte[] { 1, 2, 3 };
string hex1 = arr1.ToHex();
byte[] arr2 = hex1.ToBin();
Upvotes: 0
Reputation: 317
This problem could also be solved using a look-up table. This would require a small amount of static memory for both the encoder and decoder. This method will however be fast:
My solution uses 1024 bytes for the encoding table, and 256 bytes for decoding.
private static readonly byte[] LookupTable = new byte[] {
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF
};
private static byte Lookup(char c)
{
var b = LookupTable[c];
if (b == 255)
throw new IOException("Expected a hex character, got " + c);
return b;
}
public static byte ToByte(char[] chars, int offset)
{
return (byte)(Lookup(chars[offset]) << 4 | Lookup(chars[offset + 1]));
}
private static readonly char[][] LookupTableUpper;
private static readonly char[][] LookupTableLower;
static Hex()
{
LookupTableLower = new char[256][];
LookupTableUpper = new char[256][];
for (var i = 0; i < 256; i++)
{
LookupTableLower[i] = i.ToString("x2").ToCharArray();
LookupTableUpper[i] = i.ToString("X2").ToCharArray();
}
}
public static char[] ToCharLower(byte[] b, int bOffset)
{
return LookupTableLower[b[bOffset]];
}
public static char[] ToCharUpper(byte[] b, int bOffset)
{
return LookupTableUpper[b[bOffset]];
}
StringBuilderToStringFromBytes: 106148
BitConverterToStringFromBytes: 15783
ArrayConvertAllToStringFromBytes: 54290
ByteManipulationToCharArray: 8444
TableBasedToCharArray: 5651 *
* this solution
During decoding, IOException and IndexOutOfRangeException could occur (the latter if a character has a value above 255). Methods for de/encoding streams or arrays should be implemented; this is just a proof of concept.
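A rough sketch of what an array-level decoder on top of the same idea might look like. This is an illustration, not part of the measured code: HexTableDecoder and its members are hypothetical names, the table is built in a loop instead of being written out, and FormatException is swapped in for IOException. The range check in Nibble avoids the IndexOutOfRangeException mentioned above.

```csharp
using System;

static class HexTableDecoder
{
    private const byte Invalid = 0xFF;

    // 256-entry decode table: hex digits map to 0..15, everything else to Invalid.
    private static readonly byte[] Table = BuildTable();

    private static byte[] BuildTable()
    {
        var table = new byte[256];
        for (int i = 0; i < table.Length; i++) table[i] = Invalid;
        for (int i = 0; i < 10; i++) table['0' + i] = (byte)i;
        for (int i = 0; i < 6; i++)
        {
            table['A' + i] = (byte)(10 + i);
            table['a' + i] = (byte)(10 + i);
        }
        return table;
    }

    private static int Nibble(char c)
    {
        // Characters above 255 would index past the end of the table.
        if (c > 255 || Table[c] == Invalid)
            throw new FormatException("Expected a hex character, got " + c);
        return Table[c];
    }

    public static byte[] Decode(string hex)
    {
        if (hex == null) throw new ArgumentNullException(nameof(hex));
        if (hex.Length % 2 != 0)
            throw new FormatException("Hex string must have an even length.");
        var result = new byte[hex.Length / 2];
        for (int i = 0; i < result.Length; i++)
            result[i] = (byte)((Nibble(hex[2 * i]) << 4) | Nibble(hex[2 * i + 1]));
        return result;
    }
}
```

The extra comparison per character costs something, so it would need benchmarking against the unchecked lookup.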
Upvotes: 20
Reputation: 19
For performance I would go with drphrozen's solution. A tiny optimization for the decoder could be to use a separate table for each char to get rid of the "<< 4".
Clearly the two method calls are costly. If some kind of check is made either on input or output data (could be CRC, checksum or whatever) the if (b == 255)... could be skipped and thereby also the method calls altogether.
Using offset++ and offset instead of offset and offset + 1 might give some theoretical benefit but I suspect the compiler handles this better than me.
private static readonly byte[] LookupTableLow = new byte[] {
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF
};
private static readonly byte[] LookupTableHigh = new byte[] {
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0x00, 0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80, 0x90, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xA0, 0xB0, 0xC0, 0xD0, 0xE0, 0xF0, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xA0, 0xB0, 0xC0, 0xD0, 0xE0, 0xF0, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF
};
private static byte LookupLow(char c)
{
var b = LookupTableLow[c];
if (b == 255)
throw new IOException("Expected a hex character, got " + c);
return b;
}
private static byte LookupHigh(char c)
{
var b = LookupTableHigh[c];
if (b == 255)
throw new IOException("Expected a hex character, got " + c);
return b;
}
public static byte ToByte(char[] chars, int offset)
{
return (byte)(LookupHigh(chars[offset++]) | LookupLow(chars[offset]));
}
This is just off the top of my head and has not been tested or benchmarked.
Upvotes: 4
Reputation: 4131
Another fast function...
private static readonly byte[] HexNibble = new byte[] {
0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7,
0x8, 0x9, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0xA, 0xB, 0xC, 0xD, 0xE, 0xF, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0xA, 0xB, 0xC, 0xD, 0xE, 0xF
};
public static byte[] HexStringToByteArray( string str )
{
int byteCount = str.Length >> 1;
byte[] result = new byte[byteCount + (str.Length & 1)];
for( int i = 0; i < byteCount; i++ )
result[i] = (byte) (HexNibble[str[i << 1] - 48] << 4 | HexNibble[str[(i << 1) + 1] - 48]);
if( (str.Length & 1) != 0 )
result[byteCount] = (byte) HexNibble[str[str.Length - 1] - 48];
return result;
}
Upvotes: 3
Reputation: 49
Why make it complex? This is simple in Visual Studio 2008:
C#:
string hex = BitConverter.ToString(YourByteArray).Replace("-", "");
VB:
Dim hex As String = BitConverter.ToString(YourByteArray).Replace("-", "")
Upvotes: 14
Reputation: 13381
Not to pile on to the many answers here, but I found a fairly optimal (~4.5x better than accepted), straightforward implementation of the hex string parser. First, output from my tests (the first batch is my implementation):
Give me that string:
04c63f7842740c77e545bb0b2ade90b384f119f6ab57b680b7aa575a2f40939f
Time to parse 100,000 times: 50.4192 ms
Result as base64: BMY/eEJ0DHflRbsLKt6Qs4TxGfarV7aAt6pXWi9Ak58=
BitConverter'd: 04-C6-3F-78-42-74-0C-77-E5-45-BB-0B-2A-DE-90-B3-84-F1-19-F6-AB-5
7-B6-80-B7-AA-57-5A-2F-40-93-9F
Accepted answer: (StringToByteArray)
Time to parse 100000 times: 233.1264ms
Result as base64: BMY/eEJ0DHflRbsLKt6Qs4TxGfarV7aAt6pXWi9Ak58=
BitConverter'd: 04-C6-3F-78-42-74-0C-77-E5-45-BB-0B-2A-DE-90-B3-84-F1-19-F6-AB-5
7-B6-80-B7-AA-57-5A-2F-40-93-9F
With Mono's implementation:
Time to parse 100000 times: 777.2544ms
Result as base64: BMY/eEJ0DHflRbsLKt6Qs4TxGfarV7aAt6pXWi9Ak58=
BitConverter'd: 04-C6-3F-78-42-74-0C-77-E5-45-BB-0B-2A-DE-90-B3-84-F1-19-F6-AB-5
7-B6-80-B7-AA-57-5A-2F-40-93-9F
With SoapHexBinary:
Time to parse 100000 times: 845.1456ms
Result as base64: BMY/eEJ0DHflRbsLKt6Qs4TxGfarV7aAt6pXWi9Ak58=
BitConverter'd: 04-C6-3F-78-42-74-0C-77-E5-45-BB-0B-2A-DE-90-B3-84-F1-19-F6-AB-5
7-B6-80-B7-AA-57-5A-2F-40-93-9F
The base64 and 'BitConverter'd' lines are there to test for correctness. Note that they are equal.
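That kind of check can be sketched as a small helper: parse the same string with a trusted reference parser and with the candidate, then compare the base64 of both results. HexParseCheck, ReferenceParse and MatchesReference are hypothetical names for illustration; the reference parser is the plain Convert.ToByte plus Substring approach.

```csharp
using System;

static class HexParseCheck
{
    // Baseline parser: Convert.ToByte on two-character substrings.
    public static byte[] ReferenceParse(string hex)
    {
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        return bytes;
    }

    // True when the candidate yields the same bytes as the baseline,
    // compared through their base64 representation.
    public static bool MatchesReference(Func<string, byte[]> candidate, string hex)
    {
        string expected = Convert.ToBase64String(ReferenceParse(hex));
        string actual = Convert.ToBase64String(candidate(hex));
        return expected == actual;
    }
}
```

For example, HexParseCheck.MatchesReference(ToByteArrayFromHex, someHexString) should return true for any even-length hex string.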
The implementation:
public static byte[] ToByteArrayFromHex(string hexString)
{
if (hexString.Length % 2 != 0) throw new ArgumentException("String must have an even length");
var array = new byte[hexString.Length / 2];
for (int i = 0; i < hexString.Length; i += 2)
{
array[i/2] = ByteFromTwoChars(hexString[i], hexString[i + 1]);
}
return array;
}
private static byte ByteFromTwoChars(char p, char p_2)
{
byte ret;
if (p <= '9' && p >= '0')
{
ret = (byte) ((p - '0') << 4);
}
else if (p <= 'f' && p >= 'a')
{
ret = (byte) ((p - 'a' + 10) << 4);
}
else if (p <= 'F' && p >= 'A')
{
ret = (byte) ((p - 'A' + 10) << 4);
} else throw new ArgumentException("Char is not a hex digit: " + p,"p");
if (p_2 <= '9' && p_2 >= '0')
{
ret |= (byte) ((p_2 - '0'));
}
else if (p_2 <= 'f' && p_2 >= 'a')
{
ret |= (byte) ((p_2 - 'a' + 10));
}
else if (p_2 <= 'F' && p_2 >= 'A')
{
ret |= (byte) ((p_2 - 'A' + 10));
} else throw new ArgumentException("Char is not a hex digit: " + p_2, "p_2");
return ret;
}
I tried some stuff with unsafe and moving the (clearly redundant) character-to-nibble if sequence to another method, but this was the fastest it got.
(I concede that this answers half the question. I felt that the string->byte[] conversion was underrepresented, while the byte[]->string angle seems to be well covered. Thus, this answer.)
Upvotes: 9
Reputation: 2533
Complement to answer by @CodesInChaos (reversed method)
public static byte[] HexToByteUsingByteManipulation(string s)
{
byte[] bytes = new byte[s.Length / 2];
for (int i = 0; i < bytes.Length; i++)
{
int hi = s[i*2] - 65;
hi = hi + 10 + ((hi >> 31) & 7);
int lo = s[i*2 + 1] - 65;
lo = lo + 10 + ((lo >> 31) & 7) & 0x0f;
bytes[i] = (byte) (lo | hi << 4);
}
return bytes;
}
Explanation:
& 0x0f is there to also support lower-case letters.
hi = hi + 10 + ((hi >> 31) & 7); is the same as: hi = ch-65 + 10 + (((ch-65) >> 31) & 7);
For '0'..'9' it is the same as hi = ch - 65 + 10 + 7; which is hi = ch - 48 (this is because of 0xffffffff & 7).
For 'A'..'F' it is hi = ch - 65 + 10; (this is because of 0x00000000 & 7).
For 'a'..'f' the numbers are too big, so we must subtract 32 from the default version by making some bits 0 using & 0x0f.
65 is the code for 'A'.
48 is the code for '0'.
7 is the number of characters between '9' and 'A' in the ASCII table (...456789:;<=>?@ABCD...).
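The arithmetic can be checked by applying it to every valid hex digit. The sketch below uses hypothetical names (NibbleTrickCheck, DecodeNibble, AllDigitsDecode) and the same expression as the method above, with the & 0x0f mask applied to a single character. Note that the trick does no validation: a character such as 'G' silently decodes to 0.

```csharp
using System;

static class NibbleTrickCheck
{
    // Same branch-free arithmetic as above, applied to one character.
    public static int DecodeNibble(char ch)
    {
        int v = ch - 65;                          // negative for '0'..'9'
        return (v + 10 + ((v >> 31) & 7)) & 0x0f; // mask folds 'a'..'f' onto 10..15
    }

    // Returns true if every hex digit, in both cases, decodes to its value.
    public static bool AllDigitsDecode()
    {
        const string digits = "0123456789ABCDEF";
        for (int i = 0; i < digits.Length; i++)
        {
            if (DecodeNibble(digits[i]) != i) return false;
            if (DecodeNibble(char.ToLowerInvariant(digits[i])) != i) return false;
        }
        return true;
    }
}
```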
Upvotes: 22
Reputation: 13030
Here's my shot at it. I've created a pair of extension classes to extend string and byte. On the large file test, the performance is comparable to Byte Manipulation 2.
The code below for ToHexString is an optimized implementation of the lookup and shift algorithm. It is almost identical to the one by Behrooz, but it turns out that using a foreach to iterate together with a counter is faster than an explicitly indexed for loop.
It comes in 2nd place behind Byte Manipulation 2 on my machine and is very readable code. The following test results are also of interest:
ToHexStringCharArrayWithCharArrayLookup: 41,589.69 average ticks (over 1000 runs), 1.5X
ToHexStringCharArrayWithStringLookup: 50,764.06 average ticks (over 1000 runs), 1.2X
ToHexStringStringBuilderWithCharArrayLookup: 62,812.87 average ticks (over 1000 runs), 1.0X
Based on the above results it seems safe to conclude that looking up digits in a char array is faster than indexing into a string, and that building the result in a char array is faster than using a StringBuilder.
Here's the code:
using System;
namespace ConversionExtensions
{
public static class ByteArrayExtensions
{
private readonly static char[] digits = new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F' };
public static string ToHexString(this byte[] bytes)
{
char[] hex = new char[bytes.Length * 2];
int index = 0;
foreach (byte b in bytes)
{
hex[index++] = digits[b >> 4];
hex[index++] = digits[b & 0x0F];
}
return new string(hex);
}
}
}
using System;
using System.IO;
namespace ConversionExtensions
{
public static class StringExtensions
{
public static byte[] ToBytes(this string hexString)
{
// Reject null or empty input as well as odd lengths, matching the message below.
if (string.IsNullOrEmpty(hexString) || hexString.Length % 2 != 0)
{
throw new FormatException("Hexadecimal string must not be empty and must contain an even number of digits to be valid.");
}
hexString = hexString.ToUpperInvariant();
byte[] data = new byte[hexString.Length / 2];
for (int index = 0; index < hexString.Length; index += 2)
{
int highDigitValue = hexString[index] <= '9' ? hexString[index] - '0' : hexString[index] - 'A' + 10;
int lowDigitValue = hexString[index + 1] <= '9' ? hexString[index + 1] - '0' : hexString[index + 1] - 'A' + 10;
if (highDigitValue < 0 || lowDigitValue < 0 || highDigitValue > 15 || lowDigitValue > 15)
{
throw new FormatException("An invalid digit was encountered. Valid hexadecimal digits are 0-9 and A-F.");
}
else
{
byte value = (byte)((highDigitValue << 4) | (lowDigitValue & 0x0F));
data[index / 2] = value;
}
}
return data;
}
}
}
Below are the test results that I got when I put my code in @patridge's testing project on my machine. I also added a test for converting to a byte array from hexadecimal. The test runs that exercised my code are ByteArrayToHexViaOptimizedLookupAndShift and HexToByteArrayViaByteManipulation. The HexToByteArrayViaConvertToByte was taken from XXXX. The HexToByteArrayViaSoapHexBinary is the one from @Mykroft's answer.
Intel Pentium III Xeon processor
Cores: 4
Current Clock Speed: 1576
Max Clock Speed: 3092
Converting array of bytes into hexadecimal string representation
ByteArrayToHexViaByteManipulation2: 39,366.64 average ticks (over 1000 runs), 22.4X
ByteArrayToHexViaOptimizedLookupAndShift: 41,588.64 average ticks (over 1000 runs), 21.2X
ByteArrayToHexViaLookup: 55,509.56 average ticks (over 1000 runs), 15.9X
ByteArrayToHexViaByteManipulation: 65,349.12 average ticks (over 1000 runs), 13.5X
ByteArrayToHexViaLookupAndShift: 86,926.87 average ticks (over 1000 runs), 10.2X
ByteArrayToHexStringViaBitConverter: 139,353.73 average ticks (over 1000 runs), 6.3X
ByteArrayToHexViaSoapHexBinary: 314,598.77 average ticks (over 1000 runs), 2.8X
ByteArrayToHexStringViaStringBuilderForEachByteToString: 344,264.63 average ticks (over 1000 runs), 2.6X
ByteArrayToHexStringViaStringBuilderAggregateByteToString: 382,623.44 average ticks (over 1000 runs), 2.3X
ByteArrayToHexStringViaStringBuilderForEachAppendFormat: 818,111.95 average ticks (over 1000 runs), 1.1X
ByteArrayToHexStringViaStringConcatArrayConvertAll: 839,244.84 average ticks (over 1000 runs), 1.1X
ByteArrayToHexStringViaStringBuilderAggregateAppendFormat: 867,303.98 average ticks (over 1000 runs), 1.0X
ByteArrayToHexStringViaStringJoinArrayConvertAll: 882,710.28 average ticks (over 1000 runs), 1.0X
Upvotes: 3