Fernando Correia

Reputation: 22355

Decode a string in hexadecimal representation

In R, what is an efficient way to convert a string encoded in hexadecimal, such as "40414243", to its equivalent characters, e.g. "@ABC"?

For instance, the equivalent of this code:

library(stringr)

FromHexString <- function (hex.string) {
  result <- ""
  n <- str_length(hex.string)  # avoid shadowing base::length
  for (i in seq(1, n, by=2)) {
    hex.value <- str_sub(hex.string, i, i + 1)  # next two hex digits
    char.code <- strtoi(hex.value, 16)
    char <- rawToChar(as.raw(char.code))
    result <- paste(result, char, sep="")
  }
  result
}

Which produces:

> FromHexString("40414243")
[1] "@ABC"

While the above code works, it is not efficient at all, because it performs a string concatenation for every pair of hex digits.

So the question is how to write an idiomatic, efficient R function that does this operation.

Edit: My sample works only for ASCII encoding, not for UTF-8 encoded byte arrays.

Upvotes: 3

Views: 202

Answers (3)

Roland

Reputation: 132576

Test whether this is more efficient (for longer strings):

string <- "40414243"

intToUtf8(
  strtoi(
    do.call(
      paste0, 
      as.data.frame(
        matrix(
          strsplit(string, split = "")[[1]], 
          ncol=2, 
          byrow=TRUE), 
        stringsAsFactors=FALSE)), 
    base=16L)
)
#[1] "@ABC"

Otherwise you could look for a C/C++ implementation.
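
A shorter base R variant of the same idea may also be worth timing. This is only a sketch: the name from_hex_string is made up for illustration, and it assumes the input has an even number of hex digits. It cuts the string into pairs with substring, converts all pairs in one strtoi call, and builds the result with a single rawToChar call instead of repeated concatenation. Marking the result as UTF-8 is an assumption about what the bytes represent; plain ASCII input is unaffected by it.

```r
# Hypothetical helper: decode a hex string such as "40414243" in one vectorized pass.
from_hex_string <- function(hex_string) {
  n <- nchar(hex_string)
  if (n == 0) return("")                    # nothing to decode
  stopifnot(n %% 2 == 0)                    # assumption: two hex digits per byte
  starts <- seq(1, n, by = 2)               # start position of each pair
  pairs <- substring(hex_string, starts, starts + 1)
  out <- rawToChar(as.raw(strtoi(pairs, base = 16L)))
  Encoding(out) <- "UTF-8"                  # assumption: the bytes are UTF-8 text
  out
}

from_hex_string("40414243")
#[1] "@ABC"
```

Note that rawToChar cannot represent a 00 byte, so input containing "00" pairs will fail with this approach.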

Upvotes: 4

Carl Witthoft

Reputation: 21492

If you don't want to use a lookup table (or just like code golfing :-) ), consider writing a vectorized version of something like:

bar <- unlist(strsplit(foo, '')) # foo is the input hex string; this separates it into individual characters
items <- sapply(seq_len(length(bar) / 2), function(j) paste0(bar[(2*j-1):(2*j)], collapse='')) # joins each pair of digits

followed by strtoi or whatever.

But even easier (I hope...) is

sapply(1:(nchar(foo)/2), function(j) substr(foo,(2*j-1),(2*j)))
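
One possible way to vectorize the pairing step, sketched here with foo set to the sample input from the question: recycled logical indices select the odd and even positions of bar, so paste0 joins all pairs in one call with no sapply loop. The names codes and result are only illustrative.

```r
foo <- "40414243"                       # sample input from the question
bar <- strsplit(foo, "")[[1]]           # individual hex digits
# c(TRUE, FALSE) is recycled to pick positions 1, 3, 5, ...; c(FALSE, TRUE) picks 2, 4, 6, ...
items <- paste0(bar[c(TRUE, FALSE)], bar[c(FALSE, TRUE)])
codes <- strtoi(items, base = 16L)      # hex pairs to integer byte values
result <- rawToChar(as.raw(codes))      # bytes to a single string
result
#[1] "@ABC"
```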

Upvotes: 1

Atilla Ozgur

Reputation: 14701

Modify your code so that it uses a lookup table (an example for R here). Your lookup table will have 255 values. Put them in a vector and get each character from that vector.

Note: No other solution will beat this one if you need to do a lot of conversions.
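
A rough sketch of what such a lookup table could look like in R. The names hex_lookup and from_hex_lookup are hypothetical, and the function assumes a non-empty input with an even number of hex digits. The table is a named character vector with 255 entries, keyed "01" through "ff"; byte 00 is left out because an R string cannot contain an embedded NUL.

```r
# Build the table once: names are two-digit hex codes, values are the single-byte characters.
codes <- 1:255
hex_lookup <- rawToChar(as.raw(codes), multiple = TRUE)
names(hex_lookup) <- sprintf("%02x", codes)

# Hypothetical helper: split into pairs, then index the table by name.
from_hex_lookup <- function(hex_string) {
  starts <- seq(1, nchar(hex_string), by = 2)
  pairs <- tolower(substring(hex_string, starts, starts + 1))  # keys are lower case
  paste(hex_lookup[pairs], collapse = "")
}

from_hex_lookup("40414243")
#[1] "@ABC"
```

Whether this actually outperforms a single vectorized strtoi call is something to measure rather than assume.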

Upvotes: 1
