Dhamo R

Reputation: 110

Raw string to normal string in JavaScript

Normal String assignment:

var str1 = "\320";
console.log(str1);    //   "Ð"

Raw String assignment:

var str2 = String.raw`\320`;
console.log(str2);    //   "\320"

In a raw string, the backslashes are not interpreted. I need to interpret them so that "\320" becomes "Ð". Do I have to convert the raw string to a normal string? If so, how? If not, what else should I do, and how?

Upvotes: 2

Views: 4270

Answers (2)

Audun Olsen

Reputation: 638

The question is a couple of months old, but I think this answer is your best bet yet. Transforming escape sequences from raw strings is very much doable with ES6 String.fromCodePoint(<hex-value>). I'm in the middle of writing an npm package which deals with this exact scenario.

First, you need a regular expression which matches all escape sequences in your string. I've used this as a reference for all the different ones. (I use a raw string for this to avoid spamming backslashes)

let [single, ...hex] = String.raw`
  \\[bfnrtv0'"\\]
  \\x[a-fA-F0-9]{2}
 (\\u[a-fA-F0-9]{4}){1,}
  \\u\{([0-9a-fA-F]{1,})\}`
  .split("\n").slice(1).map(cur => cur.trim());

let escapes = new RegExp(`(${[single].concat(hex).join("|")})`, "gm"),

    // We need these for later when differentiating how we convert the different escapes.
    uniES6  = new RegExp(`${hex.pop()}`);
    single  = new RegExp(`${single}`);

Now you can match all the escapes: reserved single characters, the extended ASCII range (\xHH), surrogate pairs (\uXXXX), and ES6 "astral" Unicode code points (\u{...}). Octal escapes are left out because they're deprecated, but you can always add them back. The next step is writing a function which replaces the code points with the corresponding symbols. First, a switch-like function for singles:

// Look up the character for a single-character escape such as "\\n".
const singleEscape = seq => ({
  "\\b"  : "\b",
  "\\f"  : "\f",
  "\\n"  : "\n",
  "\\r"  : "\r",
  "\\t"  : "\t",
  "\\v"  : "\v",
  "\\0"  : "\0",
  "\\'"  : "\'",
  "\\\"" : "\"",
  "\\\\" : "\\"
}[seq]);

Then we can rely on ES6 String.fromCodePoint to deal with the rest, which are all hexadecimal.

const convertEscape = seq => {

  if (single.test(seq)) 
    return singleEscape(seq);

  else if (uniES6.test(seq))
    return String.fromCodePoint(`0x${seq.split("").slice(3, -1).join("")}`);

  else
    return String.fromCodePoint.apply(
      String, seq.split("\\").slice(1).map(pt => `0x${pt.substr(1)}`)
    );

}
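
As a quick check of why the last two branches work, here is a small sketch (the variable names are only illustrative). It relies on two facts: String.fromCodePoint converts its arguments to numbers, so a string like "0x41" is accepted, and passing the two halves of a surrogate pair gives the same string as passing the astral code point itself.

```javascript
// A numeric string such as "0x41" is converted to the number 65 before use.
const fromHex = String.fromCodePoint("0x41");

// An astral code point and its surrogate pair produce the same string.
const astral = String.fromCodePoint(0x1F600);
const pair   = String.fromCodePoint(0xD83D, 0xDE00);

console.log(fromHex);          // "A"
console.log(astral === pair);  // true
console.log(astral.length);    // 2 (two UTF-16 code units)
```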

Lastly, we tie it all together with a tagged template literal function named normal. I do not know why you need a raw string, but inside the tag you have access to the raw string and can add any additional logic, while still ending up with a string in which the escape sequences are properly parsed.

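const normal = (strings, ...values) => strings.raw
    // Interleave raw segments with interpolated values; the i > 0 check keeps falsy values such as 0.
    .reduce((acc, cur, i) => acc + (i > 0 ? values[i - 1] : "") + cur, "")
    .replace(escapes, match => convertEscape(match));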
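
Since the question is about the octal escape \320, which the pattern above leaves out, here is a compact, self-contained sketch of the same idea with octal support added back. It is not the code from this answer: the names unescapeRaw, SINGLE and ESCAPE are hypothetical, and the function is deliberately lenient (an unknown escape such as \q simply yields q, and a \u{...} value above 10FFFF makes String.fromCodePoint throw a RangeError).

```javascript
// Hypothetical lookup table for single-character escapes.
const SINGLE = { b: "\b", f: "\f", n: "\n", r: "\r", t: "\t", v: "\v" };

// Alternatives, in order: \u{...}, \uXXXX, \xHH, octal (up to \377), any other escaped character.
const ESCAPE = /\\(?:u\{([0-9a-fA-F]+)\}|u([0-9a-fA-F]{4})|x([0-9a-fA-F]{2})|([0-3][0-7]{0,2}|[4-7][0-7]?)|([\s\S]))/g;

// Hypothetical helper: decode the escape sequences found in a raw string.
function unescapeRaw(raw) {
  return raw.replace(ESCAPE, (match, braced, unicode, hex, octal, other) => {
    if (braced !== undefined) return String.fromCodePoint(parseInt(braced, 16));
    if (unicode !== undefined) return String.fromCharCode(parseInt(unicode, 16));
    if (hex !== undefined) return String.fromCharCode(parseInt(hex, 16));
    if (octal !== undefined) return String.fromCharCode(parseInt(octal, 8));
    // Known single-character escapes; anything else (\', \", \\) maps to itself.
    return Object.prototype.hasOwnProperty.call(SINGLE, other) ? SINGLE[other] : other;
  });
}

const str2 = String.raw`\320`;
console.log(unescapeRaw(str2)); // "Ð"
```

Octal 320 is decimal 208 (U+00D0), which is why the result is "Ð". Because the pattern consumes an escaped backslash as a unit, a raw \\320 (two backslashes) decodes to a literal backslash followed by 320 rather than to "Ð".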

Upvotes: 1

Ronn Wilder

Reputation: 1368

The thing is, \320 is an octal escape sequence, and because such sequences map to characters, JavaScript interprets it when the string literal is defined. What you can do is build a map of all the symbols you need, with the raw string as the key and the actual symbol as the value.

For example:

var map = {
    "\\320": "\320"
}

console.log(map);

Now you can look up your text in the map and get the required value.

 var str2 = String.raw`\320`;
 var s = map[str2];
 console.log(s);

To build the map, visit this site: https://brajeshwar.github.io/entities/

Then run this code in the browser console:

 // for latin
 var tbody = document.getElementById("latin");
 var trs = tbody.children;
 var map = {};
 // Start at 1 to skip the first row; declare the loop variables so they are not implicit globals.
 for (var i = 1; i < trs.length; i++) {
    var key = trs[i].children[6].innerText;
    var value = trs[i].children[1].innerText;
    console.log(key);
    map[key] = value;
 }

Now log map to the console, stringify it, paste the resulting string into your code, and parse it. I have done this only for the Latin table; do the same for the other tables.
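
The stringify and parse round trip can be sketched as follows. The scraped object is a hypothetical stand-in for whatever the loop above collects, and it assumes the keys turn out to be raw octal escapes such as \320.

```javascript
// Hypothetical map, standing in for the result of the scraping loop.
const scraped = { "\\320": "\u00D0", "\\321": "\u00D1" };

// In the browser console: produce the text to copy.
const json = JSON.stringify(scraped);

// In your own code: parse the copied text back into an object.
const restored = JSON.parse(json);

console.log(restored[String.raw`\320`]); // "Ð"
```

Since JSON text is also valid JavaScript object literal syntax, pasting the copied text directly as the right-hand side of an assignment should work as well and avoids a second layer of string escaping.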

Upvotes: 4
