Reputation: 4395

How to detect if a string is encoded with escape() or encodeURIComponent()

I have a web service that receives data from various clients. Some of them send the data encoded with escape(), while others use encodeURIComponent(). Is there a way to detect which function was used to encode the data?

Upvotes: 13

Answers (6)

atfede

Reputation: 411

Maybe not the most performant, but this function will recursively decode the encoded string until it cannot decode it anymore.

function decodeValue(str) {
    const decodedStr = decodeURIComponent(str);

    if (decodedStr === str) {
        return decodedStr; // Base case: no more decoding needed
    } else {
        return decodeValue(decodedStr); // String is encoded. Recur with the decoded value
    }
}

decodeValue("%253Ctable class='table-1'%253E%253Ctbody%253E%253Ctr%253E%253Ctd%253Esdfsd%253C/td%253E%253Ctd%253Esdfsd%253C/td%253E%253C/tr%253E%253Ctr%253E%253Ctd%253Esdfsd%253C/td%253E%253Ctd%253Esdfs%253C/td%253E%253C/tr%253E%253C/tbody%253E%253C/table%253E");

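In this example the string was encoded two times, so decodeValue removes one level of encoding in each of its first two calls; a third call then finds nothing left to decode and returns the result.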
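One caveat with repeated decoding: decodeURIComponent throws a URIError on malformed input such as a lone "%", and a partially decoded string can become malformed along the way. Below is a minimal sketch of a guarded variant. The name decodeFully and the maxPasses limit are assumptions made for illustration and are not part of the answer above.

```javascript
// Hypothetical helper: decode repeatedly, but stop on malformed input
// or after maxPasses passes (the limit of 10 is an arbitrary assumption).
function decodeFully(str, maxPasses = 10) {
  let current = str;
  for (let i = 0; i < maxPasses; i++) {
    let next;
    try {
      next = decodeURIComponent(current);
    } catch (e) {
      // URIError: a literal "%" or a non-UTF-8 escape; keep what we have so far
      return current;
    }
    if (next === current) {
      return current; // nothing changed, so there is nothing left to decode
    }
    current = next;
  }
  return current;
}

console.log(decodeFully("%253Cb%253E")); // "<b>"
console.log(decodeFully("50%25 off"));   // "50% off"
console.log(decodeFully("100%"));        // "100%"
```

Note that any "decode until stable" approach can over-decode data that legitimately contains percent sequences, so it is probably only safe when you know the payload never contains a literal "%XX".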


Upvotes: 0

Dudi

Reputation: 3079

Thanks to @mika for the great answer. Maybe just one improvement, since the unescape function is considered deprecated:

declare function unescape(s: string): string;

function decodeURItoString(str: string): string {
    let resp = str;

    try {
        resp = decodeURI(str);
    } catch (e) {
        console.log('ERROR: Cannot decodeURI string!');

        // unescape is deprecated, so only fall back to it where it still exists
        if (typeof unescape === 'function') {
            resp = unescape(str);
        }
    }

    return resp;
}
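If you would rather not depend on unescape at all, its decoding rules are simple enough to reproduce by hand: %uXXXX and %XX each become one UTF-16 code unit. The following is only a sketch of that idea. The names decodeEscapeFormat and decodeLenient are made up for this example, and it tries decodeURIComponent first (as in mika's answer) rather than decodeURI.

```javascript
// Hypothetical stand-in for the deprecated unescape():
// %uXXXX and %XX are each turned into a single UTF-16 code unit.
function decodeEscapeFormat(str) {
  return str.replace(/%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})/g, (match, u4, h2) =>
    String.fromCharCode(parseInt(u4 || h2, 16))
  );
}

// Try UTF-8 percent-decoding first, then fall back to the escape() format.
function decodeLenient(str) {
  try {
    return decodeURIComponent(str);
  } catch (e) {
    return decodeEscapeFormat(str);
  }
}

console.log(decodeLenient("%C3%A4")); // "ä" (UTF-8, from encodeURIComponent)
console.log(decodeLenient("%E4"));    // "ä" (Latin-1 style, from escape)
console.log(decodeLenient("%u20AC")); // "€" (%uXXXX form, from escape)
```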

Upvotes: 3

Dejan Janjušević

Reputation: 3230

I realize this is an old question, but I am unaware of a better solution. So I do it like this (thanks to a comment by RobertPitt above):

function isEncoded(str) {
    return typeof str == "string" && decodeURIComponent(str) !== str;
}

I have not yet encountered a case where this fails, which doesn't mean that no such case exists. Maybe someone could shed some light on this.
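For what it's worth, here are two cases where this check seems to fall short. First, decodeURIComponent throws a URIError on malformed input (a lone "%", or a byte sequence that is not valid UTF-8 such as the "%E4" that escape() produces for 'ä'), so isEncoded throws instead of returning. Second, a string with no characters that need escaping looks the same before and after encoding, so the result is false even if the client did encode it. A guarded sketch follows; the name isEncodedSafe is an assumption made for illustration.

```javascript
// Hypothetical variant of isEncoded that reports false instead of throwing.
function isEncodedSafe(str) {
  if (typeof str !== "string") return false;
  try {
    return decodeURIComponent(str) !== str;
  } catch (e) {
    // URIError: not something encodeURIComponent would have produced
    return false;
  }
}

console.log(isEncodedSafe("a%20b")); // true
console.log(isEncodedSafe("abc"));   // false (encoded or not, it looks the same)
console.log(isEncodedSafe("100%"));  // false (malformed, would throw otherwise)
console.log(isEncodedSafe("%E4"));   // false (escape() output, not valid UTF-8)
```

Keep in mind that false here only means "not decodable as encodeURIComponent output"; the "%E4" case was still encoded, just by escape().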

Upvotes: 14

mika

Reputation: 6972

This won't help on the server side, but on the client side I have used JavaScript exceptions to detect whether the URL encoding produced ISO Latin or UTF-8 output.

decodeURIComponent throws an exception (a URIError) on invalid UTF-8 sequences.

let result;
try {
    result = decodeURIComponent(string);
} catch (e) {
    // Not valid UTF-8, so assume the string came from escape()
    result = unescape(string);
}

For example, the ISO Latin encoding of the umlaut 'ä', %E4, throws an exception in Firefox, but the UTF-8 encoding of 'ä', %C3%A4, does not.
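The same idea can be turned into a rough guess at which function produced a string. This is only a heuristic sketch, assuming every input really came from either escape() or encodeURIComponent(); the name guessEncoder and its return labels are made up for this example.

```javascript
// Hypothetical heuristic: returns "escape", "encodeURIComponent" or "either".
function guessEncoder(str) {
  // Of the two functions, only escape() emits the %uXXXX form
  if (/%u[0-9a-fA-F]{4}/.test(str)) {
    return "escape";
  }
  try {
    decodeURIComponent(str);
  } catch (e) {
    // Not valid UTF-8 percent-encoding, e.g. %E4
    return "escape";
  }
  // Valid UTF-8. An escaped byte >= 0x80 suggests encodeURIComponent();
  // if every escape is ASCII, both functions produce the same output.
  return /%[89a-fA-F][0-9a-fA-F]/.test(str) ? "encodeURIComponent" : "either";
}

console.log(guessEncoder("%E4"));     // "escape"
console.log(guessEncoder("%u20AC"));  // "escape"
console.log(guessEncoder("%C3%A4"));  // "encodeURIComponent"
console.log(guessEncoder("a%20b"));   // "either"
```

It can still be fooled: escape() applied to the two characters 'Ã¤' also yields %C3%A4, so treat the "encodeURIComponent" answer as likely rather than certain.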
