Reputation: 91
First of all, I understand that both JS escape() and unescape() are deprecated. Basically, we have an ancient system that runs JS escape() on the data before storing it in the DB, so every time we need to unescape() the data on the client side before we can display the actual data (I know it's stupid, but it was done years ago to support Unicode characters on a non-Unicode-compliant DB).
Is there any existing PHP implementation that simulates the JavaScript escape() and unescape() functions?
Upvotes: 1
Views: 842
Reputation: 91
After some searching I was able to put together two PHP functions that do what I want. The code is not pretty, but it works 100% on the data we have so far, so I thought I would share it here.
/**
* Simulate javascript escape() function
*/
function escapejs($source) {
$map = array(
'~' => '%7E'
,'!' => '%21'
,'\'' => '%27' // single quote
,'(' => '%28'
,')' => '%29'
,'#' => '%23'
,'$' => '%24'
,'&' => '%26'
,',' => '%2C'
,':' => '%3A'
,';' => '%3B'
,'=' => '%3D'
,'?' => '%3F'
,' ' => '%20' // space
,'"' => '%22' // double quote
,'%' => '%25'
,'<' => '%3C'
,'>' => '%3E'
,'[' => '%5B'
,'\\' => '%5C' // backslash \
,']' => '%5D'
,'^' => '%5E'
,'{' => '%7B'
,'|' => '%7C'
,'}' => '%7D'
,'`' => '%60'
,chr(9) => '%09'
,chr(10) => '%0A'
,chr(13) => '%0D'
,'¡' => '%A1'
,'¢' => '%A2'
,'£' => '%A3'
,'¤' => '%A4'
,'¥' => '%A5'
,'¦' => '%A6'
,'§' => '%A7'
,'¨' => '%A8'
,'©' => '%A9'
,'ª' => '%AA'
,'«' => '%AB'
,'¬' => '%AC'
,"\xC2\xAD" => '%AD' // soft hyphen (U+00AD), written as UTF-8 bytes because it is invisible
,'®' => '%AE'
,'¯' => '%AF'
,'°' => '%B0'
,'±' => '%B1'
,'²' => '%B2'
,'³' => '%B3'
,'´' => '%B4'
,'µ' => '%B5'
,'¶' => '%B6'
,'·' => '%B7'
,'¸' => '%B8'
,'¹' => '%B9'
,'º' => '%BA'
,'»' => '%BB'
,'¼' => '%BC'
,'½' => '%BD'
,'¾' => '%BE'
,'¿' => '%BF'
,'À' => '%C0'
,'Á' => '%C1'
,'Â' => '%C2'
,'Ã' => '%C3'
,'Ä' => '%C4'
,'Å' => '%C5'
,'Æ' => '%C6'
,'Ç' => '%C7'
,'È' => '%C8'
,'É' => '%C9'
,'Ê' => '%CA'
,'Ë' => '%CB'
,'Ì' => '%CC'
,'Í' => '%CD'
,'Î' => '%CE'
,'Ï' => '%CF'
,'Ð' => '%D0'
,'Ñ' => '%D1'
,'Ò' => '%D2'
,'Ó' => '%D3'
,'Ô' => '%D4'
,'Õ' => '%D5'
,'Ö' => '%D6'
,'×' => '%D7'
,'Ø' => '%D8'
,'Ù' => '%D9'
,'Ú' => '%DA'
,'Û' => '%DB'
,'Ü' => '%DC'
,'Ý' => '%DD'
,'Þ' => '%DE'
,'ß' => '%DF'
,'à' => '%E0'
,'á' => '%E1'
,'â' => '%E2'
,'ã' => '%E3'
,'ä' => '%E4'
,'å' => '%E5'
,'æ' => '%E6'
,'ç' => '%E7'
,'è' => '%E8'
,'é' => '%E9'
,'ê' => '%EA'
,'ë' => '%EB'
,'ì' => '%EC'
,'í' => '%ED'
,'î' => '%EE'
,'ï' => '%EF'
,'ð' => '%F0'
,'ñ' => '%F1'
,'ò' => '%F2'
,'ó' => '%F3'
,'ô' => '%F4'
,'õ' => '%F5'
,'ö' => '%F6'
,'÷' => '%F7'
,'ø' => '%F8'
,'ù' => '%F9'
,'ú' => '%FA'
,'û' => '%FB'
,'ü' => '%FC'
,'ý' => '%FD'
,'þ' => '%FE'
,'ÿ' => '%FF'
);
$convmap = array(0x80, 0x10ffff, 0, 0xffffff);
$org = $source;
// make sure string is UTF8
if (false === mb_check_encoding($source, 'UTF-8')) {
if (false === ($source = iconv(mb_detect_encoding($source, mb_detect_order(), true), "UTF-8", $source))) {
$source = $org;
}
}
$chrArray = preg_split('//u', $source, -1, PREG_SPLIT_NO_EMPTY); // split up the UTF8 string into chars
$oChrArray = array();
foreach ($chrArray as $index => $chr) {
if (isset($map[$chr])) {
$chr = $map[$chr];
}
// if char doesn't fall within ASCII then assume unicode, get the hex html entities
//elseif (mb_detect_encoding($chr, 'ASCII', true) !== 'ASCII') {
else {
$chr = mb_encode_numericentity($chr, $convmap, "UTF-8", true);
// since we will be converting the &#x notation to the non-standard %u for backward compatibility, make sure the code is 4 digits long by prepending 0
if (substr($chr, 0, 3) == '&#x' && substr($chr, -1) == ';' && strlen($chr) == 7)
$chr = '&#x0'.substr($chr, 3);
}
$oChrArray[] = $chr;
}
$decodedStr = implode('', $oChrArray);
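$decodedStr = preg_replace('/&#x([0-9A-F]{4});/', '%u$1', $decodedStr); // we need to use the %uXXXX format to simulate results generated with js escape()
$decodedStr = preg_replace('/&#x([0-9A-F]{2});/', '%$1', $decodedStr); // characters in 0x80-0xFF that are missing from $map (e.g. non-breaking space) become %XX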
return $decodedStr;
}
/**
* Simulate javascript unescape() function
*/
function unescapejs($source) {
$source = str_replace(array('%0B'), array(''), $source); // strip out vertical tab
$s= preg_replace('/%u(....)/', '&#x$1;', $source);
$s= preg_replace('/%(..)/', '&#x$1;', $s);
return html_entity_decode($s, ENT_QUOTES, 'UTF-8');
}
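As a cross-check on the two functions above, here is a more compact sketch of the same idea that works on UTF-16 code units instead of a lookup table. It is a different technique from the one in this answer, not a cleaned-up copy of it. The names js_escape_sketch() and js_unescape_sketch() are made up for illustration, and the sketch assumes PHP 7.0+, the mbstring extension, and valid UTF-8 input to the escape function.

```php
<?php
// Hypothetical helpers for illustration only; not part of the original answer.

function js_escape_sketch(string $utf8): string
{
    if ($utf8 === '') {
        return '';
    }
    // JavaScript strings are sequences of UTF-16 code units, so convert first.
    $utf16 = mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8');
    $out = '';
    foreach (unpack('n*', $utf16) as $unit) {
        if ($unit < 0x80 && preg_match('/[A-Za-z0-9@*_+\-.\/]/', chr($unit))) {
            $out .= chr($unit);                 // characters escape() leaves untouched
        } elseif ($unit < 0x100) {
            $out .= sprintf('%%%02X', $unit);   // %XX for code units below 256
        } else {
            $out .= sprintf('%%u%04X', $unit);  // %uXXXX for everything else
        }
    }
    return $out;
}

function js_unescape_sketch(string $escaped): string
{
    // Split into %uXXXX tokens, %XX tokens, and single literal characters.
    preg_match_all('/%u[0-9A-Fa-f]{4}|%[0-9A-Fa-f]{2}|./su', $escaped, $m);
    $utf16 = '';
    foreach ($m[0] ?? [] as $tok) {
        if (preg_match('/^%u?([0-9A-Fa-f]+)$/', $tok, $hex)) {
            $utf16 .= pack('n', hexdec($hex[1]));                      // one UTF-16 code unit
        } else {
            $utf16 .= mb_convert_encoding($tok, 'UTF-16BE', 'UTF-8');  // literal character
        }
    }
    // Converting the whole buffer at once lets surrogate pairs recombine.
    return mb_convert_encoding($utf16, 'UTF-8', 'UTF-16BE');
}

echo js_escape_sketch("caf\u{E9} \u{20AC}5"), "\n";   // caf%E9%20%u20AC5
echo js_unescape_sketch('caf%E9%20%u20AC5'), "\n";    // café €5
```

Because it goes through UTF-16, characters outside the Basic Multilingual Plane come out as two %uXXXX surrogate halves, which is what escape() itself does. Treat it as a starting point and test it against your own stored data before relying on it.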
Upvotes: 3
Reputation:
You're looking for urlencode(). If the output of that encoding isn't acceptable to you, you can try rawurlencode().
This has more info:
http://php.net/manual/en/function.urldecode.php
http://php.net/manual/en/function.urlencode.php
But if you only need to make the data safe to store in a MySQL database, you can use the built-in mysqli_real_escape_string() function, which escapes special characters in the input so that it can be used safely in an SQL statement.
See:
http://php.net/manual/en/mysqli.real-escape-string.php
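To judge whether the output of these functions is acceptable for the case in the question, here is a quick check of what they produce. The sample string is only an illustration, and the last comment shows what JavaScript's escape() returns for the same input, for comparison.

```php
<?php
$text = "caf\u{E9} au lait"; // "café au lait" as UTF-8 (PHP 7.0+ escape syntax)

echo urlencode($text), "\n";     // caf%C3%A9+au+lait
echo rawurlencode($text), "\n";  // caf%C3%A9%20au%20lait

// For comparison, JavaScript's escape("café au lait") returns:
//   caf%E9%20au%20lait
```

Both PHP functions percent-encode the UTF-8 bytes of a character, while escape() encodes code points as %XX or %uXXXX, so the results only agree for plain ASCII text.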
Upvotes: -1