Reputation: 18325
In PHP, when I read the data, let's say the data (a chunk of string) contains HTML special character decimal/hex codes like:
This is a sample string with < &#x153; < and &#x161;
What I want to know is how to detect and split out the decimal/hex codes (of any special characters) inside a chunk of string.
For example, above string contains:
<
&#x153;
&#x161;
How can I programmatically detect them (the occurrences of any HTML special characters)?
(Ideally the collected results would be returned as an array.)
Upvotes: 0
Views: 882
Reputation: 156
You should use preg_match_all() - http://www.php.net/manual/en/function.preg-match-all.php with a pattern like this: '/&#?[0-9a-zA-Z]{1,5};/'. preg_match() stops at the first match, and PHP has no /g modifier.
[Updated]: Note which entities you need. Is that just &#x[number][number][number]; or all possible HTML entities (like &nbsp;, &lt;, etc.)?
Above I described the most common case.
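To make that distinction concrete, here is a minimal sketch of a pattern that accepts hex, decimal, and named entities at once. The helper name find_entities and the exact pattern are my own assumptions for illustration; the pattern only checks the shape of a token, not whether a named entity actually exists.

```php
<?php
// Hypothetical helper: collect every entity-like token found in $text.
function find_entities(string $text): array
{
    // Three alternatives: &#x...; (hex), &#...; (decimal), &name; (named)
    $pattern = '/&(?:#x[0-9a-fA-F]+|#[0-9]+|[a-zA-Z][a-zA-Z0-9]*);/';
    preg_match_all($pattern, $text, $matches);
    return $matches[0]; // full matches, in order of appearance
}

print_r(find_entities('a &lt; b, &#x153; and &#353; &nbsp;'));
```

If you only care about numeric codes, drop the last alternative from the pattern.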
Upvotes: 1
Reputation: 3816
You could use substr and strpos to find &# and skip to the next ; like this:
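$string = "This is a sample string with &#x153; and &#x161;";
$hexCodes = array();
while (strlen($string) > 0) {
    // strpos() returns false when "&#" is absent and 0 when it is at the very start
    $start = strpos($string, "&#");
    if ($start === false) { break; }
    $string = substr($string, $start);
    // Stop if the entity is never terminated by ";"
    $end = strpos($string, ";");
    if ($end === false) { break; }
    array_push($hexCodes, substr($string, 0, $end + 1));
    $string = substr($string, $end + 1);
}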
Upvotes: 1
Reputation: 21856
I think this is what you are after:
$s = 'This is a sample string with &#x153; and &#x161;';
// [0-9a-fA-F] rather than \d, so hex digits a-f are matched too
$pattern = '/&#x[0-9a-fA-F]+;/';
preg_match_all($pattern, $s, $matches);
var_dump( $matches );
This will output:
array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(7) "&#x153;"
    [1]=>
    string(7) "&#x161;"
  }
}
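Since the question mentions both decimal and hex codes, here is a rough sketch that handles both forms and also returns the code point each one stands for. The function name entity_code_points is hypothetical, and I am assuming you want an array keyed by the matched entity.

```php
<?php
// Hypothetical helper: map each numeric entity in $text to its code point.
function entity_code_points(string $text): array
{
    $result = [];
    // Group 1 captures hex digits, group 2 captures decimal digits.
    $pattern = '/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        // Group 2 is only present (and non-empty) for the decimal form.
        $isDecimal = isset($m[2]) && $m[2] !== '';
        $result[$m[0]] = $isDecimal ? (int) $m[2] : hexdec($m[1]);
    }
    return $result;
}

print_r(entity_code_points('This is a sample string with &#x153; and &#x161;'));
```

Repeated entities collapse into one key here; use a plain list instead if you need every occurrence.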
Upvotes: 3
Reputation: 144
If you mean to decode the entities, use html_entity_decode. Here is an example:
<?php
$a = "I'll &quot;walk&quot; the &lt;b&gt;dog&lt;/b&gt; now";
$b = html_entity_decode($a);
echo $b; // I'll "walk" the <b>dog</b> now
?>
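The same function also decodes numeric codes, so it can be applied to entities you have already collected. A small sketch, assuming the entities are already in an array (the $entities list below is made up for the example) and that you want UTF-8 output:

```php
<?php
// Example input: entities that were already extracted from a string.
$entities = ['&#x153;', '&#x161;', '&lt;'];

// Decode each entity on its own; the flags and charset are passed explicitly.
$decoded = array_map(function (string $entity): string {
    return html_entity_decode($entity, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}, $entities);

print_r($decoded);
```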
Upvotes: -2