MBaas

Reputation: 7530

Has PHP's behaviour changed?

Studying for the ZEND-CE exam, I came across this question:

Given a php.ini setting of:
default_charset = utf-8
What will the following code print in the browser?

<?php
header('Content-Type: text/html; charset=iso-8859-1');
echo '&#9986;&#10004;&#10013;';
?>

A. Garbled data
B. & # 9986 ; & # 10004 ; & # 10013 ;
C. A blank line due to charset mismatch

The expected answer is C, but I expected A, and when I ran the code I did get garbled data (answer A). So I wonder: has PHP's behaviour changed recently, or is this an error in the test?

Upvotes: 1

Views: 119

Answers (1)

Oswald

Reputation: 31655

I am not aware that PHP behaviour has changed in that respect. However, the HTML standard has changed.

Prior to HTML 4, numeric character references such as &#9986; were interpreted with respect to the document character set (which is specified in the Content-Type header field). Since the code point 9986 does not exist in ISO 8859-1, it is reasonable that nothing would be printed.

Since HTML 4, numeric character references are interpreted as Unicode code points. So echo '&#9986;&#10004;&#10013;'; should print ✂✔✝ regardless of what the Content-Type header field says about the character set. It is reasonable to call ✂✔✝ "garbled data" if one is not familiar with the Unicode Dingbats block.
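
To check which Unicode code points those references name, a small sketch like the following can help. The helper name codePoints is made up for this illustration, and html_entity_decode is only used here to mimic on the PHP side what a browser does with the references; it is not part of the exam question.

```php
<?php
// Hypothetical helper (not from the exam question): list the Unicode
// code points named by decimal numeric character references.
function codePoints(string $html): array
{
    preg_match_all('/&#(\d+);/', $html, $matches);
    return array_map(
        fn (string $n): string => sprintf('U+%04X', (int) $n),
        $matches[1]
    );
}

$refs = '&#9986;&#10004;&#10013;';

// Decode the references into real characters, encoded as UTF-8.
$decoded = html_entity_decode($refs, ENT_QUOTES | ENT_HTML401, 'UTF-8');

echo implode(' ', codePoints($refs)), "\n"; // U+2702 U+2714 U+271D
echo $decoded, "\n";                        // ✂✔✝
```

All three code points fall in the Dingbats block (U+2700 to U+27BF), which is why the browser shows symbols rather than letters.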

Upvotes: 2
