user1590642

Reputation: 199

PHP / ASCII: seriously mind-boggling situation with non-printable characters

Can someone tell me why these two tags behave differently? (The first string doesn't work when I upload it to the server; the second is fine.)

<a href="http://www.example.com">a</a>
<a href="http://www.example.com">a</a>

I converted both strings to hex values, and the first string seems to contain at least one extra character:

3c6120687265663d223f687474703a2f2f7777772e6578616d706c652e636f6d223e613c2f613e0d0a
3c6120687265663d22  687474703a2f2f7777772e6578616d706c652e636f6d223e613c2f613e

The second string is handwritten; the first is generated by this PHP code:

<?php
$handle = @fopen("./data/test.txt", "r");
$homepage = trim(fgets($handle, 4096));
?>

<a href="<?php echo $homepage;?>">a</a>

The first line of test.txt is:

http://www.example.com

followed by a few more lines of text.

Moreover, the code for the invisible character seems to be 3f, which is a question mark. That should be visible, right?

Upvotes: 0

Views: 327

Answers (1)

zerkms

Reputation: 254944

The file starts with a UTF-8 BOM (byte order mark): EF BB BF.
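One way to confirm this is to dump the raw bytes of the line itself rather than a converted copy of it. The following is a minimal sketch, assuming the first line of the file is held in a hypothetical variable $line (hard-coded here for illustration). A 3f in a hex dump is often a sign that the tool converted the text to a single-byte charset first and substituted "?" for bytes it could not represent; that is a guess about the conversion used in the question, not something the question confirms.

```php
<?php
// Hypothetical sample: the first line of test.txt as it would be read
// if the file begins with a UTF-8 BOM.
$line = "\xEF\xBB\xBFhttp://www.example.com";

// bin2hex() reports the bytes exactly as stored, with no charset conversion.
$prefix = bin2hex(substr($line, 0, 3));
$hasBom = ($prefix === 'efbbbf');

echo $prefix, PHP_EOL;
echo $hasBom ? 'BOM present' : 'no BOM', PHP_EOL;
```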

The correct solution is to fix whatever writes the data to the file so that it does not emit a BOM. If the file is static, just remove the BOM once with any sufficiently capable text editor (such as Notepad++). As a quick-and-dirty workaround, you can also strip it at runtime:

// Strip a leading UTF-8 BOM (bytes EF BB BF) if present
if (substr($homepage, 0, 3) === pack('CCC', 0xef, 0xbb, 0xbf)) {
    $homepage = substr($homepage, 3);
}
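The runtime check can also be folded into the code that reads the file. The sketch below is one possible arrangement, not part of the original answer: read_first_line_without_bom is a hypothetical helper name, and the 4096-byte line limit is carried over from the question's code.

```php
<?php
// Hypothetical helper: returns the first line of a file with a leading
// UTF-8 BOM and surrounding whitespace removed, or null on failure.
function read_first_line_without_bom(string $path): ?string
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null; // file missing or unreadable
    }

    $line = fgets($handle, 4096);
    fclose($handle);
    if ($line === false) {
        return null; // empty file
    }

    // A BOM can only sit at the very start of the file, so only the
    // first line needs checking.
    if (strncmp($line, "\xEF\xBB\xBF", 3) === 0) {
        $line = substr($line, 3);
    }

    // trim() also drops the trailing CR/LF seen in the hex dump.
    return trim($line);
}
```

In the template from the question, the result could then be echoed into the href attribute, ideally passed through htmlspecialchars() first, since the value comes from a file.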

Upvotes: 3
