Jiew Meng

Reputation: 88187

URL/HTML escaping/encoding

I have always been confused with URL/HTML encoding/escaping. I am using PHP, so I want to clear some things up.

Can I say that I should always use

Are there any other places where I might use each function? I am not good with all this escaping stuff and it always confuses me.

Upvotes: 20

Views: 31762

Answers (3)

ircmaxell

Reputation: 165193

First off, you shouldn't be using htmlentities() around 99% of the time. Instead, you should use htmlspecialchars() for escaping text for use inside XML and HTML documents.

htmlentities() is only useful for displaying characters that the character set of your page can't represent (for example, if your pages are in ASCII but you have some UTF-8 characters you would like to display). Instead, just make the whole page UTF-8 (it's not hard), and be done with it.
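
To make the difference concrete, here is a minimal sketch that runs the same string through both functions. The sample string is made up for illustration, and the `\u{00E9}` escape assumes PHP 7 or later.

```php
<?php
// Hypothetical sample text: one non-ASCII character (é) plus some markup.
$text = "Caf\u{00E9} <b>";

// htmlspecialchars() only touches the characters that are special in HTML.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts every character that has a named entity.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $special, "\n";  // Café &lt;b&gt;
echo $entities, "\n"; // Caf&eacute; &lt;b&gt;
```

On a UTF-8 page both results render identically in the browser, so the extra entity conversion buys nothing.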

As far as urlencode(), you hit the nail on the head.

So, to recap:

  • Inside HTML:

    <b><?php echo htmlspecialchars($string, ENT_QUOTES, "UTF-8"); ?></b>
    
  • Inside of a URL:

    $url = '?foo=' . urlencode('bar');
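
The two cases in the recap can be compared side by side. This is only a sketch: the input string is a made-up example chosen because it contains characters that are special in both contexts.

```php
<?php
// Hypothetical user input with characters special in HTML and in URLs.
$input = 'Tom & "Jerry" <3';

// HTML context: escape &, <, > and both quote styles.
$html = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// URL context: percent-encode the value of a single query parameter.
$url = '?q=' . urlencode($input);

echo $html, "\n"; // Tom &amp; &quot;Jerry&quot; &lt;3
echo $url, "\n";  // ?q=Tom+%26+%22Jerry%22+%3C3
```

The same input produces two different results because each function protects a different syntax, which is why they are not interchangeable.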
    

Upvotes: 34

troelskn

Reputation: 117417

That's about right, although htmlspecialchars() is fine as long as you get your charsets straight, which you should do anyway. I tend to use it so that I find out early if I have messed that up.

Also note that if you put a URL into an HTML context (say, in the href of an a tag), you need to HTML-escape it as well. So you'll often see something like:

echo "<a href='" . htmlspecialchars("?foo=" . urlencode($foo), ENT_QUOTES, "UTF-8") . "'>clicky</a>";
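
The effect of the second escaping step is easier to see with more than one parameter. The following sketch assumes two hypothetical values, `$foo` and `$bar`; the `bar` parameter is not part of the example above.

```php
<?php
// Hypothetical values for two query parameters.
$foo = 'a b';
$bar = 'x&y';

// Step 1: URL-encode each value while assembling the query string.
$query = '?foo=' . urlencode($foo) . '&bar=' . urlencode($bar);

// Step 2: HTML-escape the finished URL before placing it in the attribute.
$link = "<a href='" . htmlspecialchars($query, ENT_QUOTES, 'UTF-8') . "'>clicky</a>";

echo $link, "\n"; // <a href='?foo=a+b&amp;bar=x%26y'>clicky</a>
```

The ampersand inside the value was already turned into %26 by urlencode(), while the ampersand that separates the parameters becomes &amp; for the HTML attribute. The browser decodes &amp; back to & before requesting the URL.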

Upvotes: 19

Dharman

Reputation: 33238

If you are building a query string for your URL, then it's best to just use http_build_query() instead of manually encoding each part.

$params = [
    'param1' => 'some data',
    'param2' => 'something else',
];

echo '<a href="https://test.com?'.htmlspecialchars(http_build_query($params)).'">Link</a>';

Everything you output into HTML should be HTML-encoded too, even though there is only a very small chance that a properly encoded URL would break the HTML.
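
As a rough illustration of what each step produces, the sketch below prints the intermediate strings. Passing '&' as the separator explicitly is a choice made here so the result does not depend on the arg_separator.output ini setting; the RFC 3986 variant is an optional extra, not something the answer requires.

```php
<?php
$params = [
    'param1' => 'some data',
    'param2' => 'something else',
];

// Default encoding (PHP_QUERY_RFC1738): spaces become "+".
$query = http_build_query($params, '', '&');

// Optional variant: RFC 3986 encoding turns spaces into "%20".
$query3986 = http_build_query($params, '', '&', PHP_QUERY_RFC3986);

// HTML-escape the complete URL before writing it into the attribute.
$href = htmlspecialchars('https://test.com?' . $query, ENT_QUOTES, 'UTF-8');

echo $query, "\n";     // param1=some+data&param2=something+else
echo $query3986, "\n"; // param1=some%20data&param2=something%20else
echo $href, "\n";      // https://test.com?param1=some+data&amp;param2=something+else
```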

Upvotes: 0
