richwol

Reputation: 1165

PHP - incorrect string length caused by HTML-encoded quote

I have the following string coming from a database: Let's Get Functional

If I run this through strlen() it returns 25 characters instead of the expected 20. A var_dump() shows the string exactly as above (no HTML entities etc.).

If I remove the single quote strlen returns 19 characters.

Obviously the quote is returning 5 characters instead of 1 - why? How can I stop this?

Thanks!

Upvotes: 2

Views: 9214

Answers (5)

deformhead

Reputation: 109

The HTML entity for ' is &#39;, which is 5 characters long, so your strlen() result is perfectly correct. You don't see the entity because your browser renders it as a plain quote. Open the page source to see the actual PHP output.

To avoid this problem, don't apply htmlspecialchars() or an equivalent encoding to input; it should only be used on output, in an HTML context.
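A minimal sketch of that advice, assuming the value is stored raw and encoded only when it is written into HTML. The variable names are hypothetical. Note that htmlspecialchars() with ENT_QUOTES emits the six-character form &#039;, so measuring the encoded copy of this string gives 25.

```php
<?php
// Hypothetical raw value, as it would be stored in the database.
$title = "Let's Get Functional";

// Measure the raw value, before any HTML encoding.
$rawLength = strlen($title);

// Encode only at the point where the value enters an HTML context.
$encoded = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

echo $rawLength, "\n";        // 20
echo strlen($encoded), "\n";  // 25
echo "<h1>$encoded</h1>\n";
```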

As a temporary workaround, you can apply html_entity_decode() (with ENT_QUOTES, so that quote entities are decoded too) before strlen().
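The workaround could look roughly like this. The input string is a hypothetical stand-in for the value coming from the database. ENT_QUOTES is passed explicitly because, before PHP 8.1, the default flags left single-quote entities undecoded.

```php
<?php
// Hypothetical input: the entity-encoded form of the title.
$encoded = "Let&#39;s Get Functional";

// Decode entities, including quote entities, before measuring.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

echo strlen($encoded), "\n"; // 24
echo strlen($decoded), "\n"; // 20
```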

Upvotes: 5

a basnet

Reputation: 17

I had the same problem as you; maybe this will help someone.

The single quote was converted to "&#39;", which was giving me an incorrect result. Simply replacing that string with a single quote solved my problem.

$string = "Let&#39;s Get Functional"; // string from POST or database

echo strlen($string); // 24 (25 if the entity is the six-character &#039;)

$string = str_replace("&#39;", "'", $string);
echo strlen($string); // 20

Upvotes: -1

Vladimir Kirilov

Reputation: 59

As @deformhead already explained, it seems that your apostrophe has been converted to the HTML entity &#39;. My guess is that somewhere between getting the string out of the database and calling strlen() on it, you call htmlentities().

You can also check how many characters you get from the database in your select query with CHAR_LENGTH() (MySQL).

Another issue to consider is that strlen() counts bytes, not characters, so it does not work well for multibyte strings. If you will be working with non-ASCII characters, you are better off using mb_strlen() with the correct encoding. That, however, would not explain the 5-character difference in your result.
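To illustrate the byte/character distinction, here is a small sketch. It assumes the mbstring extension is available and uses a made-up example word; the "\u{...}" escape requires PHP 7 or later.

```php
<?php
// "\u{00EF}" is the letter ï, written as an escape so the example does not
// depend on the encoding of the source file.
$word = "na\u{00EF}ve";

echo strlen($word), "\n";              // 6: ï occupies two bytes in UTF-8
echo mb_strlen($word, 'UTF-8'), "\n";  // 5: five characters
```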

Hope that helps.

Upvotes: 1

srain

Reputation: 9082

That cannot be.

<?php
$str = "Let's Get Functional";
echo strlen($str), "\n"; // 20

Look at code output here.

How to debug?

Print the ASCII code of each character:

$str = "Let's Get Functional";
$len = strlen($str);
for ($i = 0; $i < $len; $i++)
{
    // index, character, and its byte value, tab-separated
    echo "$i\t", $str[$i], "\t", ord($str[$i]), "\n";
}

This is the result:

0   L       76
1   e       101
2   t       116
3   '       39
4   s       115
5           32
6   G       71
7   e       101
8   t       116
9           32
10  F       70
11  u       117
12  n       110
13  c       99
14  t       116
15  i       105
16  o       111
17  n       110
18  a       97
19  l       108
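The same technique, applied to a hypothetical entity-encoded value, shows why it helps: the bytes of the entity show up individually, even when a browser would render them as a single quote.

```php
<?php
// Hypothetical value as it might really arrive, with the quote entity-encoded.
$str = "Let&#39;s";

// Collect the byte value of every character so hidden ones become visible.
$codes = array_map('ord', str_split($str));

// 38 35 51 57 59 are the bytes of "&#39;".
echo implode(' ', $codes), "\n"; // 76 101 116 38 35 51 57 59 115
```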

Upvotes: 1

Gravy

Reputation: 12445

<?php

$string = "Let's Get Functional";

echo strlen($string);

?>

This code prints 20.

Upvotes: -1
