user56reinstatemonica8

Reputation: 34074

How are multiple spaces showing in HTML, without pre, nbsp or white-space: pre?

I'm sure this shouldn't be possible. Somehow, multiple spaces within paragraphs in my HTML markup aren't collapsing. They are not &nbsp;, not in a <pre> tag, not set to white-space: pre-wrap; or white-space: pre; and the behaviour is not changed by forcing style="white-space: normal;" on the element.

My understanding is that these are the only three ways to preserve whitespace and make two or more consecutive spaces show up in HTML.

So the question is: What else could cause sequential whitespace to show up as multiple spaces? There must be something else - but I can't look for it until I know what it is, and every source I find just talks about &nbsp;, <pre> and white-space: pre-wrap; or white-space: pre;

Key edit: Using Firebug, I tried deleting some of the offending whitespace and typing it back in again. When deleted and re-entered from the keyboard, the spaces behave as normal - no unexpected whitespace in the browser. So it must be some character that shows up in view source, text editors, etc. as a plain space, but actually behaves like &nbsp;. What can it be, and crucially, how can I identify it to remove it? The original source of the offending input is the WYSIWYG editor TinyMCE, so I'm adding that tag...


More detail: I have some HTML containing paragraph text that includes multiple spaces, like this (between the ...s is copied direct from Firefox view source):

<p> blah blah.... nothing  more  than  a ... blah blah </p>

As you can see, these are regular spaces, not &nbsp;. The document has &nbsp; elsewhere and they show up as such in view source, so they're not &nbsp;s somehow masquerading as normal spaces in the source.

Also, the CSS is not set to white-space: pre; or anything like that:

There are no <pre> tags in the document, anywhere. Find on <pre in the source finds nothing.

So it should show in the browser with one space between each word. It doesn't. It shows multiple spaces, as if it were <pre> or &nbsp; or white-space: pre;. But it isn't any of these.

There must be some other way of getting a white-space: pre;-like effect that I don't know about. What other ways are there to preformat whitespace and stop multiple spaces from collapsing? What could possibly be causing this?


A few background notes:

Upvotes: 4

Views: 6109

Answers (1)

TachyonVortex

Reputation: 8572

I'm guessing that the offending space characters in your source code are not SPACE (U+0020), but are actually NO-BREAK SPACE (U+00A0). Visually, they appear identical, but if you view your source code in a hex editor (which shows you the individual bytes in the file), you'll see a different encoding for these characters: in UTF-8, U+0020 is the single byte 20, while U+00A0 is the two bytes C2 A0.
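If a hex editor isn't handy, the same check can be roughly sketched in PHP. This is only an illustration: findNonAsciiChars is a hypothetical helper name, and it assumes the input is valid UTF-8 and that the mbstring extension (for mb_ord, PHP 7.2+) is available.

```php
<?php
// Hypothetical helper (not from the original answer): report every
// non-ASCII character in a UTF-8 string with its byte offset,
// code point, and raw bytes.
function findNonAsciiChars(string $html): array
{
    $found = [];
    // The /u modifier makes PCRE treat the subject as UTF-8;
    // PREG_OFFSET_CAPTURE records the byte offset of each match.
    // preg_match_all returns false on invalid UTF-8, so nothing is reported then.
    if (preg_match_all('/[^\x00-\x7F]/u', $html, $matches, PREG_OFFSET_CAPTURE)) {
        foreach ($matches[0] as [$char, $offset]) {
            $found[] = [
                'offset'    => $offset,
                'codepoint' => sprintf('U+%04X', mb_ord($char, 'UTF-8')),
                'bytes'     => strtoupper(bin2hex($char)),
            ];
        }
    }
    return $found;
}

// Sample input with two NO-BREAK SPACE characters hidden next to regular spaces.
$sample = "nothing \u{00A0}more \u{00A0}than";
foreach (findNonAsciiChars($sample) as $hit) {
    printf("offset %d: %s (bytes %s)\n", $hit['offset'], $hit['codepoint'], $hit['bytes']);
}
```

Any line reporting U+00A0 marks a no-break space that looks like a plain space in view source.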

Edit 1

This PHP code should find and replace the offending characters with regular spaces:

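// Build each character as a UTF-8 string by decoding its numeric HTML entity.
// Note: the 'HTML-ENTITIES' encoding is deprecated as of PHP 8.2.
$strNoBreakSpace = mb_convert_encoding('&#x00A0;', 'UTF-8', 'HTML-ENTITIES');
$strNormalSpace  = mb_convert_encoding('&#x0020;', 'UTF-8', 'HTML-ENTITIES');

// Replace every NO-BREAK SPACE in $strInput (assumed to be UTF-8) with a regular space.
$strInput = str_replace( $strNoBreakSpace, $strNormalSpace, $strInput );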

Edit 2

A simpler way of creating the two space characters:

// json_decode turns the JSON escape \uXXXX into the corresponding UTF-8 string.
$strNoBreakSpace = json_decode('"\u00A0"');
$strNormalSpace  = json_decode('"\u0020"');
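On PHP 7.0 or later, the same replacement can probably be written with a Unicode escape in a double-quoted string instead. A minimal sketch, assuming the input is UTF-8; replaceNoBreakSpaces is a hypothetical helper name, not part of the code above:

```php
<?php
// Hypothetical helper: replace every NO-BREAK SPACE (U+00A0) in a
// UTF-8 string with a regular SPACE (U+0020).
function replaceNoBreakSpaces(string $input): string
{
    // "\u{00A0}" is PHP 7+ syntax; it expands to the UTF-8 bytes C2 A0.
    return str_replace("\u{00A0}", ' ', $input);
}

$dirty = "nothing \u{00A0}more \u{00A0}than";
$clean = replaceNoBreakSpaces($dirty);
var_dump($clean === "nothing  more  than"); // bool(true)
```

After the replacement, the source contains only regular spaces, so the browser collapses each run to a single space as usual.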

Upvotes: 6
