Ethan Hines

Reputation: 21

Difference between "&#9;" and "&nbsp;" or "&#32;"

Hello, I am trying to compile an EPUB 2.0 file from HTML code extracted from InDesign. I have noticed a lot of "special characters" at either the beginning or the end of paragraphs. For example:

<p class="text_indent0px font_size0_8em line_height1_325 margin_bottom1px margin_left0px margin_right0px sans_serif floatleft">E<span class="small_caps">VELYNE</span>&#9;</p>

What is this

&#9; 

and can I either get rid of it or replace it with "&nbsp;"?

Upvotes: 2

Views: 2531

Answers (6)

Cylian

Reputation: 11182

There are four ways to write a character in HTML:

  1. Using decimal character codes (regex-pattern: &#[0-9]+;),
  2. Using hexadecimal character codes (regex-pattern: &#x[A-Fa-f0-9]+;),
  3. Using named character codes (regex-pattern: &[A-Za-z][A-Za-z0-9]*;),
  4. Using the actual characters (regex-pattern: .).
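As a rough illustration, the patterns in this list can be used to tell the notations apart. This is only a sketch in Python: the helper name `classify_reference` is made up for the example, and the patterns are matched case-insensitively and allow digits in names, since hexadecimal digits and entity names (for example &frac12;) are not limited to lowercase letters.

```python
import re

# One pattern per notation from the list. The hex and named patterns are
# case-insensitive, and names may contain digits (an assumption added here).
PATTERNS = [
    ("hexadecimal", re.compile(r"&#x[0-9a-f]+;", re.IGNORECASE)),
    ("decimal", re.compile(r"&#[0-9]+;")),
    ("named", re.compile(r"&[a-z][a-z0-9]*;", re.IGNORECASE)),
]


def classify_reference(text):
    """Return the notation used by `text` (hypothetical helper)."""
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(text):
            return kind
    # Anything that is not a character reference is treated as literal text.
    return "literal"


for sample in ["&#9;", "&#xEB;", "&nbsp;", "ë"]:
    print(sample, "->", classify_reference(sample))
```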

All of these notations render the same way; only the way they are written differs. For example, to display a Latin small letter e with diaeresis, you could use any of the following:

  1. &#235; (decimal notation),
  2. &#xEB; (hexadecimal notation),
  3. &euml; (named notation),
  4. ë (actual character).
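One way to check that these four spellings are equivalent is to decode them. The sketch below uses Python's standard `html.unescape`; the variable names are only illustrative.

```python
import html

# The three reference notations from the list, plus the literal character.
notations = ["&#235;", "&#xEB;", "&euml;", "ë"]

# Decode each notation into the character it stands for.
decoded = [html.unescape(n) for n in notations]

# Every notation should decode to the same single character, U+00EB.
all_same = all(ch == "\u00eb" for ch in decoded)
print(decoded, all_same)
```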

Likewise, your question asks which to use: (a) &#9; (decimal notation), (b) &nbsp; (named notation), or (c) &#32; (decimal notation).

Unlike the example above, (a), (b) and (c) are not only three different notations; they also stand for three different characters.

For your information, (a) is a horizontal tab, (b) is a non-breaking space (&#160; in decimal notation), and (c) is the decimal notation for an ordinary space character.
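To see that these really are three distinct characters, their code points can be printed. This is a small Python sketch using the standard `html` and `unicodedata` modules; the tab is a control character without a Unicode name, so a placeholder label is supplied for it.

```python
import html
import unicodedata

references = ["&#9;", "&nbsp;", "&#32;"]

# Map each reference to the code point of the character it decodes to.
code_points = {ref: ord(html.unescape(ref)) for ref in references}

for ref, cp in code_points.items():
    # Control characters such as the tab have no name, hence the default.
    name = unicodedata.name(chr(cp), "<control character>")
    print(f"{ref:8} U+{cp:04X} (decimal {cp}) {name}")
```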

Technically, whitespace at the end of a paragraph is meaningless, so you can safely discard all of it. If you do need to preserve whitespace, use it inside <pre> elements, not in <p> or <div>.
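If you decide to discard trailing whitespace, a find/replace along these lines could do it. This is a minimal Python sketch, not a full HTML processor: the function name is hypothetical, the sample class list is shortened, it only looks at whitespace directly before a closing </p>, and it deliberately leaves &nbsp; alone because that is often placed on purpose.

```python
import re

# A run of tab/space references or literal ASCII whitespace that sits
# immediately before a closing </p> tag. &nbsp; is intentionally excluded.
TRAILING_WS = re.compile(r"(?:&#9;|&#32;|[ \t\r\n])+(?=</p>)")


def strip_trailing_whitespace(markup):
    """Remove whitespace found right before </p> (hypothetical helper)."""
    return TRAILING_WS.sub("", markup)


sample = '<p class="floatleft">E<span class="small_caps">VELYNE</span>&#9;</p>'
cleaned = strip_trailing_whitespace(sample)
print(cleaned)
```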

Hope this helps...

Upvotes: 0

Samir Das

Reputation: 1908

&nbsp; is the entity used to represent a non-breaking space.

&#32; is the decimal character code of the space typed with the keyboard space bar.

&#9; is the decimal character code of the horizontal tab.

&nbsp; and &#9; both represent whitespace, but &nbsp; is non-breaking, and a run of consecutive &nbsp; characters is not collapsed into one, whereas a run of &#9; characters collapses to a single space.

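&#9; is often treated as roughly 4 &nbsp; characters, but that is only an approximation: in normal flow the tab collapses to a single space, and where whitespace is preserved (e.g. inside <pre>) it advances to the next tab stop, which is 8 character widths by default.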

Upvotes: 0

Ruan Mendes

Reputation: 92274

In the HTML character reference &#{number};, {number} is the character's code point (the same as its ASCII code for values below 128). Therefore, &#9; is a tab, which typically collapses to a single space in HTML unless you use CSS (or the <pre> tag) to treat it as preformatted text.

Therefore, it's not safe to replace it with a non-breaking or a regular space unless you can guarantee that it's not being displayed as a tab anywhere.

div:first-child {
    white-space: pre;   
}
<div>&#9; Test</div>
<div>&#9; Test</div>
<pre>&#9; Test</pre>

See https://developer.mozilla.org/en-US/docs/Web/CSS/white-space and http://ascii.cl/
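Before replacing tabs in bulk, it may help to check whether any of them sit inside <pre>, where they are displayed as tabs. The sketch below uses Python's standard `html.parser`; the class and function names are invented for this example, and it only detects <pre> elements, not elements styled with `white-space: pre` in CSS.

```python
from html.parser import HTMLParser


class TabFinder(HTMLParser):
    """Counts tab characters inside and outside <pre> (illustrative only)."""

    def __init__(self):
        # Keep character references unconverted so handle_charref is called.
        super().__init__(convert_charrefs=False)
        self.pre_depth = 0
        self.in_pre = 0
        self.elsewhere = 0

    def _count(self, n):
        if self.pre_depth:
            self.in_pre += n
        else:
            self.elsewhere += n

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.pre_depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self.pre_depth:
            self.pre_depth -= 1

    def handle_charref(self, name):
        # name is "9" for &#9; and "x9" for &#x9;
        value = int(name[1:], 16) if name[0] in "xX" else int(name)
        if value == 9:
            self._count(1)

    def handle_data(self, data):
        # Literal tab characters count as well.
        self._count(data.count("\t"))


def count_tab_references(markup):
    """Return (tabs inside <pre>, tabs elsewhere)."""
    finder = TabFinder()
    finder.feed(markup)
    finder.close()
    return finder.in_pre, finder.elsewhere


print(count_tab_references("<div>&#9; Test</div><pre>&#9; Test</pre>"))
```

If the first number is zero and no stylesheet preserves whitespace, replacing the tabs is less likely to change how the text is displayed.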

Upvotes: 0

Rahul Tripathi

Reputation: 172408

&#9; represents the horizontal tab

Similarly, &#32; represents a space.

To replace &#9; you have to use &nbsp;&nbsp;&nbsp;&nbsp;

Upvotes: 0

Aibrean

Reputation: 6412

That would be a horizontal tab (i.e. the same as using the tab key).

If you want to replace it, I would suggest doing a find/replace using an ePub editor like Sigil (http://sigil-ebook.com/).

Upvotes: 0

Jakub Juszczak

Reputation: 7827

&#9; is the character reference for a tab (ASCII code 9), so I guess the paragraphs were indented with tabs.

If you want to replace them with &nbsp;, use four of them:

&nbsp;&nbsp;&nbsp;&nbsp;
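A bulk replacement like that could be scripted. This is a minimal Python sketch with a hypothetical function name; it assumes the tabs always appear as the exact text &#9; (not as &#x9; or literal tab characters).

```python
TAB_REFERENCE = "&#9;"
REPLACEMENT = "&nbsp;" * 4  # four non-breaking spaces per tab


def tabs_to_nbsp(markup):
    """Replace every &#9; reference with four &nbsp; (hypothetical helper)."""
    return markup.replace(TAB_REFERENCE, REPLACEMENT)


print(tabs_to_nbsp("<p>&#9;Text</p>"))
```

Note that in ordinary paragraphs a tab collapses to a single space, so the four non-breaking spaces will render wider than the tab did.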

Upvotes: 1
