Joost

Reputation: 394

Non-breaking space removed by Text API

I'm using the Microsoft Translator Text API to translate parts of a webpage. The platform we use inserts &nbsp; in the HTML to render empty lines, so part of the webpage can look like this:

<p>
  <span>This is a dummy text</span>
</p>
<p>
  <span>&nbsp;</span>
</p>

When I send this to the Microsoft Translator Text API, it returns the following HTML:

<p>
  <span>Il s’agit d’un texte factice</span>
</p>
<p>
  <span></span>
</p>

I've set the content type to text/html and escaped the HTML characters so that I can send it to the API (so &nbsp; is replaced with &amp;nbsp;). But the text returned by the API has completely lost the &nbsp;.

How can I prevent the API from removing the &nbsp; instances in the HTML? Or is this a bug in the API?

Upvotes: 2

Views: 1303

Answers (2)

Paul Dempsey

Reputation: 663

See the answer to "Microsoft Translator API - notranslate trimming leading space?" from Chris Wendt (Microsoft):

Translator trims leading and trailing space, and compresses any other white space to a single space. This is by design. Translator needs to move the words around freely to form the newly composed sentence, and wouldn't know what to do with the extra white space. A workaround would be to trim in your code before translation, and then restore the trimmed off pieces afterwards, depending on the context.

Line breaks and non-breaking spaces tend to be used for specific line layout based on the particular source text that would need to be laid out differently in another language in any case because of different word lengths and arrangements of the significant words.
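The trim-and-restore workaround from the quote could look roughly like the sketch below. It assumes you translate one text fragment at a time and pass in your own `translate` callable that wraps the API call; both `translate_keeping_edges` and that callable are hypothetical names, not part of the Translator API. Fragments that consist only of whitespace or &nbsp; are never sent at all, so they come back unchanged.

```python
import re
from typing import Callable

# A run of whitespace or literal &nbsp; entities. For str patterns, \s also
# matches the U+00A0 non-breaking space character itself.
_EDGE = r"(?:\s|&nbsp;)*"
_SPLIT = re.compile(rf"^({_EDGE})(.*?)({_EDGE})$", re.DOTALL)


def translate_keeping_edges(text: str, translate: Callable[[str], str]) -> str:
    """Translate only the core of `text`, then re-attach the trimmed edges.

    `translate` is a hypothetical callable wrapping the real API request.
    """
    lead, core, trail = _SPLIT.match(text).groups()
    if not core:
        # Nothing but whitespace / &nbsp;: skip the API call entirely.
        return text
    return lead + translate(core) + trail
```

For example, `translate_keeping_edges("&nbsp;", my_translate)` returns `"&nbsp;"` without calling `my_translate`, while a fragment such as `" Hello&nbsp;"` has only `"Hello"` sent for translation.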

Upvotes: 1

Microsoft Translator

Reputation: 278

Wrapping the &nbsp; in a notranslate span may help, since that prevents the content from being translated. You would have to try it to see whether it does indeed preserve the &nbsp; entity.
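If you want to try this, a minimal sketch of the wrapping step is shown below. The helper names `protect_nbsp` and `unprotect_nbsp` are made up for illustration, and whether the API actually keeps the entity inside the span is untested here. The sketch assumes the input contains no already-wrapped entities.

```python
# Markup used to shield each &nbsp; from the translator.
WRAPPED = '<span class="notranslate">&nbsp;</span>'


def protect_nbsp(html: str) -> str:
    """Wrap every &nbsp; entity in a notranslate span before sending."""
    return html.replace("&nbsp;", WRAPPED)


def unprotect_nbsp(html: str) -> str:
    """Remove the wrapper spans again after the translation comes back."""
    return html.replace(WRAPPED, "&nbsp;")
```

You would call `protect_nbsp` on the HTML before sending it and `unprotect_nbsp` on the response, so the extra spans never reach the rendered page.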

Upvotes: 1
