Qohelet

Reputation: 1616

Parsing of CSV-File leads to "NBSP" and "SPA" Characters in Windows

I am parsing a CSV document to write a specification in SysML v2, which is essentially just a simple text file. Parsing it on Linux I get the result I desire. I use a SysOn-Docker to display my SysML v2 file and everything works as it's supposed to.

However, when I create the file on Windows, special characters appear:

NBSP and SPA Characters

It seems that, because of these characters, the SysOn Docker container can't read the file properly on Windows (under Linux there are no issues at all).

I have tried several ways to write the file differently:

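import codecs
import io

# Attempt 1: default encoding (platform-dependent)
with open(filename, "w") as text_file:
    text_file.write(systring)

# Attempt 2: UTF-8 with a byte order mark
with codecs.open(filename, "w", "utf-8-sig") as text_file:
    text_file.write(systring)

# Attempt 3: plain UTF-8
with io.open(filename, "w", encoding="utf-8") as text_file:
    text_file.write(systring)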

However, all of them give the same result: the file doesn't change.

Right now I'm really considering removing all of these special characters manually with a .replace - but it doesn't seem to be the proper way?

Upvotes: 0

Views: 14

Answers (1)

Mark Tolonen

Reputation: 178021

It looks like your text viewer is Notepad++. NBSP is U+00A0 NO-BREAK SPACE, and SPA is U+0096, a C1 control character (START OF GUARDED AREA). In many text viewers (probably including your Linux one), NBSP displays as an ordinary space and control characters are often zero-width and invisible, but Notepad++ makes them visible.
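If the characters are already in the string before it is written, changing the output encoding will not remove them, which would explain why all three write attempts give the same file. The following is a minimal sketch, not a definitive fix, for locating such characters and optionally cleaning them. The function names find_suspicious_chars and clean_text and the sample string are hypothetical and do not come from the question.

```python
import unicodedata

KEEP = "\t\r\n"  # control characters that are normal in text files


def find_suspicious_chars(text):
    """Return (index, code point, name) for NBSP and unexpected control characters."""
    hits = []
    for i, ch in enumerate(text):
        if ch in KEEP:
            continue
        if ch == "\u00a0" or unicodedata.category(ch) == "Cc":
            # Control characters have no Unicode name, so supply a default.
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<control>")))
    return hits


def clean_text(text):
    """Replace NBSP with a plain space and drop unexpected control characters."""
    out = []
    for ch in text:
        if ch == "\u00a0":
            out.append(" ")
        elif ch in KEEP or unicodedata.category(ch) != "Cc":
            out.append(ch)
    return "".join(out)


# Hypothetical sample standing in for the generated SysML v2 text.
sample = "part\u00a0def Engine\u0096 {\n}"
print(find_suspicious_chars(sample))
print(clean_text(sample))
```

Running find_suspicious_chars on the string before writing it shows whether the characters come from the CSV input rather than from the write step. Dropping U+0096 outright is an assumption about what you want: byte 0x96 is an en dash in Windows-1252, so if the CSV was saved in that encoding, reading it with encoding="cp1252" may be the better fix than stripping the character afterwards.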

Upvotes: 0
