jefftimesten

Reputation: 366

Python: unicode in system commands

Suppose I have an arbitrary unicode string in Python (2.7) that I want to feed to a command-line program such as ImageMagick (or really just get out of Python in any way). The strings might be:

So in Python I might make a little command like this:

cmd = u'convert -pointsize 24 label:"%s" "%s.png"' % (name, name)

If I just print cmd and get convert -pointsize 24 label:"Jörgen Jönsson" "Jörgen Jönsson.png" and then run it myself, everything is fine.

But if I do os.system( cmd ), I get this:

I know it's not an ImageMagick problem because the filenames are messed up too. I know that Python is converting the command to ASCII when it passes it off to os.system, but why is it getting the encoding so wrong? Why is it interpreting each non-ASCII character as two characters? According to a few articles I've read, it might be because the string is encoded as Latin-1 but read as UTF-8. I've tried encoding it back and forth between the two, and that doesn't help.
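For illustration, here is a minimal sketch of one way the two-characters-per-letter symptom can arise. It assumes the bytes are UTF-8 and are then read back as Latin-1, which is only a guess at what is happening here; the sample character and variable names are made up for the example.

```python
# Hypothetical illustration: the sample character and the choice of
# encodings are assumptions, not taken from the actual garbled output.
original = u'\xf6'  # o with diaeresis, as in the name from the question

# UTF-8 represents this single character as two bytes (0xC3 0xB6).
utf8_bytes = original.encode('utf-8')

# Decoding those bytes as Latin-1 maps each byte to its own character,
# so one letter turns into two.
misread = utf8_bytes.decode('latin-1')

original_length = len(original)
misread_length = len(misread)
```

If this is the cause, the data itself is intact and only the decoding step on the receiving side is wrong.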

I get Unicode exceptions when I try to encode it manually as ASCII without a replacement argument, but if I do name.encode('ascii','xmlcharrefreplace'), I get the following:

I'm hoping that someone recognizes this particular kind of encoding problem and can offer some advice, because I'm about out of ideas.

Thanks!

Upvotes: 5

Views: 6842

Answers (1)

jterrace

Reputation: 67063

Use subprocess.call with a list of arguments instead of os.system:

>>> s = u'Jörgen Jönsson'
>>> import subprocess
>>> subprocess.call(['echo', s])
Jörgen Jönsson
0
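Applied to the convert command from the question, this might look like the sketch below. The helper names build_convert_args and render_label are hypothetical, and it assumes ImageMagick's convert is on the PATH and that the locale encoding (e.g. UTF-8) can represent the name.

```python
import subprocess


def build_convert_args(name):
    """Build the argument list for convert (hypothetical helper).

    Each list item is passed to the program as one argument, so the
    name needs no shell quoting even if it contains spaces.
    """
    return [
        u'convert',
        u'-pointsize', u'24',
        u'label:%s' % name,
        u'%s.png' % name,
    ]


def render_label(name):
    """Run convert without going through a shell and return its exit code."""
    return subprocess.call(build_convert_args(name))
```

Calling render_label(u'J\xf6rgen J\xf6nsson') would then run convert directly. Because no shell is involved, the quotes around the label and filename in the original command string are no longer needed.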

Upvotes: 15
