flybonzai

Reputation: 3941

CSV module "UnicodeEncodeError" when using DictWriter.writerows

I'm setting up a prod environment on a new Mac server that should mirror my dev environment. The job runs without a hitch on my dev computer, but on the server I'm getting this traceback:

Traceback (most recent call last):
  File "/usr/local/share/Code/PycharmProjects/etl3/jira_scripts/jira_issues_incremental.py", line 189, in <module>
    writer.writerows(rows)
  File "/usr/local/bin/anaconda3/envs/etl3/lib/python3.5/csv.py", line 156, in writerows
    return self.writer.writerows(map(self._dict_to_list, rowdicts))
UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 1195: ordinal not in range(128)

This job is being run through the Run Shell Script action in the Automator app. I've checked sys.getdefaultencoding() in the Automator terminal, as well as on the machine itself. Everything says utf8. I've also checked the encoding in my PostgreSQL database, and that is also set to UTF8. Here is my open statement for the file that the DictWriter is writing to:

    with open(loadfile, 'w') as outf:
        writer = csv.DictWriter(
            f=outf,
            delimiter='|',
            fieldnames=fieldnames,
            extrasaction='ignore',
            escapechar=r'/',
            quoting=csv.QUOTE_MINIMAL
        )
        writer.writerows(rows)

I'm a little stumped as to where to even start tracking down this error, since all the default encodings seem to be correct. I should mention that this file is afterwards copied to a PostgreSQL database using psycopg2's cursor.copy_from, so the file should be written in a mode compatible with that.

Upvotes: 1

Views: 167

Answers (1)

Martijn Pieters

Reputation: 1123890

You did not specify an encoding when opening the file, so Python falls back to the default codec for your system. On the server, that is currently ASCII. See the open() documentation:

In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding.

Specify a different codec instead. UTF-8 would work:

    with open(loadfile, 'w', encoding='utf8') as outf:
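
For a fuller picture, here is a minimal sketch of the write from the question with the encoding made explicit. The sample `fieldnames`, `rows`, and the temporary output path are made up for illustration; `newline=''` is an addition that the csv module documentation recommends for file objects and was not in the original code.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for the question's `fieldnames` and `rows`.
fieldnames = ['key', 'summary']
rows = [{'key': 'ETL-1', 'summary': 'User\u2019s report', 'unused': 'dropped'}]

# Hypothetical output path; the question's `loadfile` would go here.
loadfile = os.path.join(tempfile.mkdtemp(), 'issues.csv')

# An explicit encoding makes the result independent of the process locale.
with open(loadfile, 'w', encoding='utf-8', newline='') as outf:
    writer = csv.DictWriter(
        outf,
        fieldnames=fieldnames,
        delimiter='|',
        extrasaction='ignore',  # keys not in fieldnames are silently skipped
        escapechar='/',
        quoting=csv.QUOTE_MINIMAL,
    )
    writer.writerows(rows)

# Read the raw bytes back to see how the non-ASCII character was stored.
with open(loadfile, 'rb') as inf:
    data = inf.read()
print(data)
```

Reading the bytes back shows U+2019 stored as the three-byte UTF-8 sequence E2 80 99, which is the character the ASCII codec refused to encode.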

sys.getdefaultencoding() doesn't apply here; that's merely the default for unqualified str.encode() calls.
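
To see the difference on the server, a small diagnostic along these lines could be run from the same Automator action that runs the job. This is only a sketch: the variable names are illustrative, and the idea that Automator starts the script with a minimal environment (no LANG or LC_* set) is a plausible explanation, not something confirmed in the question.

```python
import locale
import os
import sys

# Default for str.encode() / bytes.decode() called without an argument.
# In CPython 3 this is 'utf-8' regardless of the locale.
default_codec = sys.getdefaultencoding()

# What open() falls back to in text mode when no encoding is passed.
file_codec = locale.getpreferredencoding(False)

print('sys.getdefaultencoding():        ', default_codec)
print('locale.getpreferredencoding(False):', file_codec)

# Environment variables that commonly determine the locale on macOS and Linux.
for name in ('LC_ALL', 'LC_CTYPE', 'LANG'):
    print(name, '=', os.environ.get(name))
```

If the second value differs between an interactive terminal and the Automator run, that would account for the job working in one place and failing in the other.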

Upvotes: 2
