flybonzai

Reputation: 3931

"UnicodeEncodeError: 'charmap' codec can't encode characters" triggered by CSV module

I just started a new job where we develop on Macs, but our server runs Windows. I've migrated my new code over there (it runs fine on the Mac), and suddenly I'm getting this traceback:

Traceback (most recent call last):
  File ".\jira_oauth.py", line 260, in <module>
    writer.writerow(fields)
  File "C:\Anaconda3\lib\csv.py", line 153, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
  File "C:\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 1368-1369: character maps to <undefined>

I'll post the lines in my code that are leading up to it:

        for cycle in range(pulls):
            logging.info('On cycle: {}'.format(cycle))
            data_url = DATA_URL + mk_data_endpoint(max_res, start_at)
            logging.info('Max Results: {} - Starting Point: {}'.format(
                max_res, start_at
            ))
            # Pull down data and transform into dictionary
            data = json.loads(o.get_data(data_url).decode('utf-8'))
            for issue in data['issues']:
                fields = issue['fields']
                fields['id'] = issue['id']
                fields['key'] = issue['key']
                clean_data(fields)
                split_entries(fields, 'project', 'project_key', 'project_name')
                fields_keys = list(fields.keys())
                for key in fields_keys:
                    if key in lookup.keys():
                        info = lookup.get(key)
                        val = fields.pop(key)
                        # The lookup table is a dictionary with the column
                        # names that come out of Jira as the key, and a tuple
                        # containing the corresponding column name in the
                        # first position, and optional nested levels that
                        # must be traversed to return the value we are looking
                        # for.
                        if len(info) <= 1 or not val:
                            fields[info[0]] = val
                        else:
                            fields[info[0]] = nested_vals(val,
                                                          info[1:],
                                                          key)
                # Add custom fields

                hash = md5()
                hash.update(json.dumps(fields).encode('utf-8'))
                try:
                    fields['time_estimate'] = int(fields['time_estimate'])
                except (KeyError, TypeError):
                    pass
                fields['etl_checksum_md5'] = hash.hexdigest()
                fields['etl_process_status'] = ETL_PROCESS_STATUS
                fields['etl_datetime_local'] = ETL_DATETIME_LOCAL
                fields['etl_pdi_version'] = ETL_PDI_VERSION
                fields['etl_pdi_build_version'] = ETL_PDI_BUILD_VERSION
                fields['etl_pdi_hostname'] = ETL_PDI_HOSTNAME
                fields['etl_pdi_ipaddress'] = ETL_PDI_IPADDRESS

                writer.writerow(fields)

The line it is dying on is the writerow call. I have the same version of Python installed on both machines (via Anaconda3), and from other answers it looks like the usual cause is Windows being unable to print Unicode to the console. Since mine is dying inside DictWriter rather than a print, I'm not sure that applies here...
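For what it's worth, here is a minimal snippet that reproduces the same error on the Windows box without the CSV module at all (the string is just an example containing a character cp1252 can't represent):

# cp1252 has no mapping for characters like '✓' (U+2713), so encoding raises
# the same 'charmap' UnicodeEncodeError whether it happens in print() or in
# the csv writer -- the console isn't actually involved.
'summary with a checkmark ✓'.encode('cp1252')
# UnicodeEncodeError: 'charmap' codec can't encode character '\u2713' in
# position 25: character maps to <undefined>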

Upvotes: 2

Views: 1201

Answers (1)

flybonzai

Reputation: 3931

Solved by adding encoding='utf-8' to my open() call... So weird that it just works on the Mac, where the default encoding is already UTF-8, while Windows falls back to cp1252 and blows up.
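For anyone landing here, a minimal sketch of the fix (the file name and field names are placeholders, not from the real script):

import csv

# Placeholder field names -- the real script builds these from the Jira payload.
fieldnames = ['id', 'key', 'summary']

# encoding='utf-8' is the actual fix: without it, open() on Windows falls back
# to the locale codec (cp1252 in the traceback above), which can't encode every
# character Jira returns. newline='' is what the csv docs recommend so the
# writer doesn't emit extra blank lines on Windows.
with open('jira_issues.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'id': '10001', 'key': 'PROJ-1', 'summary': 'Fix the ✓ bug'})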

Upvotes: 1
