Reputation: 2022
I have a Unicode string, which is read from a CSV file:
In [41]: df.iloc[0,1]
Out[41]: u'EU-repr\xe6sentant udpeget'
In [42]: type(df_translated.iloc[0,1])
Out[42]: unicode
I would like to have it written as EU-repræsentant udpeget. The final goal is to put this into a dictionary and then save that dict to a YAML file with PyYAML using safe_dump. However, I am struggling with the encoding.
Upvotes: 2
Views: 2398
Reputation: 76568
If you really need to use PyYAML, you should pass the arguments encoding='utf-8' and allow_unicode=True to the safe_dump() routine.
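As a minimal sketch of that call (the dictionary key 'title' is made up for illustration, and default_flow_style=False is only there so that older and newer PyYAML releases produce the same block-style layout):

```python
import yaml  # PyYAML

# 'title' is a hypothetical key; the value is the string from the question
data = {u'title': u'EU-repr\xe6sentant udpeget'}

# With encoding set and no stream given, safe_dump returns UTF-8 encoded bytes.
# allow_unicode=True keeps the non-ASCII character instead of escaping it.
dumped = yaml.safe_dump(
    data,
    encoding='utf-8',
    allow_unicode=True,
    default_flow_style=False,
)

# Decode only to inspect the result as text
text = dumped.decode('utf-8')
print(text)
```

To write a file instead, the same keyword arguments can be passed together with a stream opened in binary mode, for example open('out.yaml', 'wb'), where the file name is again just a placeholder.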
If you ever intend to upgrade to YAML 1.2 and use ruamel.yaml (disclaimer: I am the author of that package), those are the (much more sensible) defaults:
import sys
import ruamel.yaml
yaml = ruamel.yaml.YAML()
data = [u'EU-repr\xe6sentant udpeget']
yaml.dump(data, sys.stdout)
which gives:
- EU-repræsentant udpeget
Upvotes: 3