Reputation: 251
Suppose I have the data below. Even though I used #coding=utf-8 to define the default encoding, the output still shows ??? instead of the Chinese strings.
#coding=utf-8
import pandas as pd
df = pd.DataFrame({
    '日期': ['2015-01-07', '2014-12-17', '2015-01-21', '2014-11-19', '2015-01-17',
           '2015-02-26', '2015-01-04', '2014-12-20', '2014-12-07', '2015-01-06'],
    '股票代码': ['600795', '600268', '002428', '600031', '002736',
             '600216', '000799', '601600', '601939', '000898'],
})
print df
Upvotes: 0
Views: 2598
Reputation: 879441
Try adding pd.options.display.encoding = sys.stdout.encoding near the top of your file, after importing sys and pandas. By default, pandas uses utf-8 when it encodes unicode strings for display. Python sets sys.stdout.encoding to the encoding it detects your console or terminal is using, so this setting makes pandas encode its output the way your console expects.
import sys
import pandas as pd
pd.options.display.encoding = sys.stdout.encoding
df = pd.DataFrame(
    {'日期': ['2015-01-07', '2014-12-17', '2015-01-21', '2014-11-19',
            '2015-01-17', '2015-02-26', '2015-01-04', '2014-12-20',
            '2014-12-07', '2015-01-06'],
     '股票代码': ['600795', '600268', '002428', '600031', '002736', '600216',
              '000799', '601600', '601939', '000898']})
print(df)
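If the output still looks wrong, it may help to check which encoding Python detected and whether that encoding can represent Chinese text at all. The following is a minimal sketch using only the standard library; can_encode is a hypothetical helper written for illustration, not part of pandas or Python, and the "ascii" fallback is an assumption for the case where no encoding was detected.

```python
# -*- coding: utf-8 -*-
import sys


def can_encode(text, encoding):
    """Hypothetical helper: True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except (UnicodeEncodeError, LookupError):
        # LookupError covers an unknown codec name
        return False


sample = u"日期"

# sys.stdout.encoding can be None when Python could not detect an encoding
# (for example when output is piped); falling back to "ascii" is an assumption.
console_encoding = sys.stdout.encoding or "ascii"
print(console_encoding, can_encode(sample, console_encoding))

# With errors="replace", each character the codec cannot represent becomes "?"
replaced = sample.encode("ascii", "replace")
print(replaced)
```

If the detected encoding cannot represent the characters, changing the pandas option alone will not help; the console or terminal itself would need an encoding that covers Chinese, such as utf-8 or gbk.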
Note that even though you defined the column names with plain (byte) strings, pandas converts them to unicode:
In [158]: df.columns
Out[158]: Index([u'日期', u'股票代码'], dtype='object')
This is why, when you print(df), pandas uses pd.options.display.encoding to encode these values.
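To illustrate why the chosen encoding matters, here is a small sketch with no pandas involved. It shows that the same unicode label turns into different bytes under different encodings, so the bytes written out need to match what the console decodes. encode_for_display is a hypothetical helper that only mimics the idea; it is not how pandas is implemented internally.

```python
# -*- coding: utf-8 -*-

# Bytes as they would appear in a utf-8 encoded source file
raw_name = u"股票代码".encode("utf-8")

# The unicode object, comparable to the column label shown in df.columns
name = raw_name.decode("utf-8")


def encode_for_display(text, encoding):
    """Hypothetical helper: encode unicode text for a console using `encoding`."""
    return text.encode(encoding, "replace")


utf8_bytes = encode_for_display(name, "utf-8")
gbk_bytes = encode_for_display(name, "gbk")

# Same text, different byte sequences
print(len(utf8_bytes), len(gbk_bytes))
```

A console that expects gbk but receives the utf-8 bytes (or the reverse) will show garbled text, which is why the answer ties the pandas setting to sys.stdout.encoding.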
Upvotes: 1