EVA ZHAO

Reputation: 23

How to read a CSV file with Chinese characters in Python

My CSV file shows garbled text where there should be Chinese characters. I want to read the file into Python so that the Chinese characters display correctly. How do I do that? I tried pandas.read_csv with encodings such as gb2312 and gb18030, but they all raise an error like: UnicodeDecodeError: 'gb2312' codec can't decode byte 0xae in position 4: illegal multibyte sequence

My data:

CODE NAME LISTDATE FOUNDDATE TIME DATE EPTTM INDUSTRY LISTCITY
000001.SZ Âπ≥ÂÆâÈì∂Ë°å 3/4/1991 19871222 8 1/1/2007 0.030477768 Ω»⁄∑˛ŒÒ …Ó€⁄
000002.SZ ‰∏áÁßëA 29/1/1991 19840530 8 1/1/2007 0.025771537 ∑øµÿ≤˙ …Ó€⁄
000004.SZ ÂõΩÂÜúÁßëÊäÄ 14/1/1991 19860505 8 1/1/2007 -0.05297144 “Ω“©…˙ŒÔ …Ó€⁄
000005.SZ ‰∏ñÁ∫™ÊòüÊ∫ê 10/12/1990 19870730 8 1/1/2007 -0.024968897 ∑øµÿ≤˙ …Ó€⁄
000006.SZ Ê∑±Êå؉∏öA 27/4/1992 19850525 8 1/1/2007 0.074647402 ∑øµÿ≤˙ …Ó€⁄
000007.SZ ÂÖ®Êñ∞•Ω,13/4/1992 19830311 NA 8 1/1/2007 NA ∑øµÿ≤˙ …Ó€⁄
000008.SZ Á•ûÂ∑ûÈ´òÈìÅ 7/5/1992 19891011 8 1/1/2007 -0.010574387 ◊€∫œ …Ó€⁄
000009.SZ ‰∏≠ÂõΩÂÆùÂÆâ 25/6/1991 19830706 8 1/1/2007 0.009576133 ∑øµÿ≤˙ …Ó€⁄

Upvotes: 1

Views: 3178

Answers (1)

M.Bonjour

Reputation: 1152

import pandas as pd

data06_16 = pd.read_csv("yourfile.csv", encoding="GBK")

Try setting the encoding to GBK; it worked well for me, as in the screenshot.

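If it is not clear which encoding the file uses, one option is to try a few candidates in turn and keep the first one that decodes without error. The following is only a minimal sketch using the standard library: the helper names `detect_encoding` and `read_rows`, the candidate list, and the sample bytes are all hypothetical and not from the original post. UTF-8 is tried first because it is strict, whereas GBK may accept bytes from another encoding and silently produce the wrong characters, so a decode that raises no error still needs a visual check.

```python
import csv
import io

# Hypothetical candidate list: UTF-8 first (strict), then GBK, then
# GB18030, which broadly extends GBK with more characters.
CANDIDATE_ENCODINGS = ("utf-8", "gbk", "gb18030")


def detect_encoding(raw, candidates=CANDIDATE_ENCODINGS):
    """Return the first candidate that decodes `raw` without error, else None."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


def read_rows(raw, encoding):
    """Decode the raw bytes and parse them as comma-separated rows."""
    text = raw.decode(encoding)
    return list(csv.reader(io.StringIO(text)))


# Made-up sample standing in for the real file contents, encoded as GBK.
sample = "CODE,NAME\n000001.SZ,平安银行\n".encode("gbk")

encoding = detect_encoding(sample)
rows = read_rows(sample, encoding)
print(encoding, rows)
```

With a real file, the bytes could be read with `open("yourfile.csv", "rb").read()`, and the detected name passed as the `encoding` argument of `pd.read_csv`.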

Upvotes: 1
