Abhishek Jain

Reputation: 65

How to read special characters in the text file using pandas.to_excel()?

I have a huge text file that I want to export to Excel after first loading it into a DataFrame in Python and doing some operations on it.

The file contains some special characters in one of the headers, which is why I am not able to export the header line from the DataFrame to Excel. It looks something like this:

{"ÿþ""DOEClientID""",DOEClient,ChgClientID,ChgClient,ChgSystemID,ChgSystem}

I am able to export the data when I use the {header = False} property, but I get an error when I set the header property to True.

Please help me out with this. I have searched a lot but have not been able to find a solution. I need those headers in the file.

Code:

    def files(file_name, outfile_name):
        data_initial = open(path + file_name, "rU")
        data1 = csv.reader((line.replace('\0', '') for line in data_initial), delimiter=",")

        reader = csv.reader(open(path + file_name, 'rU'))
        writer = csv.writer(open(path + outfile_name, 'wb'), dialect='excel')
        for row in data1:
            writer.writerow(row)

        df = pd.DataFrame(pd.read_csv(path + outfile_name, sep=',', engine='python'))

        final_frame = df.dropna(how='all')

        file_list = list(uniq(list(final_frame['DOEClient'])))

        return file_list, final_frame

Upvotes: 0

Views: 5206

Answers (1)

EdChum

Reputation: 394159

The problem with your input file is that it starts with a UTF-16 little-endian BOM. That is why you see the funny characters: ÿþ is the byte sequence 0xFF 0xFE displayed using ISO-8859-1.

So you just need to pass the param encoding='utf-16' in order to read the file correctly:

df = pd.read_csv(path_to_csv, encoding='utf-16')
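
To see why the header came out garbled, here is a minimal sketch using only the standard library. The sample text and the variable names are made up for illustration; they are not taken from the asker's file. It builds a small UTF-16 LE payload with a BOM, decodes it the wrong way (ISO-8859-1) and the right way (UTF-16), and parses the result with `csv`. Passing `encoding='utf-16'` to `pd.read_csv` should perform the same decoding step for you.

```python
import codecs
import csv
import io

# Hypothetical sample: a header row and one data row, encoded as UTF-16 LE with a BOM.
text = "DOEClientID,DOEClient,ChgClientID\r\n1,Acme,7\r\n"
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Decoding the bytes as ISO-8859-1 turns the BOM into "ÿþ" and leaves a
# NUL character after every ASCII character.
wrong = raw.decode("iso-8859-1")
print(repr(wrong[:6]))

# Decoding as UTF-16 consumes the BOM and yields the intended text.
right = raw.decode("utf-16")
rows = list(csv.reader(io.StringIO(right, newline="")))
print(rows[0])
```

This also suggests why stripping '\0' from each line, as in the question, is not enough: it removes the NUL bytes but leaves the two BOM bytes at the start of the header.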

Upvotes: 2
