KevOMalley743

Reputation: 581

UnicodeEncodeError causes traceback but still some output (Pandas list to .txt file)

I have created a list of lists containing both the column numbers and column names for a pandas dataframe.


print(example_list)

[[0, 'name'], [1, 'gender'], ..., [85, 'HEXACO_12'], [86, 'general self efficacy']]

To make things easier for the people I'm working with, I have written this list to a .txt file so they can quickly look up a variable name and its position in the dataframe.

# Build [position, column name] pairs for every column in the dataframe
example_list = [[i, col] for i, col in enumerate(df2.columns)]

with open('file_positions.txt', 'w') as f_positions:
    for i in example_list:
        f_positions.write(f'{i}\n')

When this runs, I get the .txt file in the right place and it is complete, but it throws the following error:

---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
<ipython-input-29-acec172b19f5> in <module>
     10 with open('file_positions.txt', 'w') as f_positions:
     11     for i in example_list:
---> 12         f_positions.write(f'{i}\n')

myfile.py in encode(self, input, final)
     17 class IncrementalEncoder(codecs.IncrementalEncoder):
     18     def encode(self, input, final=False):
---> 19         return codecs.charmap_encode(input,self.errors,encoding_table)[0]
     20 
     21 class IncrementalDecoder(codecs.IncrementalDecoder):

UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 85: character maps to <undefined>

I'm confused because I have checked the file, and rather than finishing with [86, 'general self efficacy'] it finishes with [55, 'When I see an opportunity for something I like, I get excited right away']. However, the traceback says the problem is with a character in position 85.

I have tried removing any strange characters from the column names, both manually and programmatically, and I'm still getting the same error.

Can anyone help me figure out what's going on?

Thank you.

Upvotes: 0

Views: 32

Answers (1)

MarceloBaliu

Reputation: 230

The problem is probably in the next item ([56, '....']), because the item being written when the error occurs never reaches the file. If an item appears in the file, it is not the one causing the error.

Can you add that line to the question?
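In the meantime, here is a minimal sketch of how you could find the offending item and avoid the error. Note that "position 85" in the traceback is the character offset inside the string being written, not a list index. The sketch assumes the platform default codec is cp1252 (a guess based on the 'charmap' message), and find_unencodable, the sample example_list, and the temp-file path are hypothetical stand-ins, not your actual data.

```python
import os
import tempfile

# Hypothetical stand-in for the real list; the last item contains U+FFFD,
# the character named in the traceback.
example_list = [[0, 'name'], [1, 'gender'], [2, 'bad \ufffd column']]


def find_unencodable(items, encoding='cp1252'):
    """Return (list index, character position) for each line that cannot be encoded.

    The default encoding is an assumption about the platform codec.
    """
    problems = []
    for index, item in enumerate(items):
        line = f'{item}\n'
        try:
            line.encode(encoding)
        except UnicodeEncodeError as err:
            # err.start is the offset of the first bad character in `line`
            problems.append((index, err.start))
    return problems


print(find_unencodable(example_list))

# Passing an explicit UTF-8 encoding avoids depending on the platform default.
out_path = os.path.join(tempfile.gettempdir(), 'file_positions.txt')
with open(out_path, 'w', encoding='utf-8') as f_positions:
    for item in example_list:
        f_positions.write(f'{item}\n')
```

If the check reports an item, that column name contains a character the default codec cannot represent. Opening the file with encoding='utf-8' should let the write finish, although anyone reading the file then needs to open it as UTF-8 too.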

Upvotes: 1
