Unexpected character when writting to Excel using Pandas

Question

I have a dictionary like this:

film = {
    'ID': [],
    'Name': [],
    'Run Time': [],
    'Genre': [],
    'link': [],
    'name 2': []
}

Then I populate it in a for loop, like this:

film['ID'].append(film_id)
film['Name'].append(film_name)
film['Run Time'].append(film_runtime)
film['Genre'].append(film_genre)
film['link'].append(film_link)
film['name 2'].append(film_name2)

Then I convert the dictionary to a Pandas DataFrame, so that I can write it to an .xlsx file. Now before I actually write it, I print it to check the values of Run Time column. And everything is OK:

output_df = pd.DataFrame(film).set_index('ID')
print(output_df['Run Time'])

output:
ID
102    131
103     60
104       
105       
Name: Run Time, dtype: object

But then, when I write it, like this:

writer = ExcelWriter('output.xlsx')
output_df.to_excel(writer, 'فیلم')
writer.save()

The file looks like this:

As you can see, there's an extra ' (single quote) character in the file. This character is not visible. But I can highlight it:

And if I remove it, the number goes RTL:

So I thought the invisible character was LTR MARK (\u200E). I removed it like this:

film['Run Time'].append(film_runtime.replace('\u200E', ''))

But nothing happened, and the character is still there.

How can I fix this?

quamrana · Accepted Answer

You need to make sure that cells that need to be numbers are converted to numbers (typically ints) before converting to an .xlsx file.

In your case just:

film['Run Time'].append(int(film_runtime))

Unexpected character when writting to Excel using Pandas

Answers (2)

Related Questions