Harshavardhan Ramanna

Reputation: 738

Error reading Excel file written on an Ubuntu machine in Windows (using Pandas)

I am using Pandas for data analysis. In one of my programs, I had to write an Excel sheet, which I did using to_excel(). This was executed on an Ubuntu machine.

Now I am trying to read that Excel file on a Windows machine, using pd.read_excel().

I am getting the following error:

IOError: [Errno 22] invalid mode ('rb') or filename: 'C:\\Users\\Harshavardhan R\\Downloads\\Kerala Energy Project\\Bottom-Up Modelling\threshold variation\\outputs\x07c1.xlsx'

The source is:

df = pd.read_excel('C:\Users\Harshavardhan R\Downloads\Kerala Energy Project\Bottom-Up Modelling\threshold variation\outputs\ac1.xlsx')

I can vouch that the file is present as I have opened it and checked.

Why is the file name changed to x07c1.xlsx in the error message? How would I avoid this problem?

Upvotes: 2

Views: 1398

Answers (1)

Andy

Reputation: 50570

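The error you are getting comes from how Python reads backslashes inside an ordinary string literal, not from the file having been written on Ubuntu. Windows separates path components with \ (Ubuntu uses /), and Python treats \ as the start of an escape sequence.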

Your path, as written, has escape sequences, specifically a \t (tab) and \a (BEL).

The second one is why you see \x07c1.xlsx: \x07 is how Python displays the ASCII BEL character. If you look closely at the path in the error message, you'll also notice the tab character. It shows up where there is only one backslash instead of two:

\\Bottom-Up Modelling\threshold variation\\
                     ^^ This is a tab, not a "t"
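To see the two escape sequences in isolation, here is a minimal sketch using a shortened, made-up path (dir\threshold\ac1.xlsx is only a stand-in for the real one). It compares the ordinary literal with a raw literal of the same text:

```python
# Hypothetical short path that contains the same two escapes as the question.
plain = 'dir\threshold\ac1.xlsx'    # \t becomes a tab, \a becomes BEL (0x07)
raw = r'dir\threshold\ac1.xlsx'     # raw literal: backslashes are kept as typed

# repr() shows the string the way the error message does.
print(repr(plain))
print(repr(raw))

# Each escape collapsed two typed characters into one, so the plain
# string is two characters shorter than the raw one.
print(len(raw) - len(plain))
```

The repr of the plain string shows \t with a single backslash and \x07 in place of \a, which is the same pattern as in the IOError message.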

You can fix this in one of a few different ways. The easiest is to make the path a raw string. Do this by putting an r in front of your string:

df = pd.read_excel(r'C:\Users\Harshavardhan R\Downloads\Kerala Energy Project\Bottom-Up Modelling\threshold variation\outputs\ac1.xlsx')

This tells Python to take the backslashes literally instead of treating them as escape sequences. One note: a raw string cannot end with a single backslash.
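The trailing-backslash caveat can be checked without breaking your own script by compiling a snippet at runtime. This is only a sketch; the folder name C:\outputs is made up for illustration:

```python
# The source text below is:  p = r'C:\outputs\'
# The final backslash escapes the closing quote, so the literal never ends.
bad_source = "p = r'C:\\outputs\\'"

try:
    compile(bad_source, '<demo>', 'exec')
    trailing_ok = True
except SyntaxError:
    trailing_ok = False

print(trailing_ok)

# One workaround: append the final backslash as a separate, escaped literal.
folder = r'C:\outputs' + '\\'
print(folder)
```

In practice this only matters for directory paths written with a trailing separator; a path that ends in a file name, like the one in the question, is unaffected.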

Another option is to escape your backslashes by making each backslash into two:

df = pd.read_excel('C:\\Users\\Harshavardhan R\\Downloads\\Kerala Energy Project\\Bottom-Up Modelling\\threshold variation\\outputs\\ac1.xlsx')

This works, but you have to remember to double every backslash each time you write a path.

A final option is to use forward slashes. Windows handles forward slashes just fine; they're just not the standard convention of the operating system:

df = pd.read_excel('C:/Users/Harshavardhan R/Downloads/Kerala Energy Project/Bottom-Up Modelling/threshold variation/outputs/ac1.xlsx')
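As a quick check that the three spellings name the same file, here is a small sketch with a shortened, hypothetical path. It uses the standard library's ntpath module (the Windows flavour of os.path, importable on any platform) to normalise the forward-slash version:

```python
import ntpath

# Hypothetical short path standing in for the real one.
raw = r'C:\data\threshold\ac1.xlsx'
escaped = 'C:\\data\\threshold\\ac1.xlsx'
forward = 'C:/data/threshold/ac1.xlsx'

# The raw and escaped literals produce the identical string.
print(raw == escaped)

# Normalising with Windows rules turns / into \, giving the same path.
print(ntpath.normpath(forward) == raw)
```

The forward-slash string itself is different text, but Windows resolves it to the same file, so any of the three can be passed to pd.read_excel().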

Personally, I'd go with the raw string. It's a single character and the path doesn't "look weird" to you or other developers.

Upvotes: 3
