Reputation: 21
I am trying to read excel file using pandas pd.read_excel and data in cells contains text values as formula errors. My requirement is to read the formula errors as text and not replace it with NA
Example: #NA, #DIV/0! #Value
While reading this data using pd.read_excel all the data is converted to NA instead of reading the data as is.
I did try using openpyxl but for my overall requirement, pandas read excel is apt because I get excel files with different formats.
import pandas as pd
xlsx = pd.ExcelFile('Book1.xlsx')
df = pd.read_excel(xlsx, sheet_name ='Sheet1' ,engine_kwargs={"values_only":True})
1,0, #DIV/0!(Formula Error), #DIV/0! (Text)
LHS RHS Formula Ftext 0 1 0 NaN NaN
Upvotes: 1
Views: 137
Reputation: 261830
#NA
should be interpreted as NaN, but not #DIV/0!
/#Value
If you don't want read_excel
to parse NaNs you have several options:
pd.read_excel('your_file.xlsx', na_filter=False)
Which will keep all strings intact (including empty strings).
Or a combination of keep_default_na
and na_values
. For example to ignore everything as NaN except empty strings:
pd.read_excel('your_file.xlsx', keep_default_na=False, na_values=[''])
Upvotes: 0