Reputation: 105
I have a excel file with 119 sheets that I would like to get the data to plot multiple graphs. The problem is that the numeric values have comma as a decimal separator. And I have read that, differently from read_csv, the read_excel function in Pandas does not have this option.
I intend to load specific columns from some of selected sheets and build a merged graph using a for loop.
One of the alternatives that I saw (but it is kind of an overkill) is to transform each of the sheets into a specific csv file and load them and use the option of comma as decimal separator). Is there any alternative that I can load the data correctly without having to recourse to this alternative?
Thanks!
Upvotes: 4
Views: 11837
Reputation: 3926
What you are looking for is decimal
.
df = pd.read_excel('file.xlsx', decimal=',')
Upvotes: 0
Reputation: 24314
Well unlike read_csv()
method read_excel()
method doesn't support decimal
parameter:
But after loading your dataset you can use:
df = pd.read_excel('yourexcel.xlsx')
#loading dataset
cols=#your list of column that you want to convert
df[cols]=df[cols].replace(',','.',regex=True).astype(float)
OR
Other way is to create a function and use converters
parameter in read_excel()
method:
def typecast_float(value):
try:
return float(value.replace(',', '.'))
except:
return value
#Finally:
df=pd.read_excel("sample.xlsx", converters={'column_name': typecast_float})
Upvotes: 2
Reputation: 71580
Just modify the thousands separator:
df = pd.read_excel('yourexcel.xlsx', thousands=',')
Upvotes: 0