Reputation: 135
Due to reasons that are not important here, I have an excel sheet where my numbers have commas for decimal points as well as thousands separators. The only way to differentiate if the comma is supposed to be a decimal or a thousands separator, is the number of digits after the comma:
decimal = 7,29
thousands = 23,767,209
My approach, which works fine for my case, is to read in the numbers as strings and execute the following code:
strings = ["0", "87,390,112", "78", "9,27", "1"]
strings_new = []
for i in strings:
if len(i) >= 3:
# check if third last char is a comma --> decimal
if i[-3] == ',':
i = i[:-3] + '.' + i[-3 + 1:]
if len(i) >= 4:
# check if fourth last char is a comma --> all commas are thousands
if i[-4] == ',':
i = i.replace(",", "")
strings_new.append(i)
strings_new = [float(i) for i in strings_new]
And the output looks fine:
strings_new = [0.0, 87390112.0, 78.0, 9.27, 1.0]
Are there any special cases this code is missing that I haven't thought of? And is there maybe a more efficient way to deal with this problem?
Upvotes: 0
Views: 1643
Reputation: 909
Two cases:
If the decimal is only one digit: 100,1
. Use i.rfind(',')
to get the comma position instead of hard-coding it with -3
.
If it contains both conditions: 10,000,24
. It will slip through the second if-clause (10,000.24
). Always running i.replace(",", "")
without the if-clause should be fine.
Upvotes: 1