Reputation: 1
How do I convert the data in a list to the correct type, i.e. int if it's a whole number, float if it's not a whole number, and bool if it is 'True' or 'False'?
from typing import List

def clean_data(data: List[list]) -> None:
    """Convert each string in data to an int if and only if it represents a
    whole number, a float if and only if it represents a number that is not a
    whole number, True if and only if it is 'True', False if and only if it is
    'False', and None if and only if it is either 'null' or the empty string.

    >>> d = [['abc', '123', '45.6', 'True', 'False']]
    >>> clean_data(d)
    >>> d
    [['abc', 123, 45.6, True, False]]
    """
Upvotes: 0
Views: 1024
Reputation: 2645
Try this:
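import ast

def clean_data(l):
    # Build and return a new list of rows; the input list is not modified.
    l1 = []
    for l2 in l:
        l3 = []
        for e in l2:
            try:
                l3.append(ast.literal_eval(e))
            except (ValueError, SyntaxError):
                # Not a Python literal (literal_eval raises SyntaxError for
                # inputs such as '' or 'a b'): keep the original string.
                l3.append(e)
        l1.append(l3)
    return l1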
Upvotes: 0
Reputation: 1744
Try out the ast module from the standard library:
import ast

def clean_data(xs):
    clean_xs = list()
    for x in xs:
        try:
            converted_x = ast.literal_eval(x)
        except (ValueError, SyntaxError):
            # Not a Python literal: keep the original string.
            converted_x = x
        clean_xs.append(converted_x)
    return clean_xs
This gives you
>>> clean_data(["1", "a", "True"])
[1, 'a', True]
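Since the question asks for an in-place conversion of a list of lists, here is a minimal sketch of how the same ast.literal_eval idea could be adapted to that signature. The name clean_data_in_place is hypothetical, and the choices to map 'null' and '' to None and to accept only bool, int and float results are assumptions taken from the question's docstring. Note that a string such as '45.0' still becomes a float here.

```python
import ast
from typing import List


def clean_data_in_place(data: List[list]) -> None:
    """Convert the strings in each row of data in place (hypothetical helper)."""
    for row in data:
        for i, text in enumerate(row):
            if text in ('null', ''):
                row[i] = None
                continue
            try:
                value = ast.literal_eval(text)
            except (ValueError, SyntaxError):
                continue  # not a Python literal: keep the original string
            # Accept only the types the question asks for; anything else
            # (lists, tuples, complex numbers, ...) stays a string.
            if isinstance(value, (bool, int, float)):
                row[i] = value


d = [['abc', '123', '45.6', 'True', 'False', 'null', '']]
clean_data_in_place(d)
print(d)  # [['abc', 123, 45.6, True, False, None, None]]
```

The isinstance check is there because literal_eval also accepts literals such as '[1, 2]' or '(1, 2)', which the question does not want converted.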
Upvotes: 1
Reputation: 41872
You could try a simplistic approach if that solves your problem:
def clean_data(data):
    return [item == 'True' if item in ['True', 'False'] else
            int(item) if item.isdigit() else
            None if item in ['null', ''] else
            item if item.isalpha() else
            float(item) for item in data]
print(clean_data(['abc', '123', '45.6', 'True', 'False']))
OUTPUT
> python3 test.py
['abc', 123, 45.6, True, False]
>
Realistically, if you need something robust and extensible, I'd define a "recognizer" function for each type except the default 'str'. Each one either returns a converted result or the original string (or raises an error). I would make a list of these functions, ordered from most specific to least specific (e.g. a boolean recognizer is very specific). Then loop over the input, trying each recognizer in turn until one claims the input, use its value as the result, and go on to the next input. If no recognizer claims the input, keep it as-is.
This way, if you have something new to convert, you simply define a new function that recognizes and converts it, and add it to the recognizer list in an appropriate position.
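A rough sketch of what that recognizer approach might look like. All names here (NOT_MINE, recognize_bool, RECOGNIZERS, convert, and so on) are made up for illustration. Using a sentinel to signal "not claimed", rather than returning the original string, is one possible choice; it lets a recognizer return None as a legitimate converted value.

```python
from typing import Any, Callable, List

# Sentinel meaning "this recognizer does not claim the input" (hypothetical).
NOT_MINE = object()


def recognize_bool(text: str) -> Any:
    return {'True': True, 'False': False}.get(text, NOT_MINE)


def recognize_none(text: str) -> Any:
    return None if text in ('null', '') else NOT_MINE


def recognize_int(text: str) -> Any:
    try:
        return int(text)
    except ValueError:
        return NOT_MINE


def recognize_float(text: str) -> Any:
    # Caveat: float() also accepts strings such as 'nan' and 'inf'.
    try:
        return float(text)
    except ValueError:
        return NOT_MINE


# Ordered from most specific to least specific.
RECOGNIZERS: List[Callable[[str], Any]] = [
    recognize_bool, recognize_none, recognize_int, recognize_float,
]


def convert(text: str) -> Any:
    """Return the value from the first recognizer that claims text."""
    for recognizer in RECOGNIZERS:
        result = recognizer(text)
        if result is not NOT_MINE:
            return result
    return text  # no recognizer claimed it: keep it as-is


def clean_data(data):
    return [convert(item) for item in data]


print(clean_data(['abc', '123', '45.6', 'True', 'False', 'null']))
# ['abc', 123, 45.6, True, False, None]
```

Supporting a new type, say dates, would then mean writing one more recognize_* function and inserting it into RECOGNIZERS at a suitable position.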
Upvotes: 1