Reputation: 458
I am trying to convert a column from string to float. The df column looks something like the one below. Some rows contain no numbers, only a single space ' '.
col
'1.1, 1.0006'
' '
I am trying to round each number to the third decimal place. The output should look something like this.
col
'1.100, 1.001'
' '
My thinking:
df['col'] = df['col'].astype(float)
df['col'] = df['col'].round(3)
Upvotes: 1
Views: 273
Reputation: 2472
Well you can try this:
df['col'] = df['col'].apply(lambda x: x.split(', '))
def string_to_float(values):
    # Convert each string to a float rounded to 3 decimal places
    x = []
    for each in values:
        x.append(round(float(each), 3))
    return x
df['col'] = df['col'].apply(string_to_float)
UPDATE: The following version strips the quotes and spaces and skips empty entries, so blank rows no longer raise an error:
df['col'] = df['col'].apply(lambda x: x.replace("'", "").replace(" ", "").split(','))
def string_to_float(values):
    x = []
    for each in values:
        # Skip the empty entries left behind by blank rows
        if each != '':
            x.append(str(round(float(each), 3)))
    return ','.join(x)
df['col'] = df['col'].apply(string_to_float)
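Note that str(round(1.1, 3)) gives '1.1', not '1.100', and the code above turns a blank row into an empty string. Here is a minimal sketch of one way to get zero-padded output, assuming rows without numbers should be returned unchanged. The helper name round_numbers_in_string is hypothetical, not part of the answer above.

```python
def round_numbers_in_string(cell, ndigits=3):
    """Round every comma-separated number in `cell`, padding to `ndigits` decimals."""
    # Drop surrounding quote characters, split on commas, trim whitespace
    parts = [p.strip() for p in cell.strip("'").split(',')]
    numbers = [p for p in parts if p]  # discard empty pieces
    if not numbers:
        return cell  # assumption: rows without numbers are returned unchanged
    # The format spec both rounds and pads with trailing zeros
    return ', '.join(f'{float(p):.{ndigits}f}' for p in numbers)

print(round_numbers_in_string('1.1, 1.0006'))  # 1.100, 1.001
print(repr(round_numbers_in_string(' ')))      # ' '
```

With the DataFrame from the question this could be applied as df['col'].apply(round_numbers_in_string).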
Upvotes: 1
Reputation: 18647
Try:
import pandas as pd
def fix_string(string):
    # Non-numeric parts (such as a blank row) become NaN with errors='coerce'
    numbers = pd.to_numeric(string.split(','), errors='coerce').round(3)
    return numbers
df['col'] = df['col'].apply(fix_string)
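To illustrate what the coercion step means for blank rows, here is a pure-Python sketch that mimics it without pandas. The helpers to_float_or_nan and round_parts are hypothetical names used only for illustration. They emulate the idea behind errors='coerce' and are not the pandas implementation.

```python
import math

def to_float_or_nan(text):
    """Return float(text), or NaN when text is not a valid number."""
    try:
        return float(text)  # float() ignores surrounding whitespace
    except ValueError:
        return math.nan

def round_parts(cell, ndigits=3):
    """Split on commas and round each part, turning non-numbers into NaN."""
    return [round(to_float_or_nan(p), ndigits) for p in cell.split(',')]

print(round_parts('1.1, 1.0006'))  # [1.1, 1.001]
print(round_parts(' '))            # [nan]
```

A blank row therefore ends up as NaN rather than as the original ' ', and the numbers are floats, so trailing zeros such as 1.100 are not kept.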
Upvotes: 1
Reputation: 862731
I think you need:
import pandas as pd
df = pd.DataFrame({'col':["'1.1, 1.0006'", "' '"]})
print(df)
def func(x):
    out = []
    # strip the quotes and split the values, try to convert each one to float; on error, keep the original value
    for y in x.strip("'").split(', '):
        try:
            out.append(round(float(y), 3))
        except ValueError:
            out.append(y)
    return out
df['new'] = df['col'].apply(func)
print(df)
             col           new
0  '1.1, 1.0006'  [1.1, 1.001]
1            ' '           [ ]
If you need strings instead of floats, use f-strings:
def func(x):
    out = []
    for y in x.strip("'").split(', '):
        try:
            # format with exactly 3 decimal places, keeping trailing zeros
            out.append(f'{round(float(y), 3):.3f}')
        except ValueError:
            out.append(y)
    return out
df['new'] = df['col'].apply(func)
print(df)
             col             new
0  '1.1, 1.0006'  [1.100, 1.001]
1            ' '             [ ]
And to get a single string per row, add join at the end:
def func(x):
    out = []
    for y in x.strip("'").split(', '):
        try:
            out.append(f'{round(float(y), 3):.3f}')
        except ValueError:
            out.append(y)
    # join the formatted values back into one string
    return ', '.join(out)
df['new'] = df['col'].apply(func)
print(df)
             col           new
0  '1.1, 1.0006'  1.100, 1.001
1            ' '
Upvotes: 2