Reputation: 3632
For the dataframe below, I am using this code:
df['%'] = ((df['Code Lines'] / df['Code Lines'].sum()) * 100).round(2).astype(str) + '%'
Output:
Language      # of Files  Blank Lines  Comment Lines  Code Lines       %
C++                   15           66             35         354   6.13%
C/C++ Header           1            3              7           4   0.07%
Markdown               6           73              0         142   2.46%
Python               110         1998           2086        4982  86.27%
Tcl/Tk                 1           14             18         273   4.73%
YAML                   1            0              6          20   0.35%
I am trying to convert the string column to float:
df['%'] = df['%'].astype('float64')
I get this error:
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/dtypes/cast.py", line 730, in astype_nansafe
    return arr.astype(dtype, copy=True)
ValueError: could not convert string to float: '0.35%'
Is there a way to keep the % column as float while still showing the % sign?
Upvotes: 3
Views: 10262
Reputation: 652
You can remove the last character of each string with str[:-1], as shown below:
df['%'] = df['%'].str[:-1].astype('float64')
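Or you can use str.replace() to replace % with an empty string. (Plain Series.replace() only matches whole cell values, so it would leave '0.35%' unchanged and astype would still fail.)
df['%'] = df['%'].str.replace('%', '').astype('float64')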
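To illustrate the difference between replacing whole values and replacing substrings, here is a minimal sketch. The series `s` is made-up sample data taken from the question's output, and `regex=False` assumes pandas 0.23 or newer:

```python
import pandas as pd

# Sample values copied from the question's output (illustrative only)
s = pd.Series(["6.13%", "0.07%", "0.35%"])

# Series.replace compares whole cell values, so nothing here equals "%"
whole_value = s.replace("%", "")

# Series.str.replace works on substrings; regex=False treats "%" literally
substring = s.str.replace("%", "", regex=False)
as_float = substring.astype("float64")

print(whole_value.tolist())  # still contains the % signs
print(as_float.tolist())
```

With the first form the strings still end in %, which is why astype('float64') raises the same ValueError as in the question.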
Upvotes: 0
Reputation: 51335
Another way, using strip:
df['%'] = df['%'].str.strip('%').astype('float64')
0 6.13
1 0.07
2 2.46
3 86.27
4 4.73
5 0.35
Name: %, dtype: float64
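One practical difference from slicing with str[:-1] is that stripping only removes % characters, so a value without the sign is left intact. A small sketch, assuming a hypothetical series where one entry has no % (this case does not occur in the question's data):

```python
import pandas as pd

# Hypothetical input: the second value is missing its % sign
s = pd.Series(["6.13%", "4.73"])

# Slicing drops the last character unconditionally
sliced = s.str[:-1].astype("float64")

# rstrip drops only trailing "%" characters
stripped = s.str.rstrip("%").astype("float64")

print(sliced.tolist())    # second value loses a digit
print(stripped.tolist())  # second value is preserved
```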
Upvotes: 1
Reputation: 862661
Use str[:-1] to remove the last character (%) by indexing with str:
df['%'] = df['%'].str[:-1].astype('float64')
But if possible, it is better not to append the % sign in the first place, so the column stays numeric:
df['%'] = ((df['Code Lines'] / df['Code Lines'].sum()) * 100).round(2)
print (df)
       Language  # of Files  Blank Lines  Comment Lines  Code Lines      %
0           C++          15           66             35         354   6.13
1  C/C++ Header           1            3              7           4   0.07
2      Markdown           6           73              0         142   2.46
3        Python         110         1998           2086        4982  86.27
4        Tcl/Tk           1           14             18         273   4.73
5          YAML           1            0              6          20   0.35
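If the % sign is only needed for display, one option is to keep the column as float and add the sign when rendering. This is a sketch, not the only way to do it: the dataframe is rebuilt from two of the question's columns, and the `formatters` argument of `DataFrame.to_string` is used to append the sign to the printed text only.

```python
import pandas as pd

# Subset of the question's data, rebuilt for illustration
df = pd.DataFrame({
    "Language": ["C++", "C/C++ Header", "Markdown", "Python", "Tcl/Tk", "YAML"],
    "Code Lines": [354, 4, 142, 4982, 273, 20],
})

# Keep the percentage numeric
df["%"] = (df["Code Lines"] / df["Code Lines"].sum() * 100).round(2)

# The formatter changes only the rendered text; the column stays float64
rendered = df.to_string(formatters={"%": "{:.2f}%".format})

print(rendered)
print(df["%"].dtype)
```

This way the column can still be sorted, summed, or compared as a number, and the sign appears only in the printed output.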
Upvotes: 7