Reputation: 67
I have a DataFrame with about 50,000 records, and I noticed that ".0" has been appended to every number in one column. I have been trying to remove the ".0", so that the table below:
N | Movies
1 | Save the Last Dance
2 | Love and Other Drugs
3 | Dance with Me
4 | Love Actually
5 | High School Musical
6 | 2012.0 <-----
7 | Iron Man
8 | 300.0 <-----
9 | Inception
10 | 360.0 <-----
11 | Pulp Fiction
will look like this:
N | Movies
1 | Save the Last Dance
2 | Love and Other Drugs
3 | Dance with Me
4 | Love Actually
5 | High School Musical
6 | 2012 <-----
7 | Iron Man
8 | 300 <-----
9 | Inception
10 | 360 <-----
11 | Pulp Fiction
The challenge is that the column contains both numbers and strings.
Is this possible? If so, how?
Thanks in advance.
Upvotes: 4
Views: 9396
Reputation: 604
Python 2.7.2+ (default, Jul 20 2012, 22:15:08)
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> str1 = "300.0"
>>> str(int(float(str1)))
'300'
>>>
Upvotes: 0
Reputation: 393963
Use a function and apply it to the whole column:
In [94]:
df = pd.DataFrame({'Movies':['Save the last dance', '2012.0']})
df
Out[94]:
Movies
0 Save the last dance
1 2012.0
[2 rows x 1 columns]
In [95]:
def trim_fraction(text):
    # Strip only a trailing '.0'; values such as '10.05' are left untouched
    if text.endswith('.0'):
        return text[:-2]
    return text

df.Movies = df.Movies.apply(trim_fraction)
In [96]:
df
Out[96]:
Movies
0 Save the last dance
1 2012
[2 rows x 1 columns]
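A slightly stricter variant, as a sketch only: the function below trims the suffix only when the entire value is an integer followed by ".0", so a title that merely ends in ".0" (the made-up example "Version 2.0") is left alone. The name `trim_integer_suffix` and the pattern are assumptions for illustration, not part of the answer above; the function could be passed to `df.Movies.apply` in the same way as `trim_fraction`.

```python
import re

# Optional sign, digits, then a literal '.0' (one or more zeros), nothing else
INTEGER_FLOAT = re.compile(r'(-?\d+)\.0+')

def trim_integer_suffix(text):
    """Return text without its '.0' suffix if it is a whole-number string."""
    match = INTEGER_FLOAT.fullmatch(text)
    if match:
        return match.group(1)
    return text

# Hypothetical sample values, including two that must not be changed
movies = ['Save the Last Dance', '2012.0', '300.0', 'Version 2.0', '10.05']
cleaned = [trim_integer_suffix(m) for m in movies]
print(cleaned)
```

The trade-off is that the pattern decides what counts as a number, so it may need adjusting if the column holds values in other formats.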
Upvotes: 4
Reputation: 8400
Here is a hint for you.
In the case of a valid number:
a = "2012.0"
try:
    a = float(a)
    a = int(a)
    print(a)
except ValueError:
    # Not a number: print the original string unchanged
    print(a)
Output:
2012
In the case of a string like "Dance with Me":
a = "Dance with Me"
try:
    a = float(a)
    a = int(a)
    print(a)
except ValueError:
    # Not a number: print the original string unchanged
    print(a)
Output:
Dance with Me
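The two cases above can be folded into one helper, sketched here under some assumptions: the name `to_int_string` is hypothetical, and the whole-number check is an addition so that a value such as "10.5" is not truncated to "10".

```python
def to_int_string(value):
    """Return '2012' for '2012.0'; return any other text unchanged."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        # Not parseable as a number, e.g. a movie title
        return value
    # Rewrite only whole numbers; this also skips inf and nan
    if number.is_integer():
        return str(int(number))
    return value

# Hypothetical sample values taken from the question's table, plus '10.5'
movies = ['Dance with Me', '2012.0', '300.0', '10.5', 'Inception']
cleaned = [to_int_string(m) for m in movies]
print(cleaned)
```

Note that anything `float()` accepts is converted, so a title such as "1e3" would become "1000"; if that matters, a stricter string check is the safer choice.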
Upvotes: 0