LeoGER

Reputation: 357

Check if column consists of numbers in string data type

Below is a script for a simplified version of the df in question:

import pandas as pd

df = pd.DataFrame({
    'feature': ['cd_player', 'sat_nav', 'sub_woofer', 'usb_port', 'cd_player', 'sat_nav', 'sub_woofer', 'usb_port', 'cd_player', 'sat_nav', 'sub_woofer', 'usb_port'],
    'feature_value': ['1', '1', '0', '4', '1', '0', '0', '1', '1', '1', '1', '0'],
    'feature_colour': ['red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet', 'red', 'orange', 'yellow', 'green', 'blue']
})
df
    feature     feature_value   feature_colour
0   cd_player   1               red
1   sat_nav     1               orange
2   sub_woofer  0               yellow
3   usb_port    4               green
4   cd_player   1               blue
5   sat_nav     0               indigo
6   sub_woofer  0               violet
7   usb_port    1               red
8   cd_player   1               orange
9   sat_nav     1               yellow
10  sub_woofer  1               green
11  usb_port    0               blue

df.dtypes

feature           object
feature_value     object
feature_colour    object
dtype: object

I want to find all columns that contain only numerical values and convert their dtypes to integers and/or floats. In this example that is easy to do manually, but the actual DataFrame has ~50 columns that may hold numerical values, and because they all have object dtype, checking each one by hand would be rather inefficient.

INTENDED OUTPUT:

df.dtypes

feature           object
feature_value      int64
feature_colour    object
dtype: object

Any help would be greatly appreciated.

Upvotes: 1

Views: 695

Answers (1)

Reza

Reputation: 2025

Try this:

df = df.apply(lambda x: pd.to_numeric(x, errors='ignore'))
df.dtypes
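
Note that errors='ignore' has been deprecated in newer pandas releases (since 2.2, to my knowledge), so the one-liner above may warn or fail there. Here is a minimal sketch of an equivalent approach, assuming a column should only be converted when every value in it parses as a number: call pd.to_numeric with its default errors='raise' and leave the column alone if it raises. The helper name convert_numeric_columns is my own and not part of pandas, and the frame is a shortened version of the one in the question.

```python
import pandas as pd


def convert_numeric_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which every fully numeric column is converted.

    Hypothetical helper for illustration; not a pandas function.
    """
    result = frame.copy()
    for name in result.columns:
        try:
            # With the default errors='raise', to_numeric raises as soon as
            # it meets a value it cannot parse, so mixed or text columns
            # fall through to the except branch and stay unchanged.
            result[name] = pd.to_numeric(result[name])
        except (ValueError, TypeError):
            pass
    return result


# Shortened version of the question's DataFrame.
df = pd.DataFrame({
    'feature': ['cd_player', 'sat_nav', 'sub_woofer', 'usb_port'],
    'feature_value': ['1', '1', '0', '4'],
    'feature_colour': ['red', 'orange', 'yellow', 'green'],
})

converted = convert_numeric_columns(df)
print(converted.dtypes)
```

With this, feature_value should come out as int64 while the two text columns keep their original dtype. If you would rather turn unparseable entries into NaN instead of skipping the column, errors='coerce' does that, though the column then typically ends up as float64.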

Upvotes: 4
