Marx Babu

Reputation: 760

Deprecated convert_objects(convert_numeric=True) works, but the suggested alternative pd.to_numeric fails

The following code raises a deprecation warning, but it still gives the intended result. When I try to change the code as the warning suggests, it fails.

new_table = new_table.convert_objects(convert_numeric=True)
new_table=new_table.replace(np.nan,0)  # turn the NaN left by "-" into 0 for the calculations

Warning (from warnings module):
    new_table = new_table.convert_objects(convert_numeric=True)
FutureWarning: convert_objects is deprecated. Use the data-type specific converters pd.to_datetime, pd.to_timedelta and pd.to_numeric.

new_table is a pandas DataFrame that contains:

A    B   C  D  E
1    -   3  5  6
2    3   5  6  7
-    -   5  5  5
5    4   -  -  -
-    -   4  -  4
9    -   -  10 23

Because this DataFrame contains the string "-", later sum, difference, or multiplication logic raises errors unless the values are converted first. When I use the method below, I get an error:

new_table = pd.to_numeric(new_table)
#new_table=new_table.replace("-",0)
new_table=new_table.replace(np.nan,0)

Traceback (most recent call last):
  File line 107, in
    new_table = pd.to_numeric(new_table)
  File line 113, in to_numeric
    raise TypeError('arg must be a list, tuple, 1-d array, or Series')
TypeError: arg must be a list, tuple, 1-d array, or Series

What is the best way to handle this situation? The first row should serve as the index in str format, and the other rows should be numeric so that my arithmetic calculations are not affected.

Any help?

Upvotes: 3

Views: 1761

Answers (1)

jezrael

Reputation: 862751

If you need to replace all non-numeric values with NaN, use apply to call to_numeric on each column of the DataFrame with errors='coerce'. Then replace the NaNs with 0 using fillna, and finally cast all values to integers with astype:

new_table1 = new_table.apply(pd.to_numeric, errors='coerce').fillna(0).astype(int)
print (new_table1)
   A  B  C   D   E
0  1  0  3   5   6
1  2  3  5   6   7
2  0  0  5   5   5
3  5  4  0   0   0
4  0  0  4   0   4
5  9  0  0  10  23

print (new_table1.dtypes)
A    int32
B    int32
C    int32
D    int32
E    int32
dtype: object
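
For anyone who wants to reproduce this, here is a minimal, self-contained sketch. It assumes the values in the question's table are stored as strings (the question does not show how new_table was built, so the constructor below is my assumption). It also shows why the original call failed: pd.to_numeric accepts only 1-d input, so it has to be applied column by column.

```python
import pandas as pd

# Assumed reconstruction of the question's table; "-" marks a missing value.
new_table = pd.DataFrame({
    "A": ["1", "2", "-", "5", "-", "9"],
    "B": ["-", "3", "-", "4", "-", "-"],
    "C": ["3", "5", "5", "-", "4", "-"],
    "D": ["5", "6", "5", "-", "-", "10"],
    "E": ["6", "7", "5", "-", "4", "23"],
})

# pd.to_numeric works on a Series, so apply it to each column in turn.
# errors="coerce" turns anything unparseable (here "-") into NaN.
converted = new_table.apply(pd.to_numeric, errors="coerce")

# Fill the NaNs with 0 and cast to int so arithmetic behaves as expected.
new_table1 = converted.fillna(0).astype(int)

print(new_table1)
print(new_table1.sum())  # column totals now work
```

If the data can contain decimals, drop the final astype(int) and keep the float columns that to_numeric returns.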

Another solution, if all numeric values are non-negative integers, is to replace every value that contains a non-digit character with 0 and then cast with astype:

new_table2 = new_table.replace('\D+', 0, regex=True).astype(int)
print (new_table2)
   A  B  C   D   E
0  1  0  3   5   6
1  2  3  5   6   7
2  0  0  5   5   5
3  5  4  0   0   0
4  0  0  4   0   4
5  9  0  0  10  23

print (new_table2.dtypes)
A    int32
B    int32
C    int32
D    int32
E    int32
dtype: object
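
One caveat with the regex approach, sketched below on a hypothetical sample that is not part of the question's data: because the replacement 0 is not a string, a match anywhere in a cell replaces the whole cell, so decimals and negative numbers are zeroed as well. As far as I can tell, to_numeric does not have this problem, since it parses each value.

```python
import pandas as pd

# Hypothetical sample with a decimal and a negative value.
# dtype=object keeps the raw Python strings.
sample = pd.DataFrame({"A": ["1.5", "-", "-3", "7"]}, dtype=object)

# The "." in "1.5" and the "-" in "-3" match \D+, so those cells become 0.
via_regex = sample.replace(r"\D+", 0, regex=True).astype(float)

# to_numeric parses each cell; only the bare "-" is coerced to NaN.
via_numeric = sample.apply(pd.to_numeric, errors="coerce").fillna(0)

print(via_regex["A"].tolist())
print(via_numeric["A"].tolist())
```

So the regex version is only safe when every numeric cell is a plain non-negative integer, as in the question's table.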

And if the only non-numeric value is -, the solution can be simplified:

new_table3 = new_table.replace('-', 0, regex=True).astype(int)
print (new_table3)
   A  B  C   D   E
0  1  0  3   5   6
1  2  3  5   6   7
2  0  0  5   5   5
3  5  4  0   0   0
4  0  0  4   0   4
5  9  0  0  10  23

print (new_table3.dtypes)
A    int32
B    int32
C    int32
D    int32
E    int32
dtype: object

Upvotes: 4
