Reputation: 39
Here are the column types of my original dataframe:
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 NAME 23605 non-null object
1 DEPARTMENT_NAME 23605 non-null object
2 TITLE 23605 non-null object
3 REGULAR 21939 non-null object
4 RETRO 13643 non-null object
5 OTHER 13351 non-null object
6 OVERTIME 6826 non-null object
7 INJURED 1312 non-null object
8 DETAIL 2355 non-null object
9 QUINN/EDUCATION INCENTIVE 1351 non-null object
10 TOTAL EARNINGS 23605 non-null object
11 POSTAL 23605 non-null object
I want to convert some of them to a numeric type, for example TOTAL EARNINGS. I tried:
df['TOTAL EARNINGS'] = df['TOTAL EARNINGS'].astype(int)
and
df['TOTAL EARNINGS'] = pd.to_numeric(df['TOTAL EARNINGS'])
But I got:
ValueError: setting an array element with a sequence.
or
TypeError: Invalid object type at position 0
I don't know why. Is there another way to do this? Here is my data: https://data.boston.gov/dataset/418983dc-7cae-42bb-88e4-d56f5adcf869/resource/31358fd1-849a-48e0-8285-e813f6efbdf1/download/employeeearningscy18full.csv
Upvotes: 1
Views: 602
Reputation: 4475
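This happens because your original data has 2 rows that consist entirely of text: in those rows the TOTAL EARNINGS cell holds the string "TOTAL EARNINGS" rather than a number.
First, run the command below to remove those rows: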
df = df[df["TOTAL EARNINGS"]!="TOTAL EARNINGS"]
Then change the datatype:
df['TOTAL EARNINGS'] = df['TOTAL EARNINGS'].astype(float)
You can then check the datatypes with:
df.dtypes
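The steps above can be tried end to end on a small example. This is only a sketch: the sample rows are made up, and the thousands separator in one value is an assumption about how the real file might be formatted, not something confirmed from the dataset. It also swaps astype(float) for pd.to_numeric with errors="coerce", so any remaining unparseable value becomes NaN instead of raising an error.

```python
import pandas as pd

# Hypothetical sample data: one row repeats the header text.
df = pd.DataFrame({
    "NAME": ["A", "NAME", "B", "C"],
    "TOTAL EARNINGS": ["1000.50", "TOTAL EARNINGS", "2,500.00", "300"],
})

# Drop rows that merely repeat the header; .copy() avoids
# SettingWithCopyWarning when the column is reassigned below.
df = df[df["TOTAL EARNINGS"] != "TOTAL EARNINGS"].copy()

# Strip thousands separators (assumed formatting), then convert.
# errors="coerce" turns anything still unparseable into NaN.
cleaned = df["TOTAL EARNINGS"].str.replace(",", "", regex=False)
df["TOTAL EARNINGS"] = pd.to_numeric(cleaned, errors="coerce")

print(df.dtypes)
```

If NaN values show up after the conversion, df[df["TOTAL EARNINGS"].isna()] lists the rows that still need cleaning.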
Upvotes: 1