user5718928

Python Numpy can't convert Strings to Integers from CSV file

I read a CSV file and everything works except converting the values to integers, since all the values are read in as strings. I tried to convert the columns in a loop like this:

counter = 0
while counter < len(data):
    try:
        data[counter,0] = data[counter,0].astype(int) # ID
        data[counter,1] = data[counter,1].astype(int) # Survived
    except ValueError:
        pass
    counter = counter + 1

As you can see, this is the Titanic dataset I am trying to work with.

print (type(data[0,0]))

Printing the type of a value still gives me <class 'numpy.str_'>.

How do I properly convert the columns to integers? Thanks in advance!

Upvotes: 1

Views: 501

Answers (2)

Oliver Dain

Reputation: 9953

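The problem is that you are trying to change one item at a time without changing the dtype of data. data.dtype tells you the element type of the ndarray, and you can't change it one cell at a time: the entire ndarray has a single dtype, so an integer assigned into a string array is converted straight back to a string. Try this instead: data = data.astype(int). That converts all rows and all columns to integers at once, provided every value in the array can be parsed as an integer; otherwise it raises a ValueError.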
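A minimal sketch of the difference, in case it helps. The small array below is a made-up stand-in for the CSV contents (the real Titanic file has more columns), and the names ids and survived are only illustrative. Because the third column holds names, the sketch converts the two numeric columns separately rather than calling astype(int) on the whole array:

```python
import numpy as np

# Hypothetical stand-in for the loaded CSV: every value is a string.
data = np.array([["1", "0", "Braund"],
                 ["2", "1", "Cumings"],
                 ["3", "1", "Heikkinen"]])

# Assigning an int into a string array stores it as a string again,
# so the element type does not change.
data[0, 0] = int(data[0, 0])
element_type = type(data[0, 0])  # still numpy.str_

# Converting a whole column returns a new array with an integer dtype.
ids = data[:, 0].astype(int)       # ID column
survived = data[:, 1].astype(int)  # Survived column

print(element_type, ids.dtype, survived.dtype)
```

The converted columns live in their own arrays here; writing them back into data would turn them into strings again, since data keeps its string dtype.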

Upvotes: 0

user5718928

OK, I found out that pandas infers the data type of each column automatically when the file is read with the following code:

import pandas
data = pandas.read_csv("filename.csv")
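A small self-contained check of that behaviour, as a sketch. It reads made-up CSV text from memory instead of a file, and the column names PassengerId, Survived and Name are assumptions about the file's header:

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for the real file.
csv_text = (
    "PassengerId,Survived,Name\n"
    "1,0,Braund\n"
    "2,1,Cumings\n"
    "3,1,Heikkinen\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# Columns containing only integer-like text are inferred as integers;
# the Name column is left as text.
id_is_int = pd.api.types.is_integer_dtype(df["PassengerId"])
name_is_int = pd.api.types.is_integer_dtype(df["Name"])

# A column can be handed back to NumPy as an integer array.
ids = df["PassengerId"].to_numpy()

print(df.dtypes)
```

A numeric column that has missing values would normally be read as floating point rather than integer, so it is worth checking df.dtypes on the real file.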

Upvotes: 1
