qarabala

Reputation: 178

Ignore dtype exceptions when importing csv in Python

Suppose I have a natural-valued variable, e.g. "age", in my CSV dataset. The dataset is flawed, since some of the values are strings, e.g. "missing".

This code

personal_info = pd.read_csv("Age.csv", sep=',')

gives me the warning

DtypeWarning: Columns (6,10) have mixed types. Specify dtype option on import or set low_memory=False.

Adding dtype

personal_info = pd.read_csv("Age.csv", sep=',', error_bad_lines=False,
                               dtype={'age': int})

blows up when encountering the string "missing".

invalid literal for int() with base 10: 'missing'

How do I ignore the rows whose values are not in the variable's domain?

Upvotes: 1

Views: 258

Answers (1)

fmarm

Reputation: 4284

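You can use the na_values argument so that "missing" is read as NaN. Note that a plain int column cannot hold NaN, so use the nullable 'Int64' dtype (or float) instead of int:

personal_info = pd.read_csv("Age.csv", sep=',', error_bad_lines=False,
                            dtype={'age': 'Int64'}, na_values=['missing'])

On pandas 1.3 and later, error_bad_lines is deprecated in favour of on_bad_lines='skip'.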
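If the goal is to drop the offending rows rather than keep them as NaN, one possible approach is to read the column as text, coerce it with pd.to_numeric, and discard whatever fails to parse. The sketch below is only an illustration: the inline CSV text and the helper name load_valid_ages are made up for the example and stand in for the real Age.csv, and it assumes the only invalid entries are non-numeric strings.

```python
import io

import pandas as pd

# Hypothetical stand-in for Age.csv; the real file has more columns.
csv_text = "name,age\nAnn,34\nBob,missing\nCid,27\n"


def load_valid_ages(source):
    """Read a CSV and keep only the rows whose 'age' parses as a number."""
    # Read 'age' as text so that mixed content cannot break the import.
    df = pd.read_csv(source, dtype={"age": str})
    # Non-numeric strings such as "missing" become NaN.
    ages = pd.to_numeric(df["age"], errors="coerce")
    # Drop the rows that failed to parse, then cast the rest to int.
    return df.assign(age=ages).dropna(subset=["age"]).astype({"age": int})


personal_info = load_valid_ages(io.StringIO(csv_text))
print(personal_info)
```

With a real file, pass the path ("Age.csv") instead of the StringIO object. Negative or fractional values would still pass this filter, so an extra condition is needed if the column must be strictly natural-valued.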

Upvotes: 2
