Simd

Reputation: 21254

How to infer types in pandas dataframe

I have a dataframe which I read in using pyspark with:

df1 = spark.read.csv("/user/me/data/*").toPandas()

Unfortunately, pyspark leaves every column with dtype object, even the numerical ones. I need to merge this with another dataframe that I read in with df2 = pd.read_csv("file.csv"), so I need the types in df1 to be inferred exactly as pandas would have inferred them.

How can you infer types of an existing pandas dataframe?

Upvotes: 3

Views: 3679

Answers (1)

piRSquared

Reputation: 294218

If df1 and df2 have the same column names, you can pass the dtypes of df2 straight to pd.DataFrame.astype:

df1 = df1.astype(df2.dtypes)
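As a minimal sketch of this, the frames below are small made-up stand-ins: df1 mimics the Spark result with every value stored as a string in an object column, and df2 is parsed from an in-memory CSV so that pandas infers its dtypes. The column names and values are assumptions for illustration only.

```python
import io

import pandas as pd

# Stand-in for the Spark-derived frame: all values are strings (object dtype).
df1 = pd.DataFrame(
    {"id": ["1", "2"], "price": ["1.5", "2.0"]},
    dtype=object,
)

# Stand-in for the frame read by pandas, which infers dtypes while parsing.
df2 = pd.read_csv(io.StringIO("id,price\n3,0.5\n"))

# df2.dtypes is a Series indexed by column name; astype treats it as a
# column -> dtype mapping, so every matching column in df1 is cast.
df1 = df1.astype(df2.dtypes)

print(df1.dtypes)
```

Note that astype raises an error if a value cannot be converted to the target dtype, for example a non-numeric string in a column that df2 holds as an integer.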

Otherwise, you need to construct a dictionary whose keys are the column names in df1 and whose values are the target dtypes. You can start with d = df2.dtypes.to_dict() to see what it should look like, then build a new dictionary with the keys altered where the names differ.

Once you've constructed the dictionary d, use:

df1 = df1.astype(d)
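One way the dictionary could be built when the column names differ is sketched below. The frames, the column names, and the rename mapping are all hypothetical; the mapping is something you would have to write by hand for your own data.

```python
import io

import pandas as pd

# Stand-in for the Spark-derived frame, with its own column names.
df1 = pd.DataFrame(
    {"user_id": ["1", "2"], "amount": ["1.5", "2.0"]},
    dtype=object,
)

# Stand-in for the frame read by pandas.
df2 = pd.read_csv(io.StringIO("id,price\n3,0.5\n"))

# Hypothetical mapping from df2 column names to the matching df1 names.
rename = {"id": "user_id", "price": "amount"}

# Start from df2's dtypes and re-key them with the df1 column names.
d = {rename.get(col, col): dtype for col, dtype in df2.dtypes.to_dict().items()}

# Keep only keys that exist in df1, since astype rejects unknown column names.
d = {col: dtype for col, dtype in d.items() if col in df1.columns}

df1 = df1.astype(d)
print(df1.dtypes)
```

Columns of df1 that do not appear in d are left with their existing dtype.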

Upvotes: 4
