Luca

Reputation: 10996

pandas astype key error on reverting column data type after merge

I have a dictionary of column names and data types as follows:

{'ds': dtype('<M8[ns]'), 'sensor_id': dtype('int64'), 'm1': dtype('int64'), 'm2': dtype('float64')}

Now, I have a merged pandas dataframe that also contains these columns. It is the result of a right merge, and the data types of some columns changed because of the NaN values the merge introduced.

So, I try and revert back to the original data type as follows:

merged = merged.apply(lambda x: x.astype(original[x.name]) if x.name in data_types else x)

Here, original is, well, the original data frame.

However, when I do this, I get the error:

KeyError: ('Only the Series name can be used for the key in Series dtype mappings.', 'occurred at index ds')

Upvotes: 0

Views: 992

Answers (1)

Steven

Reputation: 2133

Series.astype takes a dtype as input, but what you are passing, original[x.name], is a Series.

> import pandas as pd
> df = pd.DataFrame(data={'A':range(5)})
> df['A'].astype(df['A']) ← What you are doing
KeyError: 'Only the Series name can be used for the key in Series dtype mappings.'

df['A'] is not a dtype. What you really want is the dtype of that Series: df['A'].dtype. In your lambda, x plays the role of df['A'] here: it is a Series, and so is original[x.name]. What you want is to cast x to the dtype of the matching column in original.

> df['A'].dtype
dtype('int64')

> df['A'].astype('int32')
0    0
1    1
2    2
3    3
4    4
Name: A, dtype: int32 ← new 'int32' dtype

> df['A'].astype(df['A'].dtype) ← Use the dtype of the A column
0    0
1    1
2    2
3    3
4    4
Name: A, dtype: int64

Therefore, what you need is:

merged = merged.apply(lambda x: x.astype(original[x.name].dtype) if x.name in data_types else x)
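Since the dictionary from the question already maps column names to dtypes, a possible alternative is to hand a filtered copy of it to DataFrame.astype directly. The following is only a sketch: the frames, the label column, and the decision to drop rows with missing values are assumptions made for illustration, not something taken from the question.

```python
import pandas as pd

# Hypothetical data standing in for the question's frames.
original = pd.DataFrame({
    "sensor_id": [1, 2, 3],
    "m1": [10, 20, 30],
    "m2": [0.5, 1.5, 2.5],
})
data_types = original.dtypes.to_dict()  # column name -> dtype

other = pd.DataFrame({"sensor_id": [2, 3, 4], "label": ["b", "c", "d"]})

# Right merge: sensor 4 has no match, so m1 becomes float64 with a NaN.
merged = original.merge(other, on="sensor_id", how="right")

# Assumption: rows with missing values can be dropped. An int64 column
# cannot hold NaN, so astype would raise if they were left in.
clean = merged.dropna(subset=["m1", "m2"])

# Keep only the dtypes whose columns exist in the merged frame;
# DataFrame.astype raises a KeyError for unknown column names.
wanted = {col: dt for col, dt in data_types.items() if col in clean.columns}
restored = clean.astype(wanted)
```

Columns that are not in the dictionary (label here) are left untouched.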

Regarding your comment: the problem isn't that the column (Series) comes from another DataFrame. astype doesn't accept a column at all, only a dtype.
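One caveat about casting back to int64: if the merged column still contains NaN, the cast will fail, because a NumPy integer column cannot represent missing values. If the rows have to stay, pandas' nullable "Int64" extension dtype (capital I) may be an option. A minimal sketch, with made-up values standing in for the m1 column after the merge:

```python
import pandas as pd

# Hypothetical stand-in for m1 after the merge: float64 with one NaN.
m1 = pd.Series([20.0, 30.0, float("nan")], name="m1")

# "Int64" is the nullable integer dtype; the NaN becomes <NA>.
m1_nullable = m1.astype("Int64")
```

Note that this is a different dtype from the original int64, so it only helps if downstream code accepts the nullable type.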

Upvotes: 1
