Reputation: 3041
I am trying to create a function that alters the datatypes of the columns.
def ChangeDatatypes(df):
    df['Name'] = df['Name'].astype('category')
    df['Character'] = df['Character'].astype('category')
    cat_columns = df.select_dtypes(['category']).columns
    df[cat_columns] = df[cat_columns].apply(lambda x: x.cat.codes)
    return(df)

# calling that function
Actual_Dataframe = ChangeDatatypes(Actual_DataFrame)
I want to change Actual_Dataframe, so I call the function I created. It does not throw any error, but it also does not change the data types of the columns.
Where am I going wrong in the code?
Upvotes: 1
Views: 6291
Reputation: 164783
You need not use pd.DataFrame.apply here. Instead, you can access the cat.codes property of a categorical column directly. In addition, you can use pd.DataFrame.pipe to run ("pipe") your dataframe through a function.
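To make the first point concrete, here is a minimal sketch on a single Series. The variable names and sample values are illustrative and not taken from the question. It shows that .cat.codes gives one integer per row, and that .cat.categories holds the labels those integers point back to.

```python
import pandas as pd

# Illustrative data, not taken from the question.
s = pd.Series(['b', 'a', 'c', 'a'])

# Converting to 'category' infers the categories from the unique values;
# for sortable values like these strings they come out sorted: a, b, c.
cat = s.astype('category')

# Each code is the position of the row's value within cat.cat.categories.
codes = cat.cat.codes

print(list(cat.cat.categories))
print(codes.tolist())
print(codes.dtype)
```

Because the codes are positions in the category list, the original labels can be recovered from cat.cat.categories if they are needed later.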
Below is a verified example.
import pandas as pd

def ChangeDatatypes(df):
    # Convert each column to categorical, then keep only the integer codes.
    df['Name'] = df['Name'].astype('category').cat.codes
    df['Character'] = df['Character'].astype('category').cat.codes
    return df

df = pd.DataFrame({'Name': ['a', 'b', 'c', 'd'],
                   'Character': ['e', 'f', 'g', 'h']})

df = df.pipe(ChangeDatatypes)

print(df)

#    Character  Name
# 0          0     0
# 1          1     1
# 2          2     2
# 3          3     3

print(df.dtypes)

# Character    int8
# Name         int8
# dtype: object
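Note that the function above assigns into the dataframe it receives, so the caller's original object is modified as well. If that is not wanted, one option is to work on a copy. The sketch below is only an illustration of that idea: encode_columns and its columns parameter are hypothetical names, not part of pandas or of the question.

```python
import pandas as pd

def encode_columns(df, columns):
    """Return a copy of df with the given columns replaced by category codes."""
    out = df.copy()  # work on a copy so the caller's frame is left untouched
    for col in columns:
        out[col] = out[col].astype('category').cat.codes
    return out

original = pd.DataFrame({'Name': ['a', 'b', 'c', 'd'],
                         'Character': ['e', 'f', 'g', 'h']})

# pipe forwards extra keyword arguments to the function.
encoded = original.pipe(encode_columns, columns=['Name', 'Character'])

print(encoded.dtypes)   # the encoded copy holds integer codes
print(original.dtypes)  # the original frame still holds the string columns
```

Passing the column names as an argument also avoids hard-coding 'Name' and 'Character' inside the function.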
Upvotes: 2