Python: defining a function that takes a DataFrame as input

I am trying to create a function that alters the datatypes of the columns.

def ChangeDatatypes(df):
    df['Name'] = df['Name'].astype('category')
    df['Character'] = df['Character'].astype('category')
    cat_columns = df.select_dtypes(['category']).columns
    df[cat_columns] = df[cat_columns].apply(lambda x: x.cat.codes)
    return df

Actual_Dataframe = ChangeDatatypes(Actual_DataFrame)  # calling that function

I want to change Actual_Dataframe, so I call the function I created. It does not throw any error, but it also does not change the data types of the columns.

Where am I going wrong in the code?

Upvotes: 1

Views: 6291

Answers (1)

jpp

Reputation: 164783

You need not use pd.DataFrame.apply here.

Instead, you can access the cat.codes property of a categorical column directly. In addition, you can use pd.DataFrame.pipe to run ("pipe") your dataframe through a function.
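As a quick illustration of what cat.codes gives you on a single column, here is a minimal sketch. The Series values are made up for the illustration and are not taken from the question's data. Each value is replaced by the position of its category, and missing values get the code -1.

```python
import pandas as pd

# Hypothetical sample data, including one missing value.
s = pd.Series(['b', 'a', 'b', None])

# Convert to a categorical column; categories are inferred from the values.
cat = s.astype('category')

# Integer code of each row's category (-1 marks a missing value).
codes = cat.cat.codes

print(list(cat.cat.categories))  # ['a', 'b']
print(codes.tolist())            # [1, 0, 1, -1]
```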

Below is a verified example.

import pandas as pd

def ChangeDatatypes(df):
    df['Name'] = df['Name'].astype('category').cat.codes
    df['Character'] = df['Character'].astype('category').cat.codes
    return df

df = pd.DataFrame({'Name': ['a', 'b', 'c', 'd'],
                   'Character': ['e', 'f', 'g', 'h']})

df = df.pipe(ChangeDatatypes)

print(df)

#    Character  Name
# 0          0     0
# 1          1     1
# 2          2     2
# 3          3     3

print(df.dtypes)

# Character    int8
# Name         int8
# dtype: object
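Note that the function above assigns into the dataframe it receives, so the caller's object is modified as well. If that is not wanted, one possible variant is to work on a copy and pass the column names as an argument. The sketch below is only an illustration: the name encode_categories, its columns parameter, and the sample data are assumptions, not part of the original code.

```python
import pandas as pd

def encode_categories(df, columns):
    """Return a copy of df with the given columns replaced by category codes."""
    out = df.copy()  # leave the caller's dataframe untouched
    for col in columns:
        out[col] = out[col].astype('category').cat.codes
    return out

# Hypothetical sample data.
original = pd.DataFrame({'Name': ['a', 'b', 'a'],
                         'Character': ['x', 'y', 'y']})

# pipe forwards extra keyword arguments to the function.
encoded = original.pipe(encode_categories, columns=['Name', 'Character'])

print(encoded.dtypes)   # both columns are now int8
print(original.dtypes)  # the original columns keep their string dtype
```

Because the function returns a new dataframe, the result has to be assigned to a name, as done with encoded here.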

Upvotes: 2
