PineNuts0

Reputation: 5234

Python Dataframe: Create function that makes all values in one column uppercase

I have two python data frames. I want to make all the values in one column of both data frames uppercase.

The following code works:

df_ERA4['reqmnt'] = df_ERA4['reqmnt'].str.upper()
df_ERA5['reqmnt'] = df_ERA5['reqmnt'].str.upper()

But when I want to do the same thing in a function it doesn't work:

def uppercase(df):
    df['reqmnt'] = df['reqmnt'].str.upper()

df_ERA4 = uppercase(df_ERA4)
df_ERA5 = uppercase(df_ERA5)

df_ERA4.head()

Specifically, when I run the above code it gives me the following error: AttributeError: 'NoneType' object has no attribute 'head'

Upvotes: 1

Views: 223

Answers (1)

cs95

Reputation: 402493

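Your function has no return statement, so it implicitly returns None. Assigning that result back (df_ERA4 = uppercase(df_ERA4)) replaces your DataFrame with None, which is why the later .head() call fails. Since the function already modifies the DataFrame in place, the correct thing to do is to call it without assigning the return value to anything.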
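To see this concretely, here is a minimal sketch that reproduces the behavior on a small made-up frame (df_demo and its values are hypothetical, not from the question):

```python
import pandas as pd

def uppercase(df):
    # Mutates the caller's DataFrame; there is no return statement,
    # so the call itself evaluates to None.
    df['reqmnt'] = df['reqmnt'].str.upper()

# Hypothetical sample data standing in for df_ERA4 / df_ERA5.
df_demo = pd.DataFrame({'reqmnt': ['abc', 'def']})

result = uppercase(df_demo)

print(result is None)              # True
print(df_demo['reqmnt'].tolist())  # ['ABC', 'DEF']
```

The column is already uppercased after the call, so rebinding the name to the function's result only throws the DataFrame away.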

There are a couple of options. The first: don't return anything, and don't assign anything.

def upper(df, col):
    df[col] = df[col].str.upper()

upper(df, 'reqmnt')

However, this may not be the best approach (personally, I don't much fancy functions that perform in-place operations). You could, alternatively, have a copy returned via an assign call.

def upper(df, col):
    return df.assign(**{col: df[col].str.upper()})

df = upper(df, 'reqmnt')

Note that there's a caveat here too: assign returns a copy, and where efficiency/performance is paramount, you don't want to be needlessly making copies of GBs of data. Which one to use should be decided by a combination of style and need.
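As a rough illustration of the copy semantics, the sketch below runs the assign-based version on a small made-up frame (the names original and converted and the sample values are assumptions for the example) and checks that the input is left untouched:

```python
import pandas as pd

def upper(df, col):
    # assign builds and returns a new DataFrame; df itself is not modified.
    return df.assign(**{col: df[col].str.upper()})

# Hypothetical sample data.
original = pd.DataFrame({'reqmnt': ['abc', 'def'], 'n': [1, 2]})
converted = upper(original, 'reqmnt')

print(original['reqmnt'].tolist())   # ['abc', 'def']
print(converted['reqmnt'].tolist())  # ['ABC', 'DEF']
print(converted is original)         # False
```

With this version, the assignment pattern from the question (df_ERA4 = upper(df_ERA4, 'reqmnt')) works as intended, because the function now returns a DataFrame rather than None.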

Upvotes: 3
