Create multiple columns in a DataFrame from a function

Question

Can someone help me create two new columns in this DataFrame?

The goal is to parse out the state ("s") and remove it from the original title string. The result should contain the original title, the cleaned title (without the trailing state), and the state name.

import pandas as pd

df=pd.Series(['Accommodation Payroll Employment in Texas',
          'Accounting, Tax Preparation, Bookkeeping, and Payroll Services    Payroll Employment in Texas']).to_frame()
df.columns=['title']

def state_code(row):
    t=None
    s=None
    if len(row['title'].split(' in '))==2: 
        s=str(row['title'].split(' in ')[1])
        t=str(row['title'].split(' in ')[0])
    elif len(row['title'].split(' in '))==3:
        s=str(row['title'].split(' in ')[2])
        t=str(row['title'].split(' in ')[0]+row['title'].split(' in ')[1])
    elif len(row['title'].split(' for '))==2: 
        s=str(row['title'].split(' for ')[1])
        t=str(row['title'].split(' for ')[0])

    return t,s
df[['title_clean','state']]=df.apply(state_code,axis=1)

Ami Tavory · Accepted Answer

Instead of

return t, s

try

return pd.Series(dict(state=s, title_clean=t))

and instead of

df[['title_clean','state']]=df.apply(state_code,axis=1)

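use (assigning the result, since pd.concat returns a new DataFrame rather than modifying df in place)

df = pd.concat([df, df.apply(state_code,axis=1)], axis=1)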
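Putting the two changes together, a minimal end-to-end sketch might look like the following. It reuses the question's sample titles (with the run of extra spaces in the second one collapsed) and the same branching logic. The names `in_parts`, `for_parts`, and `result` are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "title": [
        "Accommodation Payroll Employment in Texas",
        "Accounting, Tax Preparation, Bookkeeping, and Payroll Services "
        "Payroll Employment in Texas",
    ]
})

def state_code(row):
    # Same branching as the question's function, with the splits done once.
    t = None
    s = None
    in_parts = row["title"].split(" in ")
    for_parts = row["title"].split(" for ")
    if len(in_parts) == 2:
        t, s = in_parts
    elif len(in_parts) == 3:
        # As in the question: the first two pieces are joined without ' in '.
        t = in_parts[0] + in_parts[1]
        s = in_parts[2]
    elif len(for_parts) == 2:
        t, s = for_parts
    # Returning a Series makes apply(axis=1) produce one column per key.
    return pd.Series(dict(state=s, title_clean=t))

# pd.concat returns a new DataFrame, so keep the result.
result = pd.concat([df, df.apply(state_code, axis=1)], axis=1)
print(result)
```

Because the function returns a labelled Series, the new columns get their names from the dict keys, so there is no need to list them on the left-hand side of an assignment.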

Incidentally, your

t = None
s = None

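seems redundant for these sample titles, since every row matches the first branch. It is only needed if some titles might match none of the branches.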
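One caveat worth checking before deleting those two lines: they only matter when a title matches none of the branches. A small sketch of that case follows, using a hypothetical reduced function `split_without_defaults` and a made-up title that contains neither ' in ' nor ' for '.

```python
def split_without_defaults(title):
    # Hypothetical reduced version of state_code with no t = None / s = None.
    if len(title.split(" in ")) == 2:
        t, s = title.split(" in ")
    return t, s

# A title that matches the branch works as before.
print(split_without_defaults("Accommodation Payroll Employment in Texas"))

# A made-up title that matches no branch leaves t and s unassigned.
try:
    split_without_defaults("Total Nonfarm Employment")
except UnboundLocalError as exc:
    print("no branch matched:", exc)
```

So the defaults can go if every title is known to match one of the branches; otherwise keeping them gives missing values for unmatched rows instead of an exception.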
