Kyle

Reputation: 387

Create multiple columns in a dataframe from a function

Can someone help me create two new columns in this dataframe?

I want to parse the state out of each title (the variable "s" below) and make sure the state is removed from the original title string. The result should contain the original title, the cleaned title (without the trailing state), and the state name.

import pandas as pd

df=pd.Series(['Accommodation Payroll Employment in Texas',
          'Accounting, Tax Preparation, Bookkeeping, and Payroll Services    Payroll Employment in Texas']).to_frame()
df.columns=['title']

def state_code(row):
    t=None
    s=None
    if len(row['title'].split(' in '))==2: 
        s=str(row['title'].split(' in ')[1])
        t=str(row['title'].split(' in ')[0])
    elif len(row['title'].split(' in '))==3:
        s=str(row['title'].split(' in ')[2])
        t=str(row['title'].split(' in ')[0]+row['title'].split(' in ')[1])
    elif len(row['title'].split(' for '))==2: 
        s=str(row['title'].split(' for ')[1])
        t=str(row['title'].split(' for ')[0])

    return t,s
df[['title_clean','state']]=df.apply(state_code,axis=1)

Upvotes: 1

Views: 70

Answers (1)

Ami Tavory

Reputation: 76297

Instead of

return t, s

try

return pd.Series(dict(state=s, title_clean=t))

and instead of

df[['title_clean','state']]=df.apply(state_code,axis=1)

use

df = pd.concat([df, df.apply(state_code,axis=1)], axis=1)

Incidentally, your

t = None
s = None

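seems redundant for the sample data, since every title there matches one of the branches. It only matters for a title that matches none of them, where dropping it would raise an UnboundLocalError at the return.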
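Putting the two changes together, a minimal sketch of the whole thing might look like the block below. It assumes a reasonably recent pandas and Python 3.7+ (so the dict keys keep their order as column order). The names parts_in, parts_for and result are illustrative only, and the three-part case rejoins the first two pieces with " in ", which differs slightly from the question's code.

```python
import pandas as pd

df = pd.DataFrame({
    "title": [
        "Accommodation Payroll Employment in Texas",
        "Accounting, Tax Preparation, Bookkeeping, and Payroll Services "
        "Payroll Employment in Texas",
    ]
})

def state_code(row):
    # Defaults are used when a title matches none of the branches.
    t = None
    s = None
    parts_in = row["title"].split(" in ")
    parts_for = row["title"].split(" for ")
    if len(parts_in) == 2:
        t, s = parts_in
    elif len(parts_in) == 3:
        # Assumption: keep the inner " in " in the cleaned title.
        t = " in ".join(parts_in[:2])
        s = parts_in[2]
    elif len(parts_for) == 2:
        t, s = parts_for
    # Returning a Series makes apply build one column per key.
    return pd.Series({"title_clean": t, "state": s})

# apply(axis=1) now returns a DataFrame, which concat places next to df.
result = pd.concat([df, df.apply(state_code, axis=1)], axis=1)
print(result[["title_clean", "state"]])
```

Because the function returns a Series rather than a tuple, apply produces a two-column DataFrame, which is what lets concat line it up with the original rows by index.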

Upvotes: 2
