crackernutter
crackernutter

Reputation: 233

How to use apply in a Pandas dataframe slice to set values of multiple columns

I am trying to use apply function to assign new values to two existing columns in a dataframe slice using a .loc query.

To reproduce - first create a dataframe:

import re
import panads as pd
data = [[1000, "MSL", "Test string"], [2000, 'AGL', 'other string'], [0, 'AGL', "xxxx SFC-10000ft MSLXXX"]]
df = pd.DataFrame(data=data, columns=['Alt', "AltType",'FreeText'])

Then create the apply function

def testapply(row):
    try:
        match = re.findall("SFC-([0-9]+)FT (MSL|AGL|FL)", row.FreeText)[0]
        return (int(match[0]), match[1])
    except:
        return (0, row.AltType)

When I run

df.loc[df['Alt']==0, ['Alt', 'AltType']] = df.loc[df['Alt']==0].apply(testapply, axis=1)

I would like to get as a result is:

     Alt AltType                 FreeText
0   1000     MSL              Test string
1   2000     AGL             other string
2  10000     MSL  xxxx SFC-10000ft MSLXXX

but what I end up getting is:

            Alt       AltType                 FreeText
0          1000           MSL              Test string
1          2000           AGL             other string
2  (10000, MSL)  (10000, MSL)  xxxx SFC-10000FT MSLXXX

Does anyone know how to make this work in one fell swoop?

Upvotes: 2

Views: 2383

Answers (4)

Mehdi Golzadeh
Mehdi Golzadeh

Reputation: 2583

Only change your testapply function to this:

def testapply(row):
    try:
        match = re.findall("SFC-([0-9]+)ft (MSL|AGL|FL)", row.FreeText)[0]
        print(match)
        return (int(match[0]), match[1]).tolist()
    except:
        return (0, row.AltType)

Upvotes: 0

Shubham Sharma
Shubham Sharma

Reputation: 71689

Let's try Series.str.extract and use boolean indexing with loc to replace the values in columns Alt and AltType where the column Alt contains 0:

m = df['Alt'].eq(0)
df.loc[m, ['Alt', 'AltType']] = df.loc[m, 'FreeText'].str.extract(r'(?i)SFC-(\d+)FT\s(MSL|AGL|FL)').values

     Alt AltType                 FreeText
0   1000     MSL              Test string
1   2000     AGL             other string
2  10000     MSL  xxxx SFC-10000ft MSLXXX

Upvotes: 1

wwnde
wwnde

Reputation: 26676

Use loc accessor to select relevant Alt column. Compute value by extracting digit from matching FreeText using regex

df.loc[df['Alt']==0,'Alt']=df.loc[df['Alt']==0,'FreeText'].str.extract('(\d+)')[0]



     Alt AltType                 FreeText
0   1000     MSL              Test string
1   2000     AGL             other string
2  10000     AGL  xxxx SFC-10000ft MSLXXX

Upvotes: 1

It_is_Chris
It_is_Chris

Reputation: 14083

Just add tolist()

df.loc[df['Alt']==0, ['Alt', 'AltType']] = df.loc[df['Alt']==0].apply(testapply, axis=1).tolist()

Upvotes: 2

Related Questions