Reputation: 233
I am trying to use apply function to assign new values to two existing columns in a dataframe slice using a .loc query.
To reproduce - first create a dataframe:
import re
import panads as pd
data = [[1000, "MSL", "Test string"], [2000, 'AGL', 'other string'], [0, 'AGL', "xxxx SFC-10000ft MSLXXX"]]
df = pd.DataFrame(data=data, columns=['Alt', "AltType",'FreeText'])
Then create the apply function
def testapply(row):
try:
match = re.findall("SFC-([0-9]+)FT (MSL|AGL|FL)", row.FreeText)[0]
return (int(match[0]), match[1])
except:
return (0, row.AltType)
When I run
df.loc[df['Alt']==0, ['Alt', 'AltType']] = df.loc[df['Alt']==0].apply(testapply, axis=1)
I would like to get as a result is:
Alt AltType FreeText
0 1000 MSL Test string
1 2000 AGL other string
2 10000 MSL xxxx SFC-10000ft MSLXXX
but what I end up getting is:
Alt AltType FreeText
0 1000 MSL Test string
1 2000 AGL other string
2 (10000, MSL) (10000, MSL) xxxx SFC-10000FT MSLXXX
Does anyone know how to make this work in one fell swoop?
Upvotes: 2
Views: 2383
Reputation: 2583
Only change your testapply function to this:
def testapply(row):
try:
match = re.findall("SFC-([0-9]+)ft (MSL|AGL|FL)", row.FreeText)[0]
print(match)
return (int(match[0]), match[1]).tolist()
except:
return (0, row.AltType)
Upvotes: 0
Reputation: 71689
Let's try Series.str.extract
and use boolean indexing with loc
to replace the values in columns Alt
and AltType
where the column Alt
contains 0
:
m = df['Alt'].eq(0)
df.loc[m, ['Alt', 'AltType']] = df.loc[m, 'FreeText'].str.extract(r'(?i)SFC-(\d+)FT\s(MSL|AGL|FL)').values
Alt AltType FreeText
0 1000 MSL Test string
1 2000 AGL other string
2 10000 MSL xxxx SFC-10000ft MSLXXX
Upvotes: 1
Reputation: 26676
Use loc accessor to select relevant Alt
column. Compute value by extracting digit
from matching FreeText
using regex
df.loc[df['Alt']==0,'Alt']=df.loc[df['Alt']==0,'FreeText'].str.extract('(\d+)')[0]
Alt AltType FreeText
0 1000 MSL Test string
1 2000 AGL other string
2 10000 AGL xxxx SFC-10000ft MSLXXX
Upvotes: 1
Reputation: 14083
Just add tolist()
df.loc[df['Alt']==0, ['Alt', 'AltType']] = df.loc[df['Alt']==0].apply(testapply, axis=1).tolist()
Upvotes: 2