Reputation: 887
I have this df:
ID Date Time Lat Lon Time_1 Lat_1 Lon_1
A 07/16/2019 08:00 29.39291 -98.50925 09:00 29.39923 -98.51256
A 07/16/2019 09:00 29.39923 -98.51256 10:00 29.40147 -98.51123
A 07/16/2019 10:00 29.40147 -98.51123 10:00 29.40147 -98.51123
A 07/18/2019 08:30 29.38752 -98.52372 09:30 29.39291 -98.50925
A 07/18/2019 09:30 29.39291 -98.50925 09:30 29.39291 -98.50925
B 07/16/2019 08:00 29.39537 -98.50402 08:00 29.39537 -98.50402
B 07/18/2019 11:00 29.39343 -98.49707 12:00 29.39291 -98.50925
B 07/18/2019 12:00 29.39291 -98.50925 12:00 29.39291 -98.50925
B 07/19/2019 10:00 29.39556 -98.53148 10:00 29.39556 -98.53148
I want to create a "Distance" column by grouping the df by ID and Date and applying a user-defined function to each group.
The code I wrote:
def grp_crossarc(f):
    for i in range(len(f)):
        f.loc[i,'Distance'] = crossarc(f.iloc[i]['Lat'],f.iloc[i]['Lon'],
                                       f.iloc[i]['Lat_1'],f.iloc[i]['Lat_1'],
                                       29.39537,-98.50402)
    return f

df.groupby(['ID','Date'],as_index=False).apply(grp_crossarc)
crossarc is another user-defined function that takes 6 arguments (3 lat-lon points).
The result I got:
ID Date Time Lat Lon Time_1 Lat_1 Lon_1 Distance
A 07/16/2019 08:00 29.39291 -98.50925 09:00 29.39923 -98.51256 0.166057
A 07/16/2019 09:00 29.39923 -98.51256 10:00 29.40147 -98.51123 0.889147
A 07/16/2019 10:00 29.40147 -98.51123 10:00 29.40147 -98.51123 0.973550
A 07/18/2019 08:30 29.38752 -98.52372 09:30 29.39291 -98.50925 NaN
A 07/18/2019 09:30 29.39291 -98.50925 09:30 29.39291 -98.50925 NaN
NaN NaN NaN NaN NaN NaN NaN NaN 0.736501
NaN NaN NaN NaN NaN NaN NaN NaN 0.165974
B 07/16/2019 08:00 29.39537 -98.50402 08:00 29.39537 -98.50402 NaN
NaN NaN NaN NaN NaN NaN NaN NaN 0.000000
B 07/18/2019 11:00 29.39343 -98.49707 12:00 29.39291 -98.50925 NaN
B 07/18/2019 12:00 29.39291 -98.50925 12:00 29.39291 -98.50925 NaN
NaN NaN NaN NaN NaN NaN NaN NaN 0.707027
NaN NaN NaN NaN NaN NaN NaN NaN 0.165974
B 07/19/2019 10:00 29.39556 -98.53148 10:00 29.39556 -98.53148 NaN
NaN NaN NaN NaN NaN NaN NaN NaN 1.900238
For a few (ID, Date) pairs, the Distance values were shifted onto extra rows after the group, which left NaN in the original rows. How can I fix it?
Upvotes: 2
Views: 106
Reputation: 862581
You can use a lambda function with DataFrame.apply instead of the loop:
def grp_crossarc(f):
    # Lon_1 is presumably intended as the 4th argument; the question's code passes Lat_1 twice
    f['Distance'] = (f.apply(lambda x: crossarc(x['Lat'],x['Lon'],
                                                x['Lat_1'],x['Lon_1'],
                                                29.39537,-98.50402), axis=1))
    return f

df = df.groupby(['ID','Date'],as_index=False).apply(grp_crossarc)
But the function does not seem to depend on the groups, so it can be simplified by omitting groupby.apply:
# Lon_1 is presumably intended as the 4th argument; the question's code passes Lat_1 twice
df['Distance'] = (df.apply(lambda x: crossarc(x['Lat'],x['Lon'],
                                              x['Lat_1'],x['Lon_1'],
                                              29.39537,-98.50402), axis=1))
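As for why the loop in the question produced the NaN rows: the likely cause is that f.loc[i, ...] looks rows up by label, while each group keeps the row labels it had in the original df. For any group whose labels do not start at 0, .loc finds no row labelled i and appends a new one. The sketch below reproduces this on a small frame. fake_crossarc is a hypothetical stand-in (the real crossarc is not shown in the question), and the Distance column is created up front so that only the row labels are in play.

```python
import pandas as pd

# Hypothetical stand-in for the question's crossarc(); it only needs to
# return one number per row so the indexing behaviour can be observed.
def fake_crossarc(lat, lon, lat_1, lon_1, ref_lat, ref_lon):
    return abs(lat - ref_lat) + abs(lon - ref_lon)

df = pd.DataFrame({
    "ID": ["A", "A", "B", "B"],
    "Date": ["07/16/2019", "07/16/2019", "07/18/2019", "07/18/2019"],
    "Lat": [29.39291, 29.39923, 29.39343, 29.39291],
    "Lon": [-98.50925, -98.51256, -98.49707, -98.50925],
    "Lat_1": [29.39923, 29.39923, 29.39291, 29.39291],
    "Lon_1": [-98.51256, -98.51256, -98.50925, -98.50925],
})

# The second group keeps its original row labels 2 and 3.
group_b = df[df["ID"] == "B"].copy()

# Label-based write, as in the question's loop: labels 0 and 1 do not
# exist in this group, so .loc appends new rows instead of filling 2 and 3.
looped = group_b.copy()
looped["Distance"] = float("nan")
for i in range(len(group_b)):
    looped.loc[i, "Distance"] = fake_crossarc(
        group_b.iloc[i]["Lat"], group_b.iloc[i]["Lon"],
        group_b.iloc[i]["Lat_1"], group_b.iloc[i]["Lon_1"],
        29.39537, -98.50402,
    )

# Row-wise apply returns a Series aligned on the existing labels,
# so every value lands on the row it was computed from.
fixed = group_b.copy()
fixed["Distance"] = fixed.apply(
    lambda row: fake_crossarc(row["Lat"], row["Lon"],
                              row["Lat_1"], row["Lon_1"],
                              29.39537, -98.50402),
    axis=1,
)

print(looped)  # four rows: labels 2 and 3 with NaN Distance, then 0 and 1
print(fixed)   # two rows: labels 2 and 3, Distance filled
```

If the loop has to stay, writing to f.loc[f.index[i], 'Distance'] (the label at position i) should avoid the extra rows as well.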
Upvotes: 1