Tartaglia

Reputation: 1041

Pandas Apply with multiple columns as input

I have a dataframe with 4 columns of coordinates (longitude, latitude), and I would like to create a 5th column holding the distance between the two places in each row. The example below illustrates this:

import pandas as pd

# Named 'records' rather than 'dict' to avoid shadowing the built-in
records = [{'x1': '1','y1': '1','x2': '3','y2': '2'},
{'x1': '1','y1': '1','x2': '3','y2': '2'}]
data = pd.DataFrame(records)

As an outcome I would like to have this:

expected = [{'x1': '1','y1': '1','x2': '3','y2': '2','distance': '2.6'},
{'x1': '1','y1': '1','x2': '3','y2': '2','distance': '2.9'}]
data2 = pd.DataFrame(expected)

The distance should be computed with great_circle (from geopy.distance import great_circle).

This is what I tried:

data['distance']=data[['x1','y1','x2','y2']].apply(lambda x1,y1,x2,y2: great_circle(x1,y1,x2,y2).miles, axis=1)

But that gives me a type error:

TypeError: <lambda>() missing 3 required positional arguments: 'y1', 'x2', and 'y2'

Any help is appreciated.

Upvotes: 0

Views: 287

Answers (1)

Jimmy

Reputation: 246

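That is because, with axis=1, apply passes each row of data[['x1','y1','x2','y2']] to the lambda as a single Series, so the lambda must take one argument and look the columns up by name. Note also that great_circle expects two points, each given as a (latitude, longitude) pair, rather than four separate numbers. Assuming x is the longitude and y the latitude, as the question describes, you should modify it as follows. Hope this helps!

data['distance']=data[['x1','y1','x2','y2']].apply(lambda row: great_circle((row['y1'],row['x1']),(row['y2'],row['x2'])).miles, axis=1)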
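For anyone who wants to try the row-wise pattern without installing geopy, here is a minimal sketch. The haversine_miles helper and the EARTH_RADIUS_MILES constant are my own stand-ins for great_circle, not part of geopy, so the numbers may differ slightly from what geopy returns. It also assumes x is the longitude and y the latitude, and converts the string columns to floats first.

```python
import math

import pandas as pd

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees.

    Hypothetical stand-in for geopy's great_circle, using the haversine formula.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


records = [
    {'x1': '1', 'y1': '1', 'x2': '3', 'y2': '2'},
    {'x1': '1', 'y1': '1', 'x2': '3', 'y2': '2'},
]
# The sample values are strings, so convert them to floats before doing math.
data = pd.DataFrame(records).astype(float)

# Option 1: with axis=1 the function receives one row (a Series) per call,
# so it takes a single argument and indexes it by column name.
data['distance'] = data.apply(
    lambda row: haversine_miles(row['y1'], row['x1'], row['y2'], row['x2']),
    axis=1,
)

# Option 2: zip the columns and unpack each tuple into separate names,
# which is closer to the four-argument lambda from the question.
data['distance_zip'] = [
    haversine_miles(y1, x1, y2, x2)
    for x1, y1, x2, y2 in zip(data['x1'], data['y1'], data['x2'], data['y2'])
]
```

Both options should give the same column. The zip version avoids building a Series for every row, which tends to matter on larger frames.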

Upvotes: 1
