Reputation: 1041
For a dataframe with four coordinate columns (longitude, latitude), I would like to create a fifth column holding the distance between the two places in each row. The example below illustrates this:
import pandas as pd

dict = [{'x1': '1','y1': '1','x2': '3','y2': '2'},
{'x1': '1','y1': '1','x2': '3','y2': '2'}]
data = pd.DataFrame(dict)
As an outcome I would like to have this:
dict1 = [{'x1': '1','y1': '1','x2': '3','y2': '2','distance': '2.6'},
{'x1': '1','y1': '1','x2': '3','y2': '2','distance': '2.9'}]
data2 = pd.DataFrame(dict1)
where distance is computed with great_circle from geopy (from geopy.distance import great_circle).
This is what I tried:
data['distance']=data[['x1','y1','x2','y2']].apply(lambda x1,y1,x2,y2: great_circle(x1,y1,x2,y2).miles, axis=1)
But that gives me a type error:
TypeError: <lambda>() missing 3 required positional arguments: 'y1', 'x2', and 'y2'
Any help is appreciated.
Upvotes: 0
Views: 287
Reputation: 246
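That is because, with axis=1, apply passes each row of data[['x1','y1','x2','y2']] to the lambda as a single Series, so the function receives one argument rather than four. Take the row as the only parameter and index into it by column name, as follows. Hope this helps!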
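# great_circle takes two points as (latitude, longitude) pairs; this assumes x is longitude and y is latitude
data['distance'] = data.apply(lambda row: great_circle((row['y1'], row['x1']), (row['y2'], row['x2'])).miles, axis=1)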
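To see the row-wise apply in isolation, here is a minimal sketch that does not depend on geopy. haversine_miles is a hypothetical helper written for this example as a stand-in for great_circle(...).miles, the Earth radius is an approximation, and treating x as longitude and y as latitude is an assumption based on the question.

```python
import math

import pandas as pd

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


data = pd.DataFrame(
    [
        {"x1": "1", "y1": "1", "x2": "3", "y2": "2"},
        {"x1": "1", "y1": "1", "x2": "3", "y2": "2"},
    ]
)

# The sample values are strings, so convert them to floats first.
coords = data[["x1", "y1", "x2", "y2"]].astype(float)

# With axis=1 the lambda is called once per row and receives that row
# as a single Series, so it takes one parameter and indexes by column name.
data["distance"] = coords.apply(
    lambda row: haversine_miles(row["y1"], row["x1"], row["y2"], row["x2"]),
    axis=1,
)
```

Row-wise apply is easy to read but calls the function once per row in Python, so for large frames a vectorized formula over whole columns would likely be faster.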
Upvotes: 1