Reputation: 129
I am new to Python and would appreciate some help. I have a .csv file with multiple rows. One column holds the country, another an id, and two more the latitude and longitude. I would like to build a new data frame with one row per unique combination of country, latitude and longitude, with all matching ids combined. To make it easier, here is an example input df:
Country        id       longitude  latitude
Angola         Pable    17.470     -12.245
Angola         Juan     17.470     -12.245
Albania        Dimitri  20.032     41.141
Albania        Dinko    20.032     41.141
United States  John     -112.599   45.705
United States  Paul     -112.599   45.705
United States  David    -112.599   45.705
I have tried:
df1 = df.groupby('Country').apply(lambda x: ','.join(x.id))
But it is not working.
The output I'm looking for is:
Country        id                 longitude  latitude
Angola         Pable, Juan        17.470     -12.245
Albania        Dimitri, Dinko     20.032     41.141
United States  John, Paul, David  -112.599   45.705
I would like this output as a pandas data frame, which I will use to plot a map with plotly in Python. Any ideas? Thank you in advance.
Upvotes: 3
Views: 44
Reputation: 195438
print(
    df.groupby("Country")
    .agg({"id": ", ".join, "longitude": "first", "latitude": "first"})
    .reset_index()
)
Prints:
         Country                 id  longitude  latitude
0        Albania     Dimitri, Dinko     20.032    41.141
1         Angola        Pable, Juan     17.470   -12.245
2  United States  John, Paul, David   -112.599    45.705
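If a country could appear with more than one pair of coordinates, taking "first" would silently keep only one of them. A possible variant, sketched here on the sample data from the question, is to group on the coordinates as well. The DataFrame construction and the variable name out are only for illustration, and sort=False is an optional choice that keeps the countries in the order they first appear:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "Country": ["Angola", "Angola", "Albania", "Albania",
                    "United States", "United States", "United States"],
        "id": ["Pable", "Juan", "Dimitri", "Dinko", "John", "Paul", "David"],
        "longitude": [17.470, 17.470, 20.032, 20.032,
                      -112.599, -112.599, -112.599],
        "latitude": [-12.245, -12.245, 41.141, 41.141,
                     45.705, 45.705, 45.705],
    }
)

# Group on the coordinates too, so rows of one country that have
# different coordinates stay separate; sort=False keeps first-seen order.
out = (
    df.groupby(["Country", "longitude", "latitude"], sort=False, as_index=False)
    .agg(id=("id", ", ".join))
)

# Restore the column order used in the question.
out = out[["Country", "id", "longitude", "latitude"]]
print(out)
```

On this sample every country has a single coordinate pair, so the result still has three rows. Note that grouping on float columns relies on the coordinates being exactly equal within a group.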
Upvotes: 2