Group by fields in pandas dataframe

Question

I have a dataframe with the following fields. For each Id, I have two records that represent different latitudes and longitudes. I'm trying to produce a dataframe that groups the current one by Id and puts each record's latitude and longitude into separate fields.

I tried the groupby function, but I do not get the intended result. Any help would be greatly appreciated.

Id  StartTime   StopTime    Latitude    Longitude
101 14:42:28    14:47:56    53.51       118.12
101 22:10:01    22:12:49    33.32       333.11

Result:

Id  StartLat    StartLong   DestLat     DestLong
101 53.51       118.12      33.32       333.11

jezrael · Accepted Answer

You can use groupby with apply to flatten each group's latitude and longitude values into a single Series:

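# select the two columns with a list (double brackets); a bare tuple of names fails in recent pandas
df = df.groupby('Id')[['Latitude', 'Longitude']].apply(lambda x: pd.Series(x.values.ravel()))
df.columns = ['StartLat', 'StartLong', 'DestLat', 'DestLong']
df = df.reset_index()
print(df)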
    Id  StartLat  StartLong  DestLat  DestLong
0  101     53.51     118.12    33.32    333.11
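
For anyone who wants to check this end to end, here is a minimal self-contained sketch that rebuilds the question's two sample rows and applies the same groupby/apply step. It assumes a recent pandas release, where selecting several columns after groupby requires a list (double brackets), and it assumes the rows for each Id are already ordered start first, destination second. The name `out` is only illustrative.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    'Id': [101, 101],
    'StartTime': ['14:42:28', '22:10:01'],
    'StopTime': ['14:47:56', '22:12:49'],
    'Latitude': [53.51, 33.32],
    'Longitude': [118.12, 333.11],
})

# Flatten each group's (Latitude, Longitude) rows into one row:
# [lat of row 1, long of row 1, lat of row 2, long of row 2]
out = (
    df.groupby('Id')[['Latitude', 'Longitude']]
      .apply(lambda x: pd.Series(x.values.ravel()))
)
out.columns = ['StartLat', 'StartLong', 'DestLat', 'DestLong']
out = out.reset_index()
print(out)
```

Note that renaming to four columns only works if every Id has exactly two records, as in the question.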

If you get this error:

TypeError: Series.name must be a hashable type

try changing the Series to a DataFrame. You then need unstack to move the values into columns and droplevel to remove the extra column level:

# parentheses are required so the chained calls can span several lines
df = (df.groupby('Id')[['Latitude', 'Longitude']]
        .apply(lambda x: pd.DataFrame(x.values.ravel()))
        .unstack())
df.columns = df.columns.droplevel(0)
df.columns = ['StartLat', 'StartLong', 'DestLat', 'DestLong']
df = df.reset_index()
print(df)
    Id  StartLat  StartLong  DestLat  DestLong
0  101     53.51     118.12    33.32    333.11
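
If apply keeps causing trouble, one alternative (not the approach above, just a sketch) is named aggregation with 'first' and 'last', which pandas supports from version 0.25 onward. It assumes that the first row per Id is the start and the last row is the destination, which matches the question's two-record layout. The sample frame is rebuilt from the question and `out` is an illustrative name.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    'Id': [101, 101],
    'StartTime': ['14:42:28', '22:10:01'],
    'StopTime': ['14:47:56', '22:12:49'],
    'Latitude': [53.51, 33.32],
    'Longitude': [118.12, 333.11],
})

# Named aggregation: each output column is (source column, aggregation).
# 'first' and 'last' rely on the existing row order within each Id.
out = (
    df.groupby('Id')
      .agg(
          StartLat=('Latitude', 'first'),
          StartLong=('Longitude', 'first'),
          DestLat=('Latitude', 'last'),
          DestLong=('Longitude', 'last'),
      )
      .reset_index()
)
print(out)
```

This names the output columns directly, so no separate rename or droplevel step is needed. If the rows are not already in time order, sorting by Id and StartTime first would be a reasonable precaution.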
