Reputation: 2016
I have a question.
I have a table like this
TAC | Latitude | Longitude
1 | 50.4 | -1.5
In Pandas, I wanted to say:
For each TAC, give me a zipped list of latitude and longitude (each TAC can have many rows).
I've tried things like the below, but I am doing something wrong! Can you help?
df1['coordinates'] = list(zip(df1.Lat, df1.Long))
new_df = df1.iloc[ : , : ].groupby('TAC').agg(df1['coordinates'])
For reference, DF1 is created as below
df = pd.read_csv('tacs.csv')
df1 = df[['magnet.tac','magnet.latitude', 'magnet.longitude']]
df1.columns = ['TAC','Lat','Long']
Upvotes: 1
Views: 47
Reputation: 862731
First add usecols
parameter for avoid SettingWithCopyWarning
and then use GroupBy.apply
with lambda function:
df = pd.read_csv('tacs.csv', usecols=['magnet.tac','magnet.latitude', 'magnet.longitude'])
df1.columns = ['TAC','Lat','Long']
#sample data
print (df1)
TAC Lat Long
0 1 50.4 -1.5
1 1 50.1 -1.4
2 2 50.2 -1.8
3 2 50.9 -1.3
new_df = df1.groupby('TAC').apply(lambda x: list(zip(x.Lat, x.Long))).reset_index(name='coord')
print (new_df)
TAC coord
0 1 [(50.4, -1.5), (50.1, -1.4)]
1 2 [(50.2, -1.8), (50.9, -1.3)]
Your solution should be changed:
df = pd.read_csv('tacs.csv')
df1 = df[['magnet.tac','magnet.latitude', 'magnet.longitude']].copy()
df1.columns = ['TAC','Lat','Long']
df1['coordinates'] = list(zip(df1.Lat, df1.Long))
new_df = df1.groupby('TAC')['coordinates'].agg(list).reset_index()
Upvotes: 2