Liza

Reputation: 971

How to loop over a pandas DataFrame?

I have a Python function that works on a sequence of coordinates (trajectory data). It requires the data to be in the following format.

#items = [Item(x1, y1), Item(x2, y2), Item(x3, y3), Item(x4, y4)]
items = [Item(0.5, 0.5), Item(-0.5, 0.5), Item(-0.5, -0.5), Item(0.5, -0.5)]

I also need to find xmin, ymin, xmax, and ymax from the above items and pass them as the bounding box, as below.

 spindex = pyqtree.Index(bbox=[-1, -1, 1, 1])
                        #bbox = [xmin,ymin,xmax,ymax]
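A minimal sketch of how the bounding box could be derived from the items instead of being hard-coded. The `Item` class here is a hypothetical stand-in that only stores `x` and `y` (the question does not show the real class), and `bounding_box` is an illustrative helper name.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical stand-in: the question does not show the real Item class.
    x: float
    y: float


def bounding_box(items):
    """Return [xmin, ymin, xmax, ymax] covering all items."""
    xs = [item.x for item in items]
    ys = [item.y for item in items]
    return [min(xs), min(ys), max(xs), max(ys)]


items = [Item(0.5, 0.5), Item(-0.5, 0.5), Item(-0.5, -0.5), Item(0.5, -0.5)]
bbox = bounding_box(items)
print(bbox)  # [-0.5, -0.5, 0.5, 0.5]
```

This gives the tightest box around the points; the `[-1, -1, 1, 1]` used above is a looser box that also contains all of them.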

Now, the items are inserted as below.

 #Inserting items
 for item in items:
     spindex.insert(item, item.bbox)

So far, all of the above operations are performed on a single sequence of coordinates, specified in items. I need to perform the same steps on a data frame with multiple trajectories, each consisting of a sequence of points and identified by an id, vid.

The sample df is as follows:

   vid       x         y
0  1         2         3
1  1         3         4
2  1         5         6
3  2         7         8 
4  2         9        10
5  3         11       12
6  3         13       14
7  3         15       16
8  3         17       18

In the above data frame, x and y are the coordinates, and all the points with the same "vid" form one trajectory. So rows 0-2, which have voyage id (vid) = 1, are one trajectory, the points with vid = 2 are another trajectory, and so on.

The above data can also be transformed into the following df (only if required):

    vid        (x,y)
0   1          [ (2,3),(3,4), (5,6) ]
1   2          [ (7,8),(9,10) ]
2   3          [ (11,12),(13,14),(15,16),(17,18) ]
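For reference, one possible way to build that grouped form from the sample df is sketched below. It assumes the column names `vid`, `x`, and `y` from the sample; `points_by_vid` and `grouped` are illustrative names.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "vid": [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "x": [2, 3, 5, 7, 9, 11, 13, 15, 17],
    "y": [3, 4, 6, 8, 10, 12, 14, 16, 18],
})

# Collect each trajectory's points as a list of (x, y) tuples, keyed by vid.
points_by_vid = {
    int(vid): list(zip(group["x"].tolist(), group["y"].tolist()))
    for vid, group in df.groupby("vid")
}

# Optional: the same information as a two-column data frame.
grouped = pd.DataFrame({
    "vid": list(points_by_vid.keys()),
    "(x,y)": list(points_by_vid.values()),
})
print(grouped)
```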

I want to loop over the df, perhaps grouping by vid, collect each trajectory's coordinates as items, find xmin, xmax, ymin, and ymax, and insert them as shown above, for each trajectory in the df.

I have code something like this, but it doesn't work:

for group in df.groupby('vid'):
    bbox = [ group['x'].min(), group['y'].min(), group['x'].max(), group['y'].max() ]
    spindex.insert(group['vid'][0], bbox)

Please Help.

Upvotes: 0

Views: 299

Answers (1)

CK Chen

Reputation: 664

Iterating over a groupby yields (group_key, group_dataframe) tuples, so group in your loop is a tuple, not a DataFrame.
Modify your code as follows:

for g in df.groupby('vid'):
   vid = g[0]
   g_df = g[1]
   bbox = [ g_df['x'].min(), g_df['y'].min(), g_df['x'].max(), g_df['y'].max() ]
   spindex.insert(vid, bbox)
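As an alternative to the explicit min/max calls inside the loop, the per-trajectory boxes could be computed in one step with named aggregation. This is a sketch under assumptions: `RecordingIndex` is a hypothetical stand-in that only mimics the `insert(item, bbox)` call shape used in the question, so the example runs without pyqtree; in real code `spindex` would be the `pyqtree.Index` from the question.

```python
import pandas as pd


class RecordingIndex:
    """Hypothetical stand-in with the insert(item, bbox) shape used in the question."""

    def __init__(self):
        self.entries = []

    def insert(self, item, bbox):
        self.entries.append((item, bbox))


# Sample data from the question.
df = pd.DataFrame({
    "vid": [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "x": [2, 3, 5, 7, 9, 11, 13, 15, 17],
    "y": [3, 4, 6, 8, 10, 12, 14, 16, 18],
})

# One row per vid, columns ordered as [xmin, ymin, xmax, ymax].
boxes = df.groupby("vid").agg(
    xmin=("x", "min"), ymin=("y", "min"),
    xmax=("x", "max"), ymax=("y", "max"),
)

# Extent covering every trajectory; with pyqtree this would be the
# bbox passed to pyqtree.Index(bbox=extent).
extent = [df["x"].min(), df["y"].min(), df["x"].max(), df["y"].max()]

spindex = RecordingIndex()
for vid, row in boxes.iterrows():
    spindex.insert(vid, row.tolist())

print(spindex.entries)
```

Here each trajectory is inserted once under its vid, with the box that encloses all of its points.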

Upvotes: 1
