Loop over data frame and remove row

Question

I want to detect rectangle collisions in a 2D plane (a picture). I can have many rectangles in my picture.

I stored the rectangle coordinates in a data frame that looks like this:

Each line corresponds to a rectangle. proba is the result of my machine learning model.

I want to loop over each line and check whether a rectangle shares coordinates with another one. If it does, I want to compare the probabilities of both rectangles and delete the one with the lower probability.

I already have a function for collision detection (not sure it's 100% working yet, I'll check it later):

def collision(x1,y1,w1,h1,x2,y2,w2,h2):
    if (x1<= x2+w2 and x1+w1>= x2 and y1 <= y2+h2 and  y1+h1 >= y2):
        return True
    else:
        return False
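
A quick way to check the function is to run it on a few hand-made cases. The following is only a minimal sketch: the test rectangles are made up for illustration, and the function body is the same test as above, written as a single return.

```python
def collision(x1, y1, w1, h1, x2, y2, w2, h2):
    # Same test as above: the rectangles overlap on both the x and the y axis.
    return x1 <= x2 + w2 and x1 + w1 >= x2 and y1 <= y2 + h2 and y1 + h1 >= y2

# Hypothetical test rectangles, passed as x, y, w, h for each rectangle
overlapping = collision(0, 0, 2, 2, 1, 1, 2, 2)    # partial overlap
separate = collision(0, 0, 1, 1, 5, 5, 1, 1)       # far apart
contained = collision(0, 0, 10, 10, 2, 2, 1, 1)    # one inside the other
touching = collision(0, 0, 1, 1, 1, 0, 1, 1)       # share only an edge

print(overlapping, separate, contained, touching)  # True False True True
```

Because the comparisons use <= and >=, rectangles that only touch along an edge count as colliding. If that is not wanted, the comparisons would have to be strict (< and >).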

Now, how do I loop over my data frame and remove the collided rectangle with the lower probability?

Thanks for your help

Edit: for the above example, the pseudo code would look like this:

Collision(line1,line2) > result = True > remove line 1

Now we have only 3 lines

Collision(line1,line2) > result = False > Do Nothing

Collision(line1,line3) > result = True > remove line3

EDIT2: Adding reproducible example

#create data frame
import pandas as pd

fake_data=pd.DataFrame()
fake_data["left"]=(0.04,0.1,0.4,0.3)
fake_data["top"]=(0.31,0.13,0.34,0.28)
fake_data["width"]=(0.82,0.7,0.82,0.84)
fake_data["height"]=(0.57,0.2,0.59,0.55)
fake_data["proba"]=(0.60,0.62,0.34,0.39)

#define function
def collision(x1,y1,w1,h1,x2,y2,w2,h2):
    if (x1<= x2+w2 and x1+w1>= x2 and y1 <= y2+h2 and  y1+h1 >= y2):
        return True
    else:
        return False

#example how to use function

collision(*fake_data.iloc[3, :4], *fake_data.iloc[1, :4])  # compare row 3 with row 1, using left, top, width, height of each

emilaz · Accepted Answer

Your question is not well-defined. Consider the following:

Rectangle 1 collides with Rectangle 2, not with Rectangle 3.

Rectangle 2 collides with Rectangle 3.

Their probabilities are Rectangle 1 > Rectangle 2 > Rectangle 3

Rectangle 2 has a higher probability than Rectangle 3, so a plain pairwise comparison would delete Rectangle 3. But Rectangle 2 itself gets deleted because it collides with Rectangle 1, and once it is gone, Rectangle 3 no longer collides with anything. Should Rectangle 3 still be deleted? By simply iterating over the dataframe, you might end up deleting more rectangles than necessary.
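
To make the scenario concrete, here is a minimal sketch in plain Python. The three rectangles, their probabilities and the helper loses_to are made up for illustration, and the greedy variant resolves ties by order, which is an assumption of this sketch.

```python
def collision(x1, y1, w1, h1, x2, y2, w2, h2):
    return x1 <= x2 + w2 and x1 + w1 >= x2 and y1 <= y2 + h2 and y1 + h1 >= y2

# Hypothetical rectangles: name -> (left, top, width, height, proba)
rects = {
    "R1": (0.0, 0.0, 1.0, 1.0, 0.9),
    "R2": (0.5, 0.0, 1.0, 1.0, 0.6),  # overlaps R1 and R3
    "R3": (1.2, 0.0, 1.0, 1.0, 0.3),  # overlaps R2 only
}

def loses_to(a, b):
    """True if rectangle a collides with b and has the lower probability."""
    return collision(*a[:4], *b[:4]) and a[4] < b[4]

# Strategy 1: in a single pass, drop every rectangle that loses to any other one.
one_pass = [name for name, r in rects.items()
            if not any(loses_to(r, other) for other in rects.values())]

# Strategy 2: go from highest to lowest probability and keep a rectangle only
# if it does not collide with a rectangle that has already been kept.
greedy = []
for name in sorted(rects, key=lambda n: rects[n][4], reverse=True):
    if not any(collision(*rects[name][:4], *rects[k][:4]) for k in greedy):
        greedy.append(name)

print(one_pass)  # ['R1']
print(greedy)    # ['R1', 'R3']
```

The single pass removes R3 because of R2, even though R2 does not survive either. The greedy pass keeps R3.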

If you don't care about that fact, the script below will obtain your desired results quickly:

def func(coordinates):  # function to find, for a given row, all colliding rows with a lower probability
    collisions = fake_data.apply(lambda row: collision(*row[:-1], *coordinates[:-1]), axis=1)  # boolean Series: True where a row collides with the given row
    lower_prob = fake_data['proba'] < coordinates['proba']  # boolean Series: True where a row has a lower probability
    return list(fake_data.index[collisions & lower_prob])  # index labels (not positions) of the rows to delete, so they can be passed to drop

to_delete = fake_data.apply(lambda row: func(row), axis=1).values  # apply said function to your dataframe so all rows are compared to all other rows
to_delete_unique = set([y for z in to_delete for y in z])  # unique index labels to delete
fake_data.drop(to_delete_unique)  # returns a new dataframe, fake_data itself is left unchanged
>>> left    top   width   height  proba
    0.1    0.13    0.7     0.2    0.62

If you do care, then you will need to delete iteratively, starting with the rectangle with highest probability:

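fake_data = fake_data.sort_values('proba', ascending=False)  # we want to start with higher probabilities first, as they are guaranteed to stay.

idx = 0
while idx < len(fake_data):  # the dataframe shrinks while we loop, so check its length on every pass
    curr_row = fake_data.iloc[idx]  # next surviving rectangle, in order of decreasing probability
    hits = fake_data.apply(lambda row: collision(*row[:-1], *curr_row[:-1]), axis=1)  # rows that collide with the current one
    to_delete = fake_data.index[hits & (fake_data['proba'] < curr_row['proba'])]  # of those, the index labels with a lower probability
    fake_data.drop(to_delete, inplace=True)  # drop them from your dataframe
    idx += 1

fake_data
>>> left    top   width   height  proba
    0.1    0.13    0.7     0.2    0.62
    0.4    0.34    0.82    0.59   0.34

Note that in your example the two approaches give different results: the single pass removes row 2 because it collides with row 0, while the iterative version keeps it, since row 0 has already been removed by the time row 2 is checked. The iterative result matches the pseudo code in your edit.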
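
To see which rectangles in the example actually overlap, the pairs can be listed directly. This is only a sketch in plain Python: the rows are copied by hand from the reproducible example in the question, and the names rows and colliding_pairs are made up for illustration.

```python
from itertools import combinations

def collision(x1, y1, w1, h1, x2, y2, w2, h2):
    return x1 <= x2 + w2 and x1 + w1 >= x2 and y1 <= y2 + h2 and y1 + h1 >= y2

# Rows of the example data frame: index -> (left, top, width, height, proba)
rows = {
    0: (0.04, 0.31, 0.82, 0.57, 0.60),
    1: (0.10, 0.13, 0.70, 0.20, 0.62),
    2: (0.40, 0.34, 0.82, 0.59, 0.34),
    3: (0.30, 0.28, 0.84, 0.55, 0.39),
}

# Every unordered pair of row indices whose rectangles collide
colliding_pairs = [(i, j) for i, j in combinations(rows, 2)
                   if collision(*rows[i][:4], *rows[j][:4])]

print(colliding_pairs)  # [(0, 1), (0, 2), (0, 3), (1, 3), (2, 3)]
```

Rows 1 and 2 are the only pair that does not overlap, which is consistent with the pseudo code in the question, where those two rows are the ones left at the end.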
