Hannes

Reputation: 23

How do I speed up this nested for loop in Python?

The function shown below runs quite slowly, even though I call it with swifter. Does anyone know how to speed it up? My Python knowledge is limited at this point, and I would appreciate any help. I tried using the map() function, but it didn't work for me. I guess the nested for loop is what makes it slow, right?

BR, Hannes

def polyData(uniqueIds):
    for index in range(len(uniqueIds) - 1):
        element = uniqueIds[index]
        polyData1 = df[df['id'] == element]
        poly1 = build_poly(polyData1)
        poly1 = poly1.buffer(0)
        for secondIndex in range(index + 1, len(uniqueIds)):
            otherElement = uniqueIds[secondIndex]
            polyData2 = df[df['id'] == otherElement]
            poly2 = build_poly(polyData2)
            poly2 = poly2.buffer(0)
            # Calculate overlap percentage wise
            overlap_pct = poly1.intersection(poly2).area / poly1.area
            # Form new DF
            df_ol = pd.DataFrame({'id_1': [element], 'id_2': [otherElement], 'overlap_pct': [overlap_pct]})
            # Write to SQL database
            df_ol.to_sql(name='df_overlap', con=e, if_exists='append', index=False)

Upvotes: 2

Views: 78

Answers (1)

user7661619


This function is inherently slow for large inputs because it compares every pair of ids, so the number of comparisons grows quadratically with the number of ids. However, you also rebuild the polygon for the same id many times, even though each one could be built once up front (which may itself be expensive) and stored for later use. So try extracting the polygon construction into its own step:

def getPolyForUniqueId(uid):
    polyData = df[df['id'] == uid]
    poly = build_poly(polyData)
    poly = poly.buffer(0)
    return poly

def polyData(uniqueIds):
    polyDataList = [getPolyForUniqueId(uid) for uid in uniqueIds]
    for index in range(len(uniqueIds) - 1):
        id_1 = uniqueIds[index]
        poly_1 = polyDataList[index]
        for secondIndex in range(index + 1, len(uniqueIds)):
            id_2 = uniqueIds[secondIndex]
            poly_2 = polyDataList[secondIndex]
            ...
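
To make the "build once, then compare pairs" idea concrete, here is a minimal, self-contained sketch. It is not your actual code: pairwise_overlaps, build_interval and interval_overlap_pct are hypothetical names, and 1-D intervals stand in for your polygons so that it runs without your df or build_poly. It also collects the results in a list instead of writing to the database inside the loop, which is an assumption on my part about what you want.

```python
from itertools import combinations


def pairwise_overlaps(unique_ids, build_shape, overlap_pct):
    """Build each shape once, then compare every unordered pair of ids."""
    # One build per id, instead of one build per pair.
    shapes = {uid: build_shape(uid) for uid in unique_ids}
    rows = []
    # combinations() yields each pair (i < j) once, like the two index loops.
    for id_1, id_2 in combinations(unique_ids, 2):
        rows.append({
            "id_1": id_1,
            "id_2": id_2,
            "overlap_pct": overlap_pct(shapes[id_1], shapes[id_2]),
        })
    return rows


# Stand-in data (hypothetical): 1-D intervals instead of polygons.
INTERVALS = {"a": (0.0, 4.0), "b": (2.0, 6.0), "c": (5.0, 9.0)}
build_calls = []  # records how often the "expensive" build step runs


def build_interval(uid):
    build_calls.append(uid)
    return INTERVALS[uid]


def interval_overlap_pct(first, second):
    # Shared length divided by the length of the first interval.
    shared = min(first[1], second[1]) - max(first[0], second[0])
    return max(shared, 0.0) / (first[1] - first[0])


rows = pairwise_overlaps(list(INTERVALS), build_interval, interval_overlap_pct)
```

In your case build_shape would wrap getPolyForUniqueId, and overlap_pct would be the intersection-area ratio from your question. If one database write per pair turns out to be a bottleneck as well, the collected rows could probably be written in a single call with pd.DataFrame(rows).to_sql(...), but I haven't measured that against your setup.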

Upvotes: 1
