user3079850
user3079850

Reputation: 21

Python Pandas: Create lists from a Dataframe that shows the indices of values that match criteria

Apologies, I'm new to python Pandas and therefore it might be possible that I'm not thinking in the correct way.

The following dataframe shows the distances calculated between every value of 2 lists. Let's suppose the two lists contain 100 values, then the Dataframe consists out of 100*100 distances calculated.

The Dataframe looks as following:

       Columns    Place A      ...               Place Z
   Rows                        ...                         
    Place 1       1203.899830  ...               612.069576
    39             496.571302  ...               173.808444
    46            1061.913143  ...               488.477587
    48             978.166934  ...               412.830790
    57            1008.997612  ...               595.226911
    ...                   ...  ...                      ...
    20675         1705.383726  ...              1233.132858
    20676         1562.641365  ...               962.671417
    20678          433.199056  ...               451.470780
    20679          112.586538  ...               720.512590
    Place 20680    53.794611  ...               681.927796
    
    [4068 rows x 179 columns]

I would like to get every place (1 till 20680) that is less than 100 km listed under Place A till Z. So the 100 km filter has to be applied individually for every column.

For example, Place 1 is 1200 km from Place A so it shouldn't appear in the 'Place 1' column. Place 39 is located just 30 km of Place X, so Place 39 should appear in the 'Place X' column.

In other words, the indices of the values < 100 km have to be listed in the different columns. I understand that the columns will become of different lengths in this way.

Till so far, I got it done for just one column but I would like to achieve it for every column:

Input:  Index_label = df_dist_matrix[df_dist_matrix['Place A']< 100].index.tolist()

print(Index_label)

Output:
[175, 252, 379, 398, 455, 486, 523,....., Place 20680]

Some help would be greatly appreciated, thanks!

Upvotes: 0

Views: 106

Answers (1)

Rob Raymond
Rob Raymond

Reputation: 31156

I took 200km as your sample data is not so representative... You then have a dictionary for each column and its corresponding index values.

df = pd.read_csv(io.StringIO("""       Columns    Place A      ...               Place Z
   Rows                        ...                         
    Place 1       1203.899830  ...               612.069576
    39             496.571302  ...               173.808444
    46            1061.913143  ...               488.477587
    48             978.166934  ...               412.830790
    57            1008.997612  ...               595.226911
    ...                   ...  ...                      ...
    20675         1705.383726  ...              1233.132858
    20676         1562.641365  ...               962.671417
    20678          433.199056  ...               451.470780
    20679          112.586538  ...               720.512590
    Place 20680    53.794611  ...               681.927796"""), sep="\s\s+", engine="python")\
    .drop(columns="...").drop([0,6])

df = df.astype({c:"float64" for c in df.columns if "Place" in c})
{c:df[df[c].lt(200)].index.tolist() for c in df.columns if "Place" in c}

output

{'Place A': [10, 11], 'Place Z': [2]}

Upvotes: 1

Related Questions