Reputation: 21
Apologies, I'm new to python Pandas and therefore it might be possible that I'm not thinking in the correct way.
The following dataframe shows the distances calculated between every value of 2 lists. Let's suppose the two lists contain 100 values, then the Dataframe consists out of 100*100 distances calculated.
The Dataframe looks as following:
Columns Place A ... Place Z
Rows ...
Place 1 1203.899830 ... 612.069576
39 496.571302 ... 173.808444
46 1061.913143 ... 488.477587
48 978.166934 ... 412.830790
57 1008.997612 ... 595.226911
... ... ... ...
20675 1705.383726 ... 1233.132858
20676 1562.641365 ... 962.671417
20678 433.199056 ... 451.470780
20679 112.586538 ... 720.512590
Place 20680 53.794611 ... 681.927796
[4068 rows x 179 columns]
I would like to get every place (1 till 20680) that is less than 100 km listed under Place A till Z. So the 100 km filter has to be applied individually for every column.
For example, Place 1 is 1200 km from Place A so it shouldn't appear in the 'Place 1' column. Place 39 is located just 30 km of Place X, so Place 39 should appear in the 'Place X' column.
In other words, the indices of the values < 100 km have to be listed in the different columns. I understand that the columns will become of different lengths in this way.
Till so far, I got it done for just one column but I would like to achieve it for every column:
Input: Index_label = df_dist_matrix[df_dist_matrix['Place A']< 100].index.tolist()
print(Index_label)
Output:
[175, 252, 379, 398, 455, 486, 523,....., Place 20680]
Some help would be greatly appreciated, thanks!
Upvotes: 0
Views: 106
Reputation: 31156
I took 200km as your sample data is not so representative... You then have a dictionary for each column and its corresponding index values.
df = pd.read_csv(io.StringIO(""" Columns Place A ... Place Z
Rows ...
Place 1 1203.899830 ... 612.069576
39 496.571302 ... 173.808444
46 1061.913143 ... 488.477587
48 978.166934 ... 412.830790
57 1008.997612 ... 595.226911
... ... ... ...
20675 1705.383726 ... 1233.132858
20676 1562.641365 ... 962.671417
20678 433.199056 ... 451.470780
20679 112.586538 ... 720.512590
Place 20680 53.794611 ... 681.927796"""), sep="\s\s+", engine="python")\
.drop(columns="...").drop([0,6])
df = df.astype({c:"float64" for c in df.columns if "Place" in c})
{c:df[df[c].lt(200)].index.tolist() for c in df.columns if "Place" in c}
output
{'Place A': [10, 11], 'Place Z': [2]}
Upvotes: 1