Reputation: 23
Firstly, my apologies if this question is too simple / obvious.
My question is:
I am using nested loops to check whether certain images are listed in a dataframe ('old_df'). If they are present, I add them to an empty list ('new_list').
Is there a faster or more performant way to do this?
import os

images = []
for root, dirs, files in os.walk('/gdrive/MyDrive/CNN_Tute/data/images/'):
    for file in files:
        images.append(file)

new_list = []
for i in range(len(images)):
    for j in range(len(old_df)):
        if images[i] == old_df.iloc[j, 0]:
            new_list.append(old_df.iloc[j, :])
Upvotes: 1
Views: 60
Reputation: 863501
If you want to test the first column by position:
images = [file for root, dirs, files in os.walk('/gdrive/MyDrive/CNN_Tute/data/images/')
          for file in files]
new_list = old_df.iloc[old_df.iloc[:, 0].isin(images).to_numpy(), 0].tolist()
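To make the difference from the nested loops concrete, here is a minimal sketch on toy data. The column names `filename` and `label` and the file names are hypothetical stand-ins, not taken from the question. It shows that the same `isin` mask can return either the matching values of the first column or the complete matching rows, which is closer to what the original loop appended.

```python
import pandas as pd

# Hypothetical stand-in for old_df: file names in the first column.
old_df = pd.DataFrame({
    "filename": ["a.png", "b.png", "c.png", "d.png"],
    "label": [0, 1, 0, 1],
})

# Hypothetical stand-in for the names collected with os.walk.
# A set is used here because membership tests on it are cheap.
images = {"b.png", "d.png", "z.png"}

# One vectorized membership test replaces both Python loops.
mask = old_df.iloc[:, 0].isin(images)

# Whole matching rows (what the original loop collected, as a DataFrame).
matched_rows = old_df[mask]

# Only the matching values of the first column, as a plain list.
matched_names = old_df.loc[mask, "filename"].tolist()

print(matched_rows)
print(matched_names)
```

Note that the result follows the row order of the dataframe, not the order of `images`, and that a file name appearing several times in `images` is matched only once per dataframe row.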
Upvotes: 2
Reputation: 1669
You can achieve this in two lines:
images = [file for _, _, files in os.walk('/gdrive/MyDrive/CNN_Tute/data/images/') for file in files]
new_labels_df = old_df[old_df.iloc[:, 0].isin(images)]
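One detail worth checking when filtering this way is the shape of the boolean mask. The sketch below uses a hypothetical toy dataframe with made-up column names. It suggests that a mask built from a single column (a Series) drops the non-matching rows, while a mask built from a one-column DataFrame keeps every row and fills the non-matching cells with NaN instead.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "filename": ["a.png", "b.png", "c.png"],
    "label": [0, 1, 0],
})
images = ["a.png", "c.png"]

# Series mask: selects rows, so only matching rows remain.
filtered = df[df["filename"].isin(images)]

# One-column DataFrame mask: applied cell by cell, so the shape is kept
# and cells not covered by a True value become NaN.
masked = df[df[["filename"]].isin(images)]

print(filtered)
print(masked)
```

For row filtering, the Series form is usually the one you want.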
Upvotes: 0