Reputation: 15
I have two image folders for skin cancer, benign and malignant. I want to create a CSV file in Python where the first column is the path of each image and the second column is its label. How can I do that?
Paths of the dataset:
'../input/skin-cancer-malignant-vs-benign/train/benign'
'../input/skin-cancer-malignant-vs-benign/train/malignant'
Upvotes: 0
Views: 395
Reputation: 390
Check out the glob module:
benign = glob.glob('{path to benign folder}/*.png')
malignant = glob.glob('{path to malignant folder}/*.png')
The * here is a wildcard: it matches every file in the folder whose name ends in .png, and glob returns the path of each match. Of course, change .png to whatever image format you are using.
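If the folders mix several image formats, one option is to run the same wildcard once per extension and merge the results. This is only a sketch: the helper name `list_images` and the default extension list are my own assumptions, not something from the dataset. Note that matching is case-sensitive on Linux, so '.JPG' would need its own entry.

```python
import glob
import os


def list_images(folder, extensions=(".png", ".jpg", ".jpeg")):
    """Return the sorted paths of files in `folder` ending in one of `extensions`.

    `list_images` and the default extensions are illustrative assumptions.
    """
    paths = []
    for ext in extensions:
        # glob.escape protects any special characters in the folder name itself
        pattern = os.path.join(glob.escape(folder), "*" + ext)
        paths.extend(glob.glob(pattern))
    return sorted(paths)
```

For example, `list_images('../input/skin-cancer-malignant-vs-benign/train/benign')` would return every matching file in the benign folder.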
Then it's just a matter of writing the data:
import glob

benign = glob.glob('../input/skin-cancer-malignant-vs-benign/train/benign/*.png')
malignant = glob.glob('../input/skin-cancer-malignant-vs-benign/train/malignant/*.png')

CSV_FILE_NAME = 'my_file.csv'
with open(CSV_FILE_NAME, 'w') as f:
    for path in benign:
        f.write(path)      # write the path in the first column
        f.write(',')       # separate first and second item by a comma
        f.write('benign')  # write the label in the second column
        f.write('\n')      # start a new line
    for path in malignant:
        f.write(path)
        f.write(',')
        f.write('malignant')
        f.write('\n')
You can definitely write this more succinctly, but this version is a bit more readable.
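For reference, here is one possible shorter version using the standard csv module, which also quotes any path that happens to contain a comma. Treat it as a sketch under assumptions: the function name `write_label_csv`, its parameters, and the header row are my own additions, and it assumes each label is also the name of a subfolder under a common root.

```python
import csv
import glob
import os


def write_label_csv(root, labels, csv_path, pattern="*.png"):
    """Write one (path, label) row per image in root/<label>/; return the row count.

    Hypothetical helper: assumes each label is a subfolder name under `root`.
    """
    rows = []
    for label in labels:
        for path in sorted(glob.glob(os.path.join(root, label, pattern))):
            rows.append((path, label))
    # newline="" lets the csv module control the line endings itself
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "label"])  # header row: an addition, drop it if unwanted
        writer.writerows(rows)
    return len(rows)
```

With the paths from the question, the call would be `write_label_csv('../input/skin-cancer-malignant-vs-benign/train', ['benign', 'malignant'], 'my_file.csv')`.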
Upvotes: 1