Preparing image data for PCA

Question

Hi, I tried to apply PCA to a folder containing many .jpg images. However, I got stuck converting them to a format that scikit-learn's PCA accepts. It seems that PCA takes data as an array. I read articles like "PCA for image data", but they look quite complicated to me. I just want to convert the images to an accepted format and then call pca.fit.

Previously I used os.walk to convert the images to grayscale and resize them (see below). I was wondering if I can use the same approach for PCA as well.

from sklearn.decomposition import PCA
from PIL import Image
import os
import numpy as np

WORK_DIR = 'D:/folder/'  # working folder
source = os.path.join(WORK_DIR, 'train')
target = os.path.join(WORK_DIR, 'gray')
os.makedirs(target, exist_ok=True)  # make sure the output folder exists

for root, dirnames, filenames in os.walk(source):
    for file in filenames:
        image_file = Image.open(os.path.join(root, file))
        # draft() only picks the closest JPEG decoder mode/size, so convert and resize explicitly
        image_file = image_file.convert('L').resize((256, 128))
        image_file.save(os.path.join(target, file))

Any other easier methods will be great too.

Answers (1)
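
One possible approach, offered as a minimal sketch rather than a definitive solution: scikit-learn's PCA expects a 2-D array of shape (n_samples, n_features), so each image can be converted to grayscale, resized to a common size, and flattened into one row. The helper names load_images_as_matrix and fit_pca are hypothetical, and the (256, 128) size and the choice of 10 components are assumptions carried over or picked for illustration.

```python
import os

import numpy as np
from PIL import Image
from sklearn.decomposition import PCA


def load_images_as_matrix(folder, size=(256, 128)):
    """Return an array of shape (n_images, width * height), one flattened image per row."""
    rows = []
    for root, _dirs, filenames in os.walk(folder):
        for name in sorted(filenames):
            if not name.lower().endswith(('.jpg', '.jpeg')):
                continue  # skip anything that is not a JPEG
            with Image.open(os.path.join(root, name)) as img:
                # Grayscale and a fixed (width, height) so every row has the same length
                gray = img.convert('L').resize(size)
            # np.asarray gives shape (height, width); flatten it and scale to [0, 1]
            rows.append(np.asarray(gray, dtype=np.float64).ravel() / 255.0)
    if not rows:
        raise ValueError(f'no JPEG images found under {folder!r}')
    return np.stack(rows)


def fit_pca(X, n_components=10):
    """Fit PCA on X and return the fitted model together with the projected data."""
    # n_components cannot exceed min(n_samples, n_features)
    n_components = min(n_components, *X.shape)
    pca = PCA(n_components=n_components)
    X_reduced = pca.fit_transform(X)
    return pca, X_reduced
```

With the folder layout from the question, this could be used as X = load_images_as_matrix('D:/folder/train') followed by pca, X_reduced = fit_pca(X). Because the loader converts and resizes in memory, the separate step of saving grayscale copies is not required for PCA.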