Reputation: 169
I am using scikit-learn to learn machine learning, following the tutorial "An introduction to machine learning with scikit-learn".
Here the data is loaded into the variable digits. digits.data gives us access to the data, which is an 8 * 8 matrix. My question is: what do the values in digits.data refer to, and why is the maximum value restricted to 16?
My best guess is that it's the grayscale value of each pixel. If so, what is the difference between digits.data and digits.images?
Thanks
Upvotes: 0
Views: 67
Reputation: 320
digits.images holds the raw images, each an 8 x 8 array. digits.data holds the features, which in this case are simply the raw pixel values (as you progress through the tutorial, this will change to more sophisticated features). digits.data is shaped differently, in a way that is more natural for learning: each row corresponds to a single image, flattened to 64 values. Hence if you try:
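import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
digits = load_digits()  # the same dataset the tutorial loads
plt.imshow(digits.images[0], cmap="gray")
plt.show()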
and:
plt.imshow(np.reshape(digits.data[0, :], (8, 8)), cmap="gray")
you will get the same result.
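To check the relationship without plotting, here is a minimal sketch, assuming digits comes from sklearn.datasets.load_digits as in the tutorial. It compares the shapes of the two arrays, confirms that flattening each image reproduces digits.data, and reads off the value range. As for the maximum of 16: according to the dataset description (digits.DESCR), each original 32x32 bitmap was divided into non-overlapping 4x4 blocks and the "on" pixels in each block were counted, so every value is an integer count between 0 and 16 rather than a conventional 0 to 255 gray level.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# images: one 8x8 array per sample
# data: the same pixels, flattened to 64 features per row
print(digits.images.shape)  # (1797, 8, 8)
print(digits.data.shape)    # (1797, 64)

# Flatten every image and compare against data element by element
flattened = digits.images.reshape(len(digits.images), -1)
same = np.array_equal(flattened, digits.data)
print(same)

# Smallest and largest pixel value in the whole dataset
lo, hi = digits.data.min(), digits.data.max()
print(lo, hi)
```

The names flattened, same, lo and hi are only illustrative. The reshape with -1 lets NumPy infer the 64 columns from the 8 x 8 image size.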
Upvotes: 1