Generating an image feature dataset in scikit-learn (CSV file)

Question

I extract two edge features (HOG features and the Sobel operator) from a single image.

How can I create an image feature dataset in scikit-learn (Python), like iris_dataset? The library ships CSV files that represent datasets, and each CSV file contains only numbers. How were these numbers generated? By feature extraction?

Unfortunately, I have only found a Java tutorial, here: http://www.coccidia.icb.usp.br/coccimorph/tutorials/Tutorial-2-Creating-.... Its point 5 talks about generating the training matrices (average and covariance matrices). Is there any function in scikit-learn that generates these training arrays?

ogrisel · Accepted Answer

You don't need to wrap your data as a CSV file to load it as a dataset. scikit-learn models have a fit method that expects:

a first argument that is a regular numpy array (or a scipy.sparse matrix) with shape (n_samples, n_features), most often with dtype=numpy.float64, that encodes the feature vector of each sample in the training set,
and, for supervised classification models, a second argument with shape (n_samples,) and dtype=numpy.int32 that encodes the class label of each sample in the training set as an integer value.
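As a rough sketch of what that looks like in practice, the snippet below builds X and y from a list of images and passes them to fit. The extract_features helper is hypothetical: it uses scipy.ndimage.sobel plus a magnitude-weighted orientation histogram as a crude stand-in for real HOG and Sobel features, and the random images and alternating labels are placeholder data only. Swap in your own extraction code and real labels.

```python
import numpy as np
from scipy import ndimage
from sklearn.svm import SVC


def extract_features(image, n_bins=8):
    """Hypothetical helper: return one fixed-length feature vector per image."""
    image = image.astype(np.float64)
    gx = ndimage.sobel(image, axis=1)  # horizontal Sobel gradient
    gy = ndimage.sobel(image, axis=0)  # vertical Sobel gradient
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx)   # angles in [-pi, pi]
    # Magnitude-weighted orientation histogram: a simplified stand-in for HOG.
    hist, _ = np.histogram(orientation, bins=n_bins,
                           range=(-np.pi, np.pi), weights=magnitude)
    hist = hist / (hist.sum() + 1e-12)
    return np.concatenate([hist, [magnitude.mean(), magnitude.std()]])


# Placeholder data: 20 random 32x32 grayscale "images" with alternating labels.
rng = np.random.default_rng(0)
images = rng.random((20, 32, 32))
labels = [i % 2 for i in range(20)]

# One row per image, one column per feature: shape (n_samples, n_features).
X = np.vstack([extract_features(img) for img in images])
# One integer class label per image: shape (n_samples,).
y = np.asarray(labels, dtype=np.int32)

clf = SVC()
clf.fit(X, y)
```

The only requirement is that every image yields a feature vector of the same length, so that the rows stack into a rectangular array.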

If you don't know the basic numpy data structures and what shape and dtype mean, I strongly advise you to have a look at a tutorial such as the SciPy Lecture Notes.

Edit: If you really need to read or write numerical CSV files from or to numpy arrays, you can use numpy.loadtxt and numpy.savetxt.
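A minimal round-trip sketch, assuming you want an iris-like layout with the feature columns first and the integer label in the last column. The file name features.csv, the temporary directory, and the small arrays are illustrative only.

```python
import os
import tempfile

import numpy as np

# Illustrative feature matrix (3 samples, 3 features) and integer labels.
X = np.array([[0.1, 0.5, 2.0],
              [0.3, 0.1, 1.5],
              [0.9, 0.7, 0.2]])
y = np.array([0, 0, 1], dtype=np.int32)

# Put the label in the last column and write everything as numeric CSV.
table = np.column_stack([X, y])
path = os.path.join(tempfile.mkdtemp(), "features.csv")
np.savetxt(path, table, delimiter=",")

# Read the file back and split it into features and labels again.
loaded = np.loadtxt(path, delimiter=",")
X_loaded = loaded[:, :-1]
y_loaded = loaded[:, -1].astype(np.int32)
```

Note that loadtxt returns a single float array, so the label column has to be cast back to an integer dtype after loading.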
