Reputation: 933

Are there similar datasets to MNIST?

I am doing research on machine learning and want to test my algorithms on some well-known datasets. Since I am a newbie in this area, I can't find other suitable datasets apart from MNIST. I think MNIST is quite suitable for our research. Does anyone know of datasets similar to MNIST?

P.S. I know of another commonly used handwritten digit dataset, USPS. But I need a dataset with more training examples (more than 10,000, ideally comparable to the number of training examples in MNIST), so USPS is not an option.

Upvotes: 9

Answers (3)

Joy

Reputation: 97

I know this question is old, but I hope my suggestions can still be useful. I was also looking for datasets similar to handwritten MNIST and Fashion MNIST. PyTorch (through torchvision) provides several of them with documentation: KMNIST, QMNIST, USPS, SEMEION, and SVHN, amongst others. Check here for the full list.
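For what it's worth, here is a minimal sketch of how one might shortlist these against the "more than 10,000 training examples" requirement from the question and then load one of them. The helper names `TRAIN_SIZES`, `large_enough`, and `load_train_split` are made up for this example and are not part of torchvision; the sizes are the published training-split sizes, and the loader assumes torchvision is installed and the data can be downloaded.

```python
# Published training-set sizes for the datasets mentioned above.
# SEMEION has no official train/test split, so 1,593 is the whole dataset.
TRAIN_SIZES = {
    "MNIST": 60_000,
    "FashionMNIST": 60_000,
    "KMNIST": 60_000,
    "QMNIST": 60_000,
    "SVHN": 73_257,
    "USPS": 7_291,
    "SEMEION": 1_593,
}


def large_enough(min_train=10_000):
    """Return the names of datasets with at least `min_train` training examples."""
    return sorted(name for name, n in TRAIN_SIZES.items() if n >= min_train)


def load_train_split(name, root="./data"):
    """Download and return the training split of `name` using torchvision.

    torchvision is imported lazily so the shortlist above works without it.
    """
    from torchvision import datasets, transforms

    to_tensor = transforms.ToTensor()
    if name == "SVHN":
        # SVHN selects its split with `split=` rather than `train=`.
        return datasets.SVHN(root, split="train", download=True, transform=to_tensor)
    if name == "SEMEION":
        # SEMEION ships as a single set, so there is no split argument.
        return datasets.SEMEION(root, download=True, transform=to_tensor)
    # MNIST, FashionMNIST, KMNIST, QMNIST and USPS all accept `train=True`.
    dataset_cls = getattr(datasets, name)
    return dataset_cls(root, train=True, download=True, transform=to_tensor)


if __name__ == "__main__":
    print(large_enough())
```

With the default threshold, USPS and SEMEION drop out of the shortlist, which matches the asker's remark about USPS being too small. Something like `load_train_split("KMNIST")` should then return a dataset you can pass to a `DataLoader` the same way you would MNIST.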

Upvotes: 1

aliakbars

Reputation: 61

You can try Fashion MNIST or Kuzushiji MNIST, which have very similar properties to MNIST but are a bit harder to classify. From Fashion MNIST's page:

Seriously, we are talking about replacing MNIST. Here are some good reasons:

MNIST is too easy. Convolutional nets can achieve 99.7% on MNIST. Classic machine learning algorithms can also achieve 97% easily. Check out our side-by-side benchmark for Fashion-MNIST vs. MNIST, and read "Most pairs of MNIST digits can be distinguished pretty well by just one pixel."

MNIST is overused. In this April 2017 Twitter thread, Google Brain research scientist and deep learning expert Ian Goodfellow calls for people to move away from MNIST.

MNIST can not represent modern CV tasks, as noted in this April 2017 Twitter thread, deep learning expert/Keras author François Chollet.

Upvotes: 5

corrin

Reputation: 63

The UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/) contains quite a variety of datasets, including ones that, like MNIST, are suitable for classification, e.g. Skin Segmentation (http://archive.ics.uci.edu/ml/datasets/Skin+Segmentation).

I can't say which of them would be suitable without knowing what you're trying to demonstrate with your algorithm, but anything in the UCI archive is well known.

Upvotes: 5
