MNIST

class MNIST(root=None, train=True, download=True)

MNIST dataset. The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. The database is also widely used for training and testing in the field of machine learning. It was created by “re-mixing” the samples from NIST’s original datasets. Furthermore, the black and white images from NIST were normalized to fit into a 28x28 pixel bounding box and anti-aliased, which introduced grayscale levels. The MNIST database contains 60,000 training images and 10,000 testing images.

The above introduction comes from MNIST database - Wikipedia.

Parameters:
  • root (Optional[str]) – Path where the MNIST dataset is downloaded to or loaded from. If None, it is set to ~/.cache/megengine (the default root path).

  • train (bool) – If True, use the training set; otherwise use the test set.

  • download (bool) – If True, download the dataset from the internet and put it in the root directory. If the dataset has already been downloaded, it is not downloaded again.
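The interaction between root and download described above can be sketched roughly as follows. This is an illustration only, not MegEngine's implementation: the helper names resolve_root, missing_files and should_download are hypothetical, and the file list is simply the standard MNIST archive names, which the real loader may check differently.

```python
import os

DEFAULT_ROOT = "~/.cache/megengine"  # default root path named in the parameter list

# Standard MNIST archive names, assumed here purely for illustration.
MNIST_FILES = (
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
)


def resolve_root(root=None):
    """Return the given root, or the expanded default path when root is None."""
    return os.path.expanduser(DEFAULT_ROOT if root is None else root)


def missing_files(root, filenames=MNIST_FILES):
    """List the expected files that are not present under root."""
    return [f for f in filenames if not os.path.exists(os.path.join(root, f))]


def should_download(root, download=True):
    """Download only when requested and when at least one file is missing."""
    return download and bool(missing_files(root))
```

Under these assumptions, a second call with the same root finds every file present and skips the download.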

Returns:

The MNIST Dataset that can work with DataLoader.

Example

>>> from megengine.data.dataset import MNIST   
>>> mnist = MNIST("/data/datasets/MNIST")  # Set the root path   
>>> image, label = mnist[0]  
>>> image.shape   
(28, 28, 1)
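Since the returned dataset is meant to work with DataLoader, a minimal batching sketch might look like the one below. It assumes MegEngine is installed and that DataLoader and RandomSampler are imported from megengine.data; the wrapper name make_mnist_loader and the batch size of 64 are illustrative choices, not part of the API. The imports are deferred into the function so that defining it does not require MegEngine or trigger a download.

```python
def make_mnist_loader(root, batch_size=64, train=True):
    """Build a shuffled, batched DataLoader over MNIST (hypothetical helper)."""
    # Deferred imports: MegEngine is only needed when the helper is called.
    from megengine.data import DataLoader, RandomSampler
    from megengine.data.dataset import MNIST

    # download defaults to True, so the first call may fetch the files into root.
    dataset = MNIST(root, train=train)
    # RandomSampler shuffles the sample indices and groups them into batches.
    sampler = RandomSampler(dataset, batch_size=batch_size)
    return DataLoader(dataset, sampler=sampler)


# Usage (left as comments because it may download the dataset):
# loader = make_mnist_loader("/data/datasets/MNIST")
# images, labels = next(iter(loader))
```

Transforms such as normalization or a layout change can be passed to DataLoader as well; they are omitted here to keep the sketch short.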

Changed in version 1.11: The original URL has been updated to a mirror URL.

“Please refrain from accessing these files from automated scripts with high frequency. Make copies!” As requested by the original provider of the MNIST dataset, the dataset is now downloaded from the mirror site: https://ossci-datasets.s3.amazonaws.com/mnist/

See also

  • The MNIST dataset is used as an example in the Getting started with MegEngine tutorial.

  • Many machine learning projects that use the MNIST dataset can be found on the internet.