Format & Download

Original MNIST dataset

Test data

For active learning, the test data consists of the full training set of the original dataset (the pool from which the samples used for learning are selected), 10 different initial sets of labeled samples, and a test set. This test set is provided for validation purposes only. The test set used for the final evaluation of methods will be hidden from participants.
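
The active learning setup above (a pool, an initial labeled set and a validation set) can be sketched as a pool-based loop. This is only an illustration under assumptions that are not part of the challenge description: the classifier (nearest centroid), the query rule (smallest margin between the two closest centroids), the labeling budget and the synthetic 50-dimensional data are all placeholders, and the initial set is assumed to contain at least one sample of every class.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Class means; assumes every class has at least one labeled sample."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def distances(X, centroids):
    """Pairwise Euclidean distances, shape (n_samples, n_classes)."""
    return np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)

def active_learning(X_pool, y_pool, initial_idx, budget, n_classes):
    """Query `budget` pool samples by smallest margin (hypothetical rule)."""
    labeled = list(initial_idx)
    for _ in range(budget):
        centroids = fit_centroids(X_pool[labeled], y_pool[labeled], n_classes)
        d = np.sort(distances(X_pool, centroids), axis=1)
        margin = d[:, 1] - d[:, 0]   # small margin = uncertain sample
        margin[labeled] = np.inf     # never re-query a labeled sample
        # Reading y_pool for the chosen index plays the role of the oracle.
        labeled.append(int(np.argmin(margin)))
    return labeled, fit_centroids(X_pool[labeled], y_pool[labeled], n_classes)

# Synthetic stand-in for the pool and the validation set (not the real files).
rng = np.random.default_rng(0)
centers = rng.normal(size=(10, 50)) * 3
y_pool = rng.integers(0, 10, size=500)
X_pool = centers[y_pool] + rng.normal(size=(500, 50))
y_val = rng.integers(0, 10, size=200)
X_val = centers[y_val] + rng.normal(size=(200, 50))

# One labeled sample per class as a stand-in for one initial set.
initial_idx = [int(np.flatnonzero(y_pool == c)[0]) for c in range(10)]

labeled, centroids = active_learning(X_pool, y_pool, initial_idx,
                                     budget=20, n_classes=10)
accuracy = (np.argmin(distances(X_val, centroids), axis=1) == y_val).mean()
print(len(labeled), "labels used, validation accuracy:", accuracy)
```

With the real data, the same loop would be repeated once for each of the 10 initial sets and the validation results averaged.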

For online learning, the test data consists of a set of 10,000 samples whose labels have to be predicted sequentially.
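
Sequential prediction can be sketched as a predict-then-update loop. The sketch below assumes that the true label of each sample becomes available after its prediction has been made, which the description above does not state explicitly. The multiclass perceptron and the synthetic data are placeholders for any online learner and for the real files.

```python
import numpy as np

class OnlinePerceptron:
    """Multiclass perceptron, used here as a stand-in for any online learner."""

    def __init__(self, n_features, n_classes):
        self.W = np.zeros((n_classes, n_features))

    def predict(self, x):
        return int(np.argmax(self.W @ x))

    def update(self, x, y):
        y_hat = self.predict(x)
        if y_hat != y:
            # Move the true class towards x and the wrong guess away from it.
            self.W[y] += x
            self.W[y_hat] -= x

def run_online(model, X, y):
    """Predict each sample before its label is used; return all predictions."""
    predictions = []
    for x_t, y_t in zip(X, y):
        predictions.append(model.predict(x_t))  # commit to a prediction first
        model.update(x_t, y_t)                  # then learn from the true label
    return np.array(predictions)

# Synthetic 50-dimensional, 10-class data (not the challenge data).
rng = np.random.default_rng(0)
centers = rng.normal(size=(10, 50)) * 3
y = rng.integers(0, 10, size=1000)
X = centers[y] + rng.normal(size=(1000, 50))

preds = run_online(OnlinePerceptron(50, 10), X, y)
print("mistakes:", int((preds != y).sum()))
```

The number of mistakes accumulated over the sequence is a natural score for this protocol, since every prediction is made on a sample the model has not yet seen.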

 

Feature extraction

Images have been processed to obtain a common set of features. The feature extraction method applies PCA to the original images, yielding 50-dimensional feature vectors. Participants must use this common set of features, which guarantees a fair comparison between methods and keeps the focus on active and online learning.
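
For readers unfamiliar with this step, the projection can be sketched as follows. This is not the organizers' extraction code: the text does not say which samples the PCA was fitted on or whether any scaling was applied, so the sketch simply centers the data and keeps the top 50 principal directions. Random arrays of the MNIST image size (28 x 28 = 784 pixels) stand in for the real images.

```python
import numpy as np

def pca_fit(X, n_components=50):
    """Return the mean and the top principal directions of X (rows = samples)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project samples onto the principal directions."""
    return (X - mean) @ components.T

# Stand-in data: random values, used only so the sketch runs without the files.
rng = np.random.default_rng(0)
images = rng.random((200, 784))

mean, components = pca_fit(images, n_components=50)
features = pca_transform(images, mean, components)
print(features.shape)  # (200, 50)
```

New samples must be projected with the same `mean` and `components` as the training data, otherwise their features are not comparable.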

Download

Training data for active learning: matlab, text file

Test set for validation of active learning: matlab, text file

Initial sets of labeled data: matlab, text file

Test data for online learning: matlab, text file