General: MNIST - CNN example
1. This deep learning code is based on the convolutional neural network (CNN) code from the Theano tutorial. It helps to first become familiar with Theano and with deep learning in Theano:
The tutorial on CNNs is then essential, since this example uses its code:
2. Code description (you can download the code at the bottom of this page)
There are 4 files:
· cnn.py – defines 3 classes: hidden layer, convolutional layer and the whole CNN
· logistic_sgd.py – auxiliary file, which contains the logistic regression class
· cnn_training_computation.py – defines the training and prediction process. It contains:
- the Theano shared variables (which can be held in GPU memory), storing the datasets and labels
- CNN structural parameters
- training parameters
- training flow (method) and its auxiliary functions
- prediction flow and its auxiliary functions
You can access the CNN through the ‘fit’ and ‘predict’ methods. The ‘fit’ method exports a file with the CNN weights, and ‘predict’ loads that file to make predictions. To use your own location, pass a path as the ‘filename’ argument.
· run.py – the main program that runs our deep learning example. It reads the datasets, performs normalisation, trains the CNN and makes predictions.
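The hand-off between ‘fit’ and ‘predict’ through a weight file can be illustrated with a minimal sketch. The class below is hypothetical and is not the code in cnn_training_computation.py: it replaces the CNN with a trivial nearest-mean classifier and assumes pickle as the storage format, purely to show how a ‘filename’ argument lets training and prediction share saved parameters.

```python
import os
import pickle
import tempfile


class TinyModel:
    """Hypothetical stand-in for the CNN wrapper; only the save/load pattern matters."""

    def fit(self, samples, labels, filename="weights.pkl"):
        # "Training": compute the mean feature vector of each class.
        sums, counts = {}, {}
        for x, y in zip(samples, labels):
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
            counts[y] = counts.get(y, 0) + 1
        weights = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}
        # Export the learned parameters so that predict can reuse them later.
        with open(filename, "wb") as f:
            pickle.dump(weights, f)

    def predict(self, samples, filename="weights.pkl"):
        # Load the parameters that fit exported.
        with open(filename, "rb") as f:
            weights = pickle.load(f)

        def squared_distance(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        # Assign each sample to the class with the closest mean vector.
        return [
            min(weights, key=lambda y: squared_distance(x, weights[y]))
            for x in samples
        ]


# Custom location for the weight file (an illustrative path, not the real default).
weights_path = os.path.join(tempfile.mkdtemp(), "my_weights.pkl")

TinyModel().fit(
    [[0.0, 0.1], [0.9, 1.0], [0.1, 0.0], [1.0, 0.8]],
    [0, 1, 0, 1],
    filename=weights_path,
)
# A fresh object can predict because the parameters live in the file.
predictions = TinyModel().predict([[0.2, 0.1], [0.8, 0.9]], filename=weights_path)
print(predictions)
```

Because the parameters are read back from disk, prediction does not need the object that was trained, only the same ‘filename’.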
The datasets should be placed in the ./data folder, next to these files. They can be obtained from the competition site: https://www.codalab.org/competitions/4061?secret_key=5f73902b-9459-4ef9-b761-1ee00bc5df9c
The dataset is a blurred version of the MNIST dataset.
To run, call:
It will create the prediction files mnist_valid.predict and mnist_test.predict. Zip the two prediction files into a .zip archive and submit it to the CodaLab platform:
REMEMBER -- NO FOLDERS IN THE .zip -- Do not zip a folder. Only these 2 files should be directly in the .zip.
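One way to build such a flat archive is with Python's standard zipfile module, as in the sketch below. The archive name submission.zip and the placeholder file contents are assumptions made for the demo; only the two prediction file names come from the text above.

```python
import os
import tempfile
import zipfile


def zip_predictions(paths, archive_path):
    """Write each file at the top level of the archive, with no folder entries."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            # arcname drops the directory part, so the file sits at the archive root.
            archive.write(path, arcname=os.path.basename(path))


# Demo with placeholder files in a temporary directory.
work_dir = tempfile.mkdtemp()
prediction_files = []
for name in ("mnist_valid.predict", "mnist_test.predict"):
    path = os.path.join(work_dir, name)
    with open(path, "w") as f:
        f.write("0\n")  # placeholder content, not real predictions
    prediction_files.append(path)

archive_path = os.path.join(work_dir, "submission.zip")  # hypothetical archive name
zip_predictions(prediction_files, archive_path)

# Inspect the archive: the names should contain no directory component.
with zipfile.ZipFile(archive_path) as archive:
    archive_names = archive.namelist()
print(archive_names)
```

Passing only the base name as arcname is what keeps the two files directly in the .zip, even when they are stored in another directory on disk.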
3. Experiments: you can easily experiment with some training parameters (learning rate, number of epochs) or the number of units in the CNN layers (nkerns) by setting them in cnn_training_computation.py, line 20.
If you would like to change the CNN architecture (e.g. add a third layer or change the activation functions), you need to do it in cnn.py (this requires understanding the tutorials).
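Before changing nkerns or adding a layer, it can help to check how the feature-map sizes shrink through the network. The sketch below assumes the layout of the Theano LeNet tutorial (28x28 inputs, 5x5 filters, 'valid' convolution, 2x2 max-pooling, nkerns of 20 and 50); the values in cnn.py may differ, and the function name is hypothetical.

```python
def feature_map_sizes(image_size, filter_sizes, pool_size):
    """Return the side length of the square feature map after each conv + pool layer."""
    sizes = []
    side = image_size
    for index, filter_size in enumerate(filter_sizes):
        conv_side = side - filter_size + 1  # 'valid' convolution shrinks the map
        if conv_side < pool_size:
            raise ValueError(
                "layer %d: a %dx%d map is too small for a %dx%d filter and %dx%d pooling"
                % (index, side, side, filter_size, filter_size, pool_size, pool_size)
            )
        side = conv_side // pool_size  # non-overlapping max-pooling
        sizes.append(side)
    return sizes


nkerns = (20, 50)  # assumed values, taken from the Theano LeNet tutorial
sizes = feature_map_sizes(image_size=28, filter_sizes=[5, 5], pool_size=2)
# Number of inputs to the hidden layer that follows the last convolutional layer.
hidden_inputs = nkerns[-1] * sizes[-1] ** 2
print(sizes, hidden_inputs)

# A third 5x5 layer does not fit on the remaining 4x4 maps.
try:
    feature_map_sizes(image_size=28, filter_sizes=[5, 5, 5], pool_size=2)
    third_layer_fits = True
except ValueError as error:
    third_layer_fits = False
    print(error)
```

Under these assumptions the maps shrink from 28 to 12 to 4 pixels per side, so a third convolutional layer would need a smaller filter (or no pooling), and the input size of the hidden layer has to be updated whenever nkerns or the layer count changes.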