Project Assignment
Post date: Dec 14, 2015 2:38:32 PM
The ultimate goal of the course project is to develop an automatic classifier, based on machine learning techniques, that can distinguish between the 10 types of images in the CIFAR-10 dataset: https://www.cs.toronto.edu/~kriz/cifar.html
The dataset consists of 32x32 color images of different animals and vehicles, distributed as a single compressed file. Every image contains exactly one object in the foreground and is assigned a label, e.g. “cat”, “airplane”, or “frog”.
The data are split into five training batches and one test batch. The proposed classifier may be trained only on the training batches. The test batch will be used exclusively for evaluation in order to ensure a fair comparison.
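The batch layout can be illustrated with a minimal sketch, assuming the "CIFAR-10 python version" archive from the dataset page has been downloaded and unpacked. In that version each batch file is a pickled dictionary, and each image is a row of 3072 values: 1024 red, then 1024 green, then 1024 blue, each channel in row-major order. The helper names load_batch and pixel_at are hypothetical and exist only for this illustration; unpickling the real files also requires NumPy to be installed, because the image data are stored as a NumPy array.

```python
import pickle

IMAGE_SIDE = 32
CHANNEL_SIZE = IMAGE_SIDE * IMAGE_SIDE  # 1024 values per color channel


def load_batch(path):
    """Unpickle one batch file and return (rows, labels).

    Assumes the python version of the dataset, where keys are byte strings
    when the file is read under Python 3 with encoding="bytes".
    """
    with open(path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")
    return batch[b"data"], batch[b"labels"]


def pixel_at(row, y, x):
    """Return the (red, green, blue) values of pixel (y, x) in one 3072-value row."""
    offset = y * IMAGE_SIDE + x
    return (
        row[offset],                      # red channel
        row[CHANNEL_SIZE + offset],       # green channel
        row[2 * CHANNEL_SIZE + offset],   # blue channel
    )
```

In the unpacked archive the training files are named data_batch_1 to data_batch_5 and the evaluation file is named test_batch, so only the first five should ever be passed to training code.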
One possible approach to this problem is based on a Convolutional Neural Network. A gentle introduction to this topic can be found at:
Alternatively, the proposed solution can use a Support Vector Machine, as presented in the following papers
Of course, students are very welcome to search the web on their own and come up with other ideas.
The minimum requirement for completing the assignment and obtaining the grade '3' is to implement a classifier that reaches at least 75% accuracy on the test data.
A proper solution must consist of the following three parts:
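The accuracy criterion can be made concrete with a short sketch. Accuracy is taken here to mean the fraction of test images whose predicted label equals the true label, which is an assumption, since the assignment does not define the metric further. The function names and the constant are hypothetical.

```python
REQUIRED_ACCURACY = 0.75  # minimum stated in the assignment


def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def meets_minimum(predicted, actual):
    """True if the predictions reach the required accuracy."""
    return accuracy(predicted, actual) >= REQUIRED_ACCURACY
```

The CIFAR-10 test batch holds 10,000 images, so under this definition the threshold corresponds to at least 7,500 correctly classified test images.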
a piece of software (in the form of a source code),
test results obtained with the test batch described above,
a short description of the proposed algorithm containing at least 500 words in English (the quality of the English will not be assessed).
A grade higher than '3' can be obtained by students whose solutions meet the minimum requirement and who additionally complete some or all of the following tasks:
Ad hoc improvement of a certain state-of-the-art solution
A student is expected to apply one of the canonical algorithms and suggest a modification of the original approach (e.g. by changing parts of the source code, tuning input parameters, or building an ensemble algorithm) that results in a statistically significant increase in accuracy and/or a decrease in computation time or memory consumption when addressing the classification problem stated above. See: The list of proposed modifications.
Publishing partial results on a blog and sharing a code repository
Partial results and short technical reports on the student's progress must be posted in English on a dedicated blog at least once every two weeks. A student is obliged to create a blog and provide all participants of the course with the corresponding URL. Additionally, an up-to-date version of the source code must be made available through a git repository.
Final report
A student is expected to present a full technical report on the final solution before the end of the semester. The text must be written in English and be technically sound, i.e. in accordance with the rules of preparing a quality scientific paper as presented in
The deadline for submitting complete solutions is 1 February 2016.
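The first task above asks for a "statistically significant" increase in accuracy but does not name a test. One common choice for comparing two classifiers on the same test set is McNemar's test. The sketch below uses its exact (binomial) form and is only an illustration under that assumption; the function names are hypothetical, and math.comb requires Python 3.8 or newer.

```python
from math import comb


def discordant_counts(baseline_correct, modified_correct):
    """Count test images that exactly one of the two classifiers gets right.

    Both arguments are sequences of booleans, one entry per test image.
    """
    if len(baseline_correct) != len(modified_correct):
        raise ValueError("both classifiers must be scored on the same images")
    pairs = list(zip(baseline_correct, modified_correct))
    only_baseline = sum(1 for b, m in pairs if b and not m)
    only_modified = sum(1 for b, m in pairs if m and not b)
    return only_baseline, only_modified


def mcnemar_exact_p(only_baseline, only_modified):
    """Two-sided exact McNemar p-value.

    Under the null hypothesis, each discordant image is equally likely to
    favor either classifier, so the smaller count follows Binomial(n, 0.5).
    """
    n = only_baseline + only_modified
    if n == 0:
        return 1.0  # the classifiers never disagree, so there is no evidence
    k = min(only_baseline, only_modified)
    tail = sum(comb(n, i) for i in range(k + 1))
    # Integer arithmetic avoids floating-point underflow of 0.5 ** n for large n.
    return min(1.0, 2 * tail / 2 ** n)
```

A small p-value (for example below 0.05, a conventional but arbitrary cut-off) would suggest that the difference between the original and the modified classifier is unlikely to be due to chance alone.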
The list of proposed modifications to the solution based on a Convolutional Neural Network:
Training an alternative/smaller neural network that would emulate the target/bigger one as introduced in
Compressing the weights of a neural network
Replacing fully connected layers of a neural network with less complex ones
Experimenting with novel approaches to speeding up the training process
Identifying so-called adversarial examples
http://karpathy.github.io/2015/03/30/breaking-convnets/
Using alternative learning rules
Semi-supervised learning
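As an illustration of what "compression of weights" in the list above could mean in practice, the following minimal sketch applies uniform quantization, storing each weight as a small integer code plus a shared offset and step size. This is only one possible scheme, chosen here as an assumption and not prescribed by the assignment; the function names are hypothetical.

```python
def quantize(weights, bits=8):
    """Map float weights to integer codes in the range [0, 2**bits - 1].

    Returns (codes, lowest_weight, step) so the weights can be approximated later.
    """
    levels = 2 ** bits - 1
    lowest, highest = min(weights), max(weights)
    # Fall back to a step of 1.0 when all weights are equal to avoid dividing by zero.
    step = (highest - lowest) / levels if highest > lowest else 1.0
    codes = [round((w - lowest) / step) for w in weights]
    return codes, lowest, step


def dequantize(codes, lowest, step):
    """Reconstruct approximate float weights from integer codes."""
    return [lowest + c * step for c in codes]
```

With 8 bits, each weight takes one byte instead of the four bytes of a 32-bit float, at the cost of a reconstruction error of at most half a step per weight. Whether the accuracy of the network survives such compression has to be measured on the test batch.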