Damir Jajetic

Damir Jajetic is a software engineer from Croatia.

He often participates in competitions (see his Kaggle ratings), including three other ChaLearn competitions (Cause Effect Pairs, Connectomics, and Higgs Boson). Naturally, he is interested in AutoML!

Here is how he describes his contribution to the AutoML challenge:

GPU:

A GPU neural-network model built on the Lasagne and Theano libraries. Very simple, self-explanatory source code is available at https://github.com/djajetic/GPU_djajetic.
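For readers unfamiliar with these libraries, here is a minimal sketch of how a classifier is typically assembled with Lasagne on top of Theano. It is not the network from the repository above; the layer sizes, learning rate, and momentum are placeholders.

```python
import theano
import theano.tensor as T
import lasagne

# Symbolic inputs
X = T.matrix('X')
y = T.ivector('y')

# A small fully connected network; sizes are illustrative only
l_in = lasagne.layers.InputLayer(shape=(None, 20), input_var=X)
l_hid = lasagne.layers.DenseLayer(l_in, num_units=64,
                                  nonlinearity=lasagne.nonlinearities.rectify)
l_out = lasagne.layers.DenseLayer(l_hid, num_units=2,
                                  nonlinearity=lasagne.nonlinearities.softmax)

# Cross-entropy loss and Nesterov momentum updates
prediction = lasagne.layers.get_output(l_out)
loss = lasagne.objectives.categorical_crossentropy(prediction, y).mean()
params = lasagne.layers.get_all_params(l_out, trainable=True)
updates = lasagne.updates.nesterov_momentum(loss, params,
                                            learning_rate=0.01, momentum=0.9)

# Compiled training and prediction functions
# (these run on the GPU when Theano is configured for it)
train_fn = theano.function([X, y], loss, updates=updates)
predict_fn = theano.function([X], prediction)
```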

CPU:

The software is simple and based on an ensemble of unsynchronized local-search models (let's call them particles for easy reading, although this is not PSO). There is no communication between the particles and no attempt to exploit any properties that swarm intelligence could provide, so the approach is consequently insensitive to the false beliefs a swarm can develop in non-convex search spaces.

The search space and ensembling properties of each individual particle are defined in a separate Python script, and in this form they can be defined dynamically. Once created, a particle is unaware of any outside information except two stop signals; it simply yields the best results it can on the dataset and reports its precision on a training subset.
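As a rough illustration (not the actual implementation), a single particle can be pictured as an independent local-search loop over its own hyperparameter space; the `evaluate`, `search_space`, `stop_signal`, and `report` interfaces below are hypothetical, and only one stop signal is shown for brevity.

```python
import random

def particle(evaluate, search_space, stop_signal, report):
    """One independent 'particle': local search over a hyperparameter space,
    with no knowledge of any other particle."""
    # Start from a random point in the particle's own search space
    best_params = {k: random.choice(v) for k, v in search_space.items()}
    best_score = evaluate(best_params)           # precision on a training subset
    while not stop_signal.is_set():              # the only outside information
        candidate = dict(best_params)
        key = random.choice(list(search_space))  # perturb one hyperparameter
        candidate[key] = random.choice(search_space[key])
        score = evaluate(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
    # Reported precision is later used by the ensembling module
    report(best_params, best_score)
```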

The ensembling module uses only the N best models from the particles, based on their reported precision, and only the M best models from each model group (groups are defined in the "model definition script" by assigning similar models to the same group). This makes parallelization almost linear, with the constraint that a particle must be able to create its model and produce results on the test datasets by itself in order to be included in the ensemble. Since there is no feasibility heuristic beyond whatever one includes in the model definition script, all unfinished models are (only) wasted computing time.
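A minimal sketch of this selection rule might look as follows; the `models` structure (dicts with 'group', 'precision', and 'predict' entries) and the unweighted averaging are assumptions made here for illustration, not taken from the actual code.

```python
from collections import defaultdict
import numpy as np

def select_models(models, n_best, m_per_group):
    """Keep at most M models per group and the N best overall,
    ranked by the precision each particle reported on its training subset."""
    per_group = defaultdict(list)
    for model in sorted(models, key=lambda m: m['precision'], reverse=True):
        per_group[model['group']].append(model)
    capped = [m for group in per_group.values() for m in group[:m_per_group]]
    capped.sort(key=lambda m: m['precision'], reverse=True)
    return capped[:n_best]

def ensemble_predict(selected, X_test):
    """Combine the selected models' test-set predictions by simple averaging."""
    return np.mean([m['predict'](X_test) for m in selected], axis=0)
```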

Some things would have to change slightly for parallelization with unlimited scalability (e.g., the current implementation works only on a single machine, and the ensembling module would need a pruning submodule, among other things).