AutoML

In this tutorial, you learn how to implement AutoML (automated machine learning) using Trains.

To demonstrate AutoML, we create a parameter search. We automate running an experiment with different hyperparameter values by using another Python script to clone (make copies of) that experiment, set different hyperparameter values in each clone, and enqueue the cloned Tasks to run.
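
As a preview, the pattern the rest of this tutorial walks through can be sketched in a few lines (this is only a sketch; the queue name 'default' is an assumption, so use whichever queue your trains-agent workers listen to):

from trains import Task

# get the base experiment, clone it, override a hyperparameter, and queue the clone
base = Task.get_task(project_name='examples', task_name='Keras AutoML base')
clone = Task.clone(source_task=base, name=base.name + ' 0')
params = clone.get_parameters()
params['batch_size'] = 64
clone.set_parameters(params)
Task.enqueue(clone.id, queue_name='default')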

We use two Python scripts from the examples/automl directory of the trains repository:

  • automl_base_template_keras_simple.py - A simple deep learning MNIST solution using Keras and TensorBoard. This script creates an experiment named Keras AutoML base. This is the experiment that is cloned. Its hyperparameter values will be set by the other Task. Therefore, we refer to it as the "base" Task or "base" experiment.
  • automl_random_search_example.py - A Python script designed to clone Keras AutoML base, set the hyperparameters in the clone, and enqueue the clone to run.

Prerequisites

Step 1. The base Task script

In this tutorial's AutoML example, the script for our base Task (automl_base_template_keras_simple.py) requires a hyperparameter dictionary, and that dictionary must be connected to the Task. Call the Task.connect() method to connect the dictionary to the Task.

args = {'batch_size': 128,
        'epochs': 6,
        'layer_1': 512,
        'layer_2': 512,
        'layer_3': 10,
        'layer_4': 512,
        }
args = task.connect(args)
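
For context, here is a minimal sketch of how the base script sets this up, assuming the Task is initialized with the project and experiment names used in this tutorial:

from trains import Task

# register the experiment "Keras AutoML base" in the "examples" project
task = Task.init(project_name='examples', task_name='Keras AutoML base')

# ... define the args dictionary shown above, then connect it ...
args = task.connect(args)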

Step 2. Run the base Task script

In your local trains repository, in the examples/automl directory, run the script.

python automl_base_template_keras_simple.py

After the script runs, the experiment Keras AutoML base exists in Trains.

Viewing the experiment

After the script runs, these hyperparameters are available in the Trains Web-App, in the experiment details panel, HYPER PARAMETERS tab. For more information, see Experiment Details in the User Interface (Web-App) section.

Step 3. The parameter search script

First, in automl_random_search_example.py, we create a dictionary of hyperparameter sampling functions (the search space) that we will use to draw values for each clone of the experiment Keras AutoML base.

from random import sample

# define the random search space
# (this simple random search can be integrated with 'bayesian-optimization', 'hpbandster', etc.)
space = {
    'batch_size': lambda: sample([64, 96, 128, 160, 192], 1)[0],
    'layer_1': lambda: sample(range(128, 512, 32), 1)[0],
    'layer_2': lambda: sample(range(128, 512, 32), 1)[0],
}

# number of random samples to test from 'space'
total_number_of_experiments = 3
# execution queue for the cloned experiments (this tutorial assumes a queue named 'default')
execution_queue_name = 'default'
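
Each entry in space is a function that takes no arguments, so calling it draws one new random value. For example (the values shown are illustrative; actual results vary per call):

new_batch_size = space['batch_size']()   # e.g. 96
new_layer_1 = space['layer_1']()         # e.g. 288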

Get a reference to the Keras AutoML base experiment (which we added to Trains in the previous step) by calling the Task.get_task() method with the Task name and project name.

template_task = Task.get_task(project_name='examples', task_name='Keras AutoML base')

For each set of hyperparameters, call the Task.clone() method with a reference to Keras AutoML base. This creates a new experiment which is a clone (copy) of Keras AutoML base.

for i in range(total_number_of_experiments):
    # clone the template task into a new write enabled task (where we can change parameters)
    cloned_task = Task.clone(source_task=template_task,
                             name=template_task.name+' {}'.format(i), parent=template_task.id)

Using cloned_task, which is a reference to the cloned experiment in Trains, call the Task.get_parameters() method to get the cloned experiment's hyperparameters as a dictionary, and then call the Task.set_parameters() method to set them to new values.

    # get the original template parameters
    cloned_task_parameters = cloned_task.get_parameters()

    # override with random samples from the search space
    for k in space.keys():
        cloned_task_parameters[k] = space[k]()

    # put back into the new cloned task
    cloned_task.set_parameters(cloned_task_parameters)
    print('Experiment {} set with parameters {}'.format(i, cloned_task_parameters))

Finally, call the Task.enqueue() method to enqueue the cloned Task to run. The execution_queue_name variable defined earlier determines which queue, and therefore which trains-agent workers, will execute the clone.

    Task.enqueue(cloned_task.id, queue_name=execution_queue_name)
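
Note that queued experiments only run once a trains-agent worker is listening on that queue. Assuming trains-agent is installed and configured, a worker serving the default queue can be started with:

trains-agent daemon --queue default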

Step 4. Run the parameter search script

Run the automl_random_search_example.py script.

python automl_random_search_example.py

This creates the cloned experiments, named Keras AutoML base 0, Keras AutoML base 1, and so on (the name argument passed to Task.clone() in the parameter search script). Each clone runs with a different set of hyperparameters drawn from the search space dictionary created in the parameter search script (see Step 3).