Tuning Experiments

In this tutorial, you learn how to tune an experiment. We tune the experiment created by the tensorflow_mnist.py example script.


Prerequisites

This tutorial assumes that the trains Python package (which provides the example scripts) and trains-agent are installed and configured, and that you have access to the Trains Web-App (UI).

Step 1. Run the experiment

In the examples/frameworks/tensorflow directory, run the experiment script:

python tensorflow_mnist.py
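
When it runs, the script registers itself as an experiment in the examples project through the Trains SDK. A minimal sketch of that pattern (the exact call in tensorflow_mnist.py may differ slightly):

    from trains import Task

    # Register this run as an experiment; the project and task names are
    # assumed to match the ones that appear later in the Web-App (UI).
    task = Task.init(project_name='examples',
                     task_name='Tensorflow v2 mnist with summaries')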

Step 2. Clone the experiment

Clone the experiment to create an editable copy that we can tune. Do the following:

  1. In the Trains Web-App (UI), on the Projects page, click the examples project card.
  2. In the experiments table, right click the experiment Tensorflow v2 mnist with summaries.
  3. In the context menu, click Clone > CLONE. The newly cloned experiment appears and its info panel slides open.
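
Cloning can also be done with the Trains SDK instead of the Web-App (UI). A minimal sketch, assuming the experiment names shown above:

    from trains import Task

    # Find the original experiment created in Step 1
    # (names are assumed to match the ones shown in the Web-App).
    original = Task.get_task(project_name='examples',
                             task_name='Tensorflow v2 mnist with summaries')

    # Create an editable (draft) copy of the experiment.
    cloned = Task.clone(source_task=original,
                        name='Clone Of Tensorflow v2 mnist with summaries')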

Step 3. Tune the cloned experiment

To demonstrate tuning, change two hyperparameter values. Do the following:

  1. In the experiment info panel, click the CONFIGURATIONS tab.
  2. In PARAMETERS > General, hover and then click EDIT.
  3. Change the value of dropout from 0.9 to 0.7.
  4. Change the value of learning_rate from 0.001 to 0.05.
  5. Click SAVE.
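
The same hyperparameter changes can be made from code, continuing with the cloned task object from the sketch in Step 2. A minimal sketch, assuming the flat parameter names (dropout, learning_rate) used by this example:

    # Override the two hyperparameters on the draft copy; the parameter
    # names are an assumption based on the values shown under PARAMETERS.
    cloned.set_parameters({'dropout': 0.7, 'learning_rate': 0.05})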

Step 4. Run a worker daemon listening to a queue

To execute the cloned experiment that we tuned, we need a worker daemon running and listening to the default queue.

For more information about workers, worker daemons, and queues, see Workers and Queues on the "Concepts and Architecture" page, and the Trains Agent documentation.

Run the worker daemon on your local development machine by doing the following:

  1. Open a terminal session.

  2. Run the following trains-agent command, which starts a worker daemon listening to the default queue.

    trains-agent daemon --queue default
    

    The response to this command is information about your configuration, the worker, and the queue. For example:

    Current configuration (trains_agent v0.13.0, location: /home/<username>/trains.conf):
    ----------------------
    agent.worker_id =
    agent.worker_name = LAPTOP-PPTKKPGK
    agent.python_binary =
    agent.package_manager.type = pip
    .
    .
    .
    sdk.development.worker.report_period_sec = 2
    sdk.development.worker.ping_period_sec = 30
    sdk.development.worker.log_stdout = true
    
    Worker "LAPTOP-PPTKKPGK:0" - Listening to queues:
    + ---------------------------------+---------+-------+
    | id                               | name    | tags  |
    + ---------------------------------+---------+-------+
    | 2a03daf5ff9a4255b9915fbd5306f924 | default |       |
    + ---------------------------------+---------+-------+
    
    Running TRAINS-AGENT daemon in background mode, writing stdout/stderr to /home/<username>/.trains_agent_daemon_outym6lqxrz.txt
    

Step 5. Enqueue the tuned experiment

Enqueue the tuned experiment. Do the following:

  1. In the Trains Web-App (UI), in the experiments table, right click the experiment Clone Of Tensorflow v2 mnist with summaries.
  2. In the context menu, click Enqueue.
  3. If the queue is not Default, in the queue list, select Default.
  4. Click ENQUEUE. The experiment's status becomes Pending. When the worker fetches the experiment from the queue, the status becomes Running, and you can view its progress in the info panel. When the status becomes Completed, go to the next step.
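
Enqueuing can also be done with the Trains SDK. A minimal sketch, assuming the cloned experiment's name shown above:

    from trains import Task

    # Retrieve the tuned clone by name
    # (the name is assumed to match the one shown in the Web-App).
    cloned = Task.get_task(project_name='examples',
                           task_name='Clone Of Tensorflow v2 mnist with summaries')

    # Push it onto the default queue; the worker daemon from Step 4 picks it up.
    Task.enqueue(cloned, queue_name='default')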

Step 6. Compare the experiments

To compare the original and tuned experiments, do the following:

  1. In the Trains Web-App (UI), on the Projects page, click the examples project.
  2. In the experiments table, select the checkboxes for our two experiments (Tensorflow v2 mnist with summaries and Clone Of Tensorflow v2 mnist with summaries).
  3. On the menu bar at the bottom of the experiments table, click COMPARE. The experiment comparison window appears. All differences appear with a different background color to highlight them.

    The experiment comparison window is organized in the following tabs:

    • DETAILS - The ARTIFACTS section, including input and output models with their network designs, and other artifacts; and the EXECUTION section, including source code control, installed Python packages and versions, uncommitted changes, and the Docker image name (which, in this case, is empty).
    • HYPER PARAMETERS - The hyperparameters and their values.
    • SCALARS - Scalar metrics with the option to view them as charts or values.
    • PLOTS - Plots of any data with the option to view them as charts or values.
    • DEBUG SAMPLES - Media uploaded by your experiment, including images, audio, and video, shown as thumbnails.
  4. Examine the differences in our two experiments by doing the following:

    1. Click DETAILS > EXECUTION > Parameters. The hyperparameters dropout and learning_rate are highlighted with a different background color because their values differ.
    2. Click SCALARS. To the right of Add Experiment, click Values. The scalar values appear, with differences highlighted by background color.
    3. To the right of Add Experiment, click Chart. The scalar charts appear, showing the differences.
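
If you prefer to inspect the numbers from code, the last reported scalar values of both experiments can also be pulled with the Trains SDK. A minimal sketch, assuming your trains version provides Task.get_last_scalar_metrics:

    from trains import Task

    # Fetch both experiments by name
    # (names are assumed to match the ones shown in the Web-App).
    names = ['Tensorflow v2 mnist with summaries',
             'Clone Of Tensorflow v2 mnist with summaries']
    for name in names:
        task = Task.get_task(project_name='examples', task_name=name)
        # Returns a nested dict: {title: {series: {'last': ..., 'min': ..., 'max': ...}}}
        print(name, task.get_last_scalar_metrics())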

Next Steps