automation.optimization.HyperParameterOptimizer

class trains.automation.optimization.HyperParameterOptimizer

Hyper-parameter search controller. Clones the base experiment, changes arguments and tries to maximize/minimize the defined objective.

Create a new hyper-parameter controller. The newly created object will launch and monitor the new experiments.

Parameters
  • base_task_id (str) – The ID of the Task to use as the template experiment to optimize.

  • hyper_parameters (list) – The list of Parameter objects to optimize over.

  • objective_metric_title (str) – The Objective metric title to maximize / minimize (for example, validation).

  • objective_metric_series (str) – The Objective metric series to maximize / minimize (for example, loss).

  • objective_metric_sign (str) –

    The objective to maximize / minimize.

    The values are:

    • min - Minimize the last reported value for the specified title/series scalar.

    • max - Maximize the last reported value for the specified title/series scalar.

    • min_global - Minimize the min value of all reported values for the specific title/series scalar.

    • max_global - Maximize the max value of all reported values for the specific title/series scalar.

  • optimizer_class (class.SearchStrategy) – The SearchStrategy optimizer to use for the hyper-parameter search.

  • max_number_of_concurrent_tasks (int) – The maximum number of concurrent Tasks (experiments) running at the same time.

  • execution_queue (str) – The execution queue to use for launching Tasks (experiments).

  • optimization_time_limit (float) – The maximum time (minutes) for the entire optimization process. The default is None, indicating no time limit.

  • compute_time_limit (float) – The maximum compute time in minutes. When the time limit is exceeded, all jobs are aborted. (Optional)

  • auto_connect_task (bool) –

    Store the optimization arguments and configuration in the Task.

    The values are:

    • True - The optimization arguments and configuration will be stored in the Task. All arguments will be under the hyper-parameter section as opt/<arg>, and the hyper_parameters will be stored in the Task connect_configuration (see artifacts/hyper-parameter).

    • False - Do not store them in the Task.

  • always_create_task (bool) –

    Always create a new Task.

    The values are:

    • True - If no current Task is initialized, create a new Task named optimization in the base_task_id project.

    • False - Use the task.Task.current_task (if it exists) to report statistics.

  • optimizer_kwargs (**) –

    Arguments passed directly to the optimizer constructor.

    Example:

     
     import time

     from trains import Task
     from trains.automation import UniformParameterRange, DiscreteParameterRange
     from trains.automation import GridSearch, RandomSearch, HyperParameterOptimizer
     
     task = Task.init('examples', 'HyperParameterOptimizer example')
     an_optimizer = HyperParameterOptimizer(
         base_task_id='fa30fa45d95d4927b87c323b5b04dc44',
         hyper_parameters=[
             UniformParameterRange('lr', min_value=0.01, max_value=0.3, step_size=0.05),
             DiscreteParameterRange('network', values=['ResNet18', 'ResNet50', 'ResNet101']),
         ],
         objective_metric_title='title',
         objective_metric_series='series',
         objective_metric_sign='min',
         max_number_of_concurrent_tasks=5,
         optimizer_class=RandomSearch,
         execution_queue='workers',
         # The remaining keyword arguments are passed directly to the
         # optimizer constructor (optimizer_kwargs).
         time_limit_per_job=120,
         pool_period_min=0.2,
     )
     
     # This automatically creates the optimizer's Task and prints its ID
     # for later use. If a Task was already created, it is used instead.
     an_optimizer.set_time_limit(in_minutes=10.)
     an_optimizer.start()
     # Optionally, run a polling loop while the optimization is in progress.
     while not an_optimizer.reached_time_limit():
         top_exp = an_optimizer.get_top_experiments(top_k=3)
         print(top_exp)
         time.sleep(60)  # pause between polls instead of busy-looping
     # Wait until the optimization completes or times out.
     an_optimizer.wait()
     # Make sure all jobs are stopped.
     an_optimizer.stop()
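
The four objective_metric_sign values listed above differ only in which reported value of the title/series scalar is compared. The following is a minimal sketch of that distinction in plain Python; it is not the library's implementation, and the names objective_value, pick_best, and validation_loss are hypothetical.

```python
def objective_value(reported_values, sign):
    """Reduce one experiment's reported scalar values to the number being compared."""
    if not reported_values:
        raise ValueError("the experiment has not reported any values yet")
    if sign in ("min", "max"):
        # min / max compare the last reported value
        return reported_values[-1]
    if sign == "min_global":
        # min_global compares the smallest value ever reported
        return min(reported_values)
    if sign == "max_global":
        # max_global compares the largest value ever reported
        return max(reported_values)
    raise ValueError("unknown objective sign: {!r}".format(sign))


def pick_best(experiments, sign):
    """Return the name of the best experiment under the given sign."""
    choose = min if sign.startswith("min") else max
    return choose(experiments, key=lambda name: objective_value(experiments[name], sign))


# Made-up validation loss curves for two experiments.
validation_loss = {
    "exp_a": [0.90, 0.40, 0.60],
    "exp_b": [0.80, 0.55, 0.50],
}
print(pick_best(validation_loss, "min"))         # compares last values: 0.60 vs 0.50
print(pick_best(validation_loss, "min_global"))  # compares minima: 0.40 vs 0.50
```

With these made-up curves, min and min_global select different experiments, which is the practical reason to choose between them deliberately.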
     

elapsed()

Return the minutes elapsed since the controller's starting timestamp.

Returns

The minutes from controller start time. A negative value means the process has not started yet.

get_active_experiments()

Return a list of Tasks of the current active experiments.

Returns

A list of Task objects, representing the current active experiments.

get_num_active_experiments()

Return the number of current active experiments.

Returns

The number of active experiments.

get_optimizer()

Return the currently used optimizer object.

Returns

The SearchStrategy object used.

get_time_limit()

Return the controller optimization time limit.

Returns

The absolute datetime limit of the controller optimization process.

get_top_experiments(top_k)

Return a list of Tasks of the top performing experiments, based on the controller Objective object.

Parameters

top_k (int) – The number of Tasks (experiments) to return.

Returns

A list of Task objects, ordered by performance, where index 0 is the best performing Task.

is_active()

Check whether the optimization procedure is active (still running).

The values are:

  • True - The optimization procedure is active (still running).

  • False - The optimization procedure is not active (no longer running).

Note

If the daemon thread has not yet started, is_active returns True.

Returns

A boolean indicating whether the optimization procedure is active (still running) or stopped.

is_running()

Check whether the optimization controller is running.

The values are:

  • True - The optimization procedure is running.

  • False - The optimization procedure is not running.

Returns

A boolean indicating whether the optimization procedure is running or stopped.

reached_time_limit()

Check whether the optimizer reached the time limit.

The values are:

  • True - The time limit passed.

  • False - The time limit did not pass.

This method returns immediately; it does not wait for the optimizer.

Returns

True, if the optimizer is running and the time limit has passed; otherwise, False.

set_default_job_class(job_class)

Set the Job class to use when the optimizer spawns new Jobs.

Parameters

job_class (TrainsJob) – The Job Class type.

set_report_period(report_period_minutes)

Set reporting period for the accumulated objective report (minutes). This report is sent on the Optimizer Task, and collects the Objective metric from all running jobs.

Parameters

report_period_minutes (float) – The reporting period (minutes). The default is once every 10 minutes.

set_time_limit(in_minutes=None, specific_time=None)

Set a time limit for the HyperParameterOptimizer controller. When the time limit is reached, the optimization process stops. If specific_time is provided, it is used; otherwise, in_minutes is used.

Parameters
  • in_minutes (float) – The maximum processing time, measured from the current time (minutes).

  • specific_time (datetime) – The specific date/time limit.
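
The precedence between the two parameters can be sketched as follows. This is an illustration of the documented rule, not the library's code; resolve_time_limit and its now argument are hypothetical, and returning None when neither parameter is given is an assumption.

```python
from datetime import datetime, timedelta


def resolve_time_limit(in_minutes=None, specific_time=None, now=None):
    """Return the absolute deadline implied by the two parameters."""
    if specific_time is not None:
        # specific_time takes precedence when it is provided
        return specific_time
    if in_minutes is not None:
        # otherwise the limit is counted from the current time
        return (now or datetime.now()) + timedelta(minutes=in_minutes)
    # Assumption: neither argument given means no limit
    return None


start = datetime(2020, 1, 1, 12, 0)
print(resolve_time_limit(in_minutes=90, now=start))
print(resolve_time_limit(in_minutes=90, specific_time=datetime(2020, 1, 2), now=start))
```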

start(job_complete_callback=None)

Start the HyperParameterOptimizer controller. If the calling process is stopped, then the controller stops as well.

Parameters

job_complete_callback (Callable) –

Callback function, called when a job is completed.

def job_complete_callback(
    job_id,                 # type: str
    objective_value,        # type: float
    objective_iteration,    # type: int
    job_parameters,         # type: dict
    top_performance_job_id  # type: str
):
    pass
 

Returns

True, if the controller started. False, if the controller did not start.
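
A callback with the signature shown above might record each finished job and note whether it is the current best. The sketch below is one such callback; the completed_jobs list and the values in the direct call are made up for illustration.

```python
completed_jobs = []  # hypothetical in-memory record of finished jobs


def job_complete_callback(
    job_id,                 # type: str
    objective_value,        # type: float
    objective_iteration,    # type: int
    job_parameters,         # type: dict
    top_performance_job_id  # type: str
):
    """Record the finished job and report whether it is the current best."""
    is_best = job_id == top_performance_job_id
    completed_jobs.append({
        "job_id": job_id,
        "objective_value": objective_value,
        "objective_iteration": objective_iteration,
        "job_parameters": dict(job_parameters),
        "is_best": is_best,
    })
    print("Job {} finished: objective={} at iteration {}{}".format(
        job_id, objective_value, objective_iteration,
        " (new best)" if is_best else ""))


# Direct call with made-up values, only to show the shape of the arguments.
job_complete_callback("job-1", 0.42, 100, {"lr": 0.06}, "job-1")
```

In real use, the function would be passed as start(job_complete_callback=job_complete_callback) rather than called directly.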

stop(timeout=None, wait_for_reporter=True)

Stop the HyperParameterOptimizer controller and the optimization thread.

Parameters
  • timeout (float) – The timeout to wait for the optimization thread to exit (minutes). The default is None, indicating do not wait; terminate immediately.

  • wait_for_reporter (bool) – Wait for the reporter to flush data.

wait(timeout=None)

Wait for the optimizer to finish.

Note

This method does not stop the optimizer. Call stop to terminate the optimizer.

Parameters

timeout (float) – The timeout to wait for the optimization to complete (minutes). If None, wait until the optimization time limit is reached or the optimization completes.

Returns

True, if the optimization finished. False, if the optimization timed out.