Automation Module

class trains.automation.optimization.Objective(title, series, order='max', extremum=False)

Optimization Objective class to maximize / minimize over all experiments. This class samples a specific scalar from all experiments, and maximizes / minimizes over that single scalar (i.e., a title and series combination).

SearchStrategy and HyperParameterOptimizer use Objective in the strategy search algorithm.

Construct an Objective object that will return the scalar value for a specific Task ID.

Parameters
  • title (str) – The scalar graph title to sample from.

  • series (str) – The scalar series title to sample from.

  • order (str) –

    The setting for maximizing or minimizing the objective scalar value.

    The values are:

    • max

    • min

  • extremum (bool) –

    Return the global minimum / maximum reported metric value?

    The values are:

    • True - Return the global minimum / maximum reported metric value.

    • False - Return the last value reported for a specific Task. (Default)
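For example, a minimal usage sketch (the Task ID and the validation/loss scalar are hypothetical placeholders):

from trains.automation.optimization import Objective

# objective: minimize the last reported 'validation'/'loss' scalar
objective = Objective(title='validation', series='loss', order='min')

value = objective.get_objective('<task_id>')                  # last reported raw value
sign = objective.get_objective_sign()                         # -1, since order='min'
normalized = objective.get_normalized_objective('<task_id>')  # sign-normalized; always maximized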

get_objective(task_id)

Return a specific task scalar value based on the objective settings (title/series).

Parameters

task_id (str) – The Task ID (str), or a TrainsJob object, to retrieve the scalar from.

Returns

The scalar value.

get_current_raw_objective(task)

Return the current raw value (without sign normalization) of the objective.

Parameters

task (str) – The Task ID (str), or a TrainsJob object, to retrieve the scalar from.

Returns

A tuple of (iteration, value) if the metric exists; None if it does not.

get_objective_sign()

Return the sign of the objective.

  • +1 - If maximizing

  • -1 - If minimizing

Returns

Objective function sign.

get_objective_metric()

Return the metric title, series pair of the objective.

Returns

(title, series)

get_normalized_objective(task_id)

Return a normalized task scalar value based on the objective settings (title/series). That is, the objective is always to maximize the returned value.

Parameters

task_id (str) – The Task id to retrieve scalar from.

Returns

Normalized scalar value.

class trains.automation.optimization.SearchStrategy(base_task_id, hyper_parameters, objective_metric, execution_queue, num_concurrent_workers, pool_period_min=2.0, time_limit_per_job=None, max_iteration_per_job=None, total_max_jobs=None, **_)

The base search strategy class. Inherit this class to implement your custom strategy.

Initialize a search strategy optimizer.

Parameters
  • base_task_id (str) – The base Task ID.

  • hyper_parameters (list) – The list of parameter objects to optimize over.

  • objective_metric (Objective) – The Objective metric to maximize / minimize.

  • execution_queue (str) – The execution queue to use for launching Tasks (experiments).

  • num_concurrent_workers (int) – The maximum number of concurrent running machines.

  • pool_period_min (float) – The time between two consecutive polls (minutes).

  • time_limit_per_job (float) – The maximum execution time per single job, in minutes. When the time limit is exceeded, the job is aborted. (Optional)

  • max_iteration_per_job (int) – The maximum iterations (of the Objective metric) per single job. When maximum iterations is exceeded, the job is aborted. (Optional)

  • total_max_jobs (int) – The total maximum jobs for the optimization process. The default value is None, for unlimited.

start()

Start the Optimizer controller function loop(). If the calling process is stopped, the controller will stop as well.

Important

This function returns only after the optimization is completed or stop() was called.

stop()

Stop the currently running optimization loop. Call this from a different thread than the one that called start().

process_step()

Abstract helper function; implementation is not required. Used by the default start() implementation as the main optimization loop, called from the daemon thread created by start().

  • Call monitor_job() on every TrainsJob in jobs:

    • Check the performance or elapsed time, and then decide whether to kill the job.

  • Call create_job():

    • Check if spare job slots exist, and if they do, create a new job based on previously tested experiments.

Returns

True to continue the optimization; False to stop immediately.

create_job()

Abstract helper function; implementation is not required. Used by the default process_step() implementation. Create a new job if needed, and return the newly created job. If no job needs to be created, return None.

Returns

A newly created TrainsJob object, or None if no TrainsJob was created.
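For example, a custom strategy can implement create_job() on top of the documented Parameter.get_value() and helper_create_job() APIs. A minimal sketch (class and attribute names are illustrative, not part of the API):

from trains.automation.optimization import SearchStrategy

class RandomOneShot(SearchStrategy):
    def __init__(self, base_task_id, hyper_parameters, objective_metric,
                 execution_queue, num_concurrent_workers, **kwargs):
        super(RandomOneShot, self).__init__(
            base_task_id, hyper_parameters, objective_metric,
            execution_queue, num_concurrent_workers, **kwargs)
        self._my_base_task_id = base_task_id
        self._my_hyper_parameters = hyper_parameters

    def create_job(self):
        # sample one value from every Parameter; get_value() returns {name: value}
        override = {}
        for parameter in self._my_hyper_parameters:
            override.update(parameter.get_value())
        # clone the base Task with the sampled parameter override
        return self.helper_create_job(
            base_task_id=self._my_base_task_id, parameter_override=override)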

monitor_job(job)

Helper function; implementation is not required. Used by the default process_step() implementation. Check whether the job needs to be aborted or has already completed.

If this returns False, the job was aborted / completed and should be removed from the current job list.

If there is a budget limitation, this call should update self.budget.compute_time.update() / self.budget.iterations.update().

Parameters

job (TrainsJob) – A TrainsJob object to monitor.

Returns

False, if the job is no longer relevant.
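For example, a subclass can extend the default monitoring with its own cap. A minimal sketch (the 30-minute cap and class name are illustrative):

from trains.automation.optimization import SearchStrategy

class CappedStrategy(SearchStrategy):
    def monitor_job(self, job):
        # keep the default abort / completion checks
        keep = super(CappedStrategy, self).monitor_job(job)
        # additionally abort any job running longer than 30 minutes
        if keep and job.elapsed() > 30 * 60:  # TrainsJob.elapsed() returns seconds
            job.abort()
            return False
        return keep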

get_running_jobs()

Return the currently running TrainsJob objects.

Returns

List of TrainsJob objects.

get_created_jobs_ids()

Return a dict of Task IDs created by this optimizer so far, including completed and running jobs. The values of the returned dict are the parameters used in each job.

Returns

dict of task IDs (str) as keys, and their parameters dict as values.

get_top_experiments(top_k)

Return a list of Tasks of the top performing experiments, based on the controller Objective object.

Parameters

top_k (int) – The number of Tasks (experiments) to return.

Returns

A list of Task objects, ordered by performance, where index 0 is the best performing Task.

get_objective_metric()

Return the metric title, series pair of the objective.

Returns

(title, series)

helper_create_job(base_task_id, parameter_override=None, task_overrides=None, tags=None, parent=None, **kwargs)

Create a Job using the specified arguments; see TrainsJob for details.

Returns

A newly created Job instance.

set_job_class(job_class)

Set the class to use for the helper_create_job() function.

Parameters

job_class (TrainsJob) – The Job Class type.

set_job_default_parent(job_parent_task_id)

Set the default parent for all Jobs created by the helper_create_job() method.

Parameters

job_parent_task_id (str) – The parent Task ID.

set_job_naming_scheme(naming_function)

Set the function used to name a newly created job.

Parameters

naming_function (callable) –

naming_functor(base_task_name, argument_dict) -> str
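For example, a naming functor that appends the overridden arguments to the base Task name (a sketch; strategy stands for any SearchStrategy instance):

def naming_functor(base_task_name, argument_dict):
    # e.g. 'base_task lr=0.01 network=ResNet18'
    return '{} {}'.format(
        base_task_name,
        ' '.join('{}={}'.format(k, v) for k, v in sorted(argument_dict.items())))

strategy.set_job_naming_scheme(naming_functor)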

class trains.automation.optimization.GridSearch(base_task_id, hyper_parameters, objective_metric, execution_queue, num_concurrent_workers, pool_period_min=2.0, time_limit_per_job=None, max_iteration_per_job=None, total_max_jobs=None, **_)

Grid search strategy controller. Full grid sampling of every hyper-parameter combination.

Initialize a grid search optimizer.

Parameters
  • base_task_id (str) – The Task ID.

  • hyper_parameters (list) – The list of parameter objects to optimize over.

  • objective_metric (Objective) – The Objective metric to maximize / minimize.

  • execution_queue (str) – The execution queue to use for launching Tasks (experiments).

  • num_concurrent_workers (int) – The maximum number of concurrent running machines.

  • pool_period_min (float) – The time between two consecutive polls (minutes).

  • time_limit_per_job (float) – The maximum execution time per single job, in minutes. When the time limit is exceeded, the job is aborted. (Optional)

  • max_iteration_per_job (int) – The maximum iterations (of the Objective metric) per single job. When exceeded, the job is aborted.

  • total_max_jobs (int) – The total maximum jobs for the optimization process. The default is None, for unlimited.

create_job()

Create a new job if needed. Return the newly created job. If no job needs to be created, return None.

Returns

A newly created TrainsJob object, or None if no TrainsJob is created.

get_created_jobs_ids()

Return a dict of Task IDs created by this optimizer so far, including completed and running jobs. The values of the returned dict are the parameters used in each job.

Returns

dict of task IDs (str) as keys, and their parameters dict as values.

get_objective_metric()

Return the metric title, series pair of the objective.

Returns

(title, series)

get_running_jobs()

Return the currently running TrainsJob objects.

Returns

List of TrainsJob objects.

get_top_experiments(top_k)

Return a list of Tasks of the top performing experiments, based on the controller Objective object.

Parameters

top_k (int) – The number of Tasks (experiments) to return.

Returns

A list of Task objects, ordered by performance, where index 0 is the best performing Task.

helper_create_job(base_task_id, parameter_override=None, task_overrides=None, tags=None, parent=None, **kwargs)

Create a Job using the specified arguments; see TrainsJob for details.

Returns

A newly created Job instance.

monitor_job(job)

Helper function; implementation is not required. Used by the default process_step() implementation. Check whether the job needs to be aborted or has already completed.

If this returns False, the job was aborted / completed and should be removed from the current job list.

If there is a budget limitation, this call should update self.budget.compute_time.update() / self.budget.iterations.update().

Parameters

job (TrainsJob) – A TrainsJob object to monitor.

Returns

False, if the job is no longer relevant.

process_step()

Abstract helper function; implementation is not required. Used by the default start() implementation as the main optimization loop, called from the daemon thread created by start().

  • Call monitor_job() on every TrainsJob in jobs:

    • Check the performance or elapsed time, and then decide whether to kill the job.

  • Call create_job():

    • Check if spare job slots exist, and if they do, create a new job based on previously tested experiments.

Returns

True to continue the optimization; False to stop immediately.

set_job_class(job_class)

Set the class to use for the helper_create_job() function.

Parameters

job_class (TrainsJob) – The Job Class type.

set_job_default_parent(job_parent_task_id)

Set the default parent for all Jobs created by the helper_create_job() method.

Parameters

job_parent_task_id (str) – The parent Task ID.

set_job_naming_scheme(naming_function)

Set the function used to name a newly created job.

Parameters

naming_function (callable) –

naming_functor(base_task_name, argument_dict) -> str

start()

Start the Optimizer controller function loop(). If the calling process is stopped, the controller will stop as well.

Important

This function returns only after the optimization is completed or stop() was called.

stop()

Stop the currently running optimization loop. Call this from a different thread than the one that called start().

class trains.automation.optimization.RandomSearch(base_task_id, hyper_parameters, objective_metric, execution_queue, num_concurrent_workers, pool_period_min=2.0, time_limit_per_job=None, max_iteration_per_job=None, total_max_jobs=None, **_)

Random search strategy controller. Random uniform sampling of hyper-parameters.

Initialize a random search optimizer.

Parameters
  • base_task_id (str) – The Task ID.

  • hyper_parameters (list) – The list of Parameter objects to optimize over.

  • objective_metric (Objective) – The Objective metric to maximize / minimize.

  • execution_queue (str) – The execution queue to use for launching Tasks (experiments).

  • num_concurrent_workers (int) – The maximum number of concurrent running machines.

  • pool_period_min (float) – The time between two consecutive polls (minutes).

  • time_limit_per_job (float) – The maximum execution time per single job, in minutes. When the time limit is exceeded, the job is aborted. (Optional)

  • max_iteration_per_job (int) – The maximum iterations (of the Objective metric) per single job. When exceeded, the job is aborted.

  • total_max_jobs (int) – The total maximum jobs for the optimization process. The default is None, for unlimited.

create_job()

Create a new job if needed. Return the newly created job. If no job needs to be created, return None.

Returns

A newly created TrainsJob object, or None if no TrainsJob was created.

get_created_jobs_ids()

Return a dict of Task IDs created by this optimizer so far, including completed and running jobs. The values of the returned dict are the parameters used in each job.

Returns

dict of task IDs (str) as keys, and their parameters dict as values.

get_objective_metric()

Return the metric title, series pair of the objective.

Returns

(title, series)

get_running_jobs()

Return the currently running TrainsJob objects.

Returns

List of TrainsJob objects.

get_top_experiments(top_k)

Return a list of Tasks of the top performing experiments, based on the controller Objective object.

Parameters

top_k (int) – The number of Tasks (experiments) to return.

Returns

A list of Task objects, ordered by performance, where index 0 is the best performing Task.

helper_create_job(base_task_id, parameter_override=None, task_overrides=None, tags=None, parent=None, **kwargs)

Create a Job using the specified arguments; see TrainsJob for details.

Returns

A newly created Job instance.

monitor_job(job)

Helper function; implementation is not required. Used by the default process_step() implementation. Check whether the job needs to be aborted or has already completed.

If this returns False, the job was aborted / completed and should be removed from the current job list.

If there is a budget limitation, this call should update self.budget.compute_time.update() / self.budget.iterations.update().

Parameters

job (TrainsJob) – A TrainsJob object to monitor.

Returns

False, if the job is no longer relevant.

process_step()

Abstract helper function; implementation is not required. Used by the default start() implementation as the main optimization loop, called from the daemon thread created by start().

  • Call monitor_job() on every TrainsJob in jobs:

    • Check the performance or elapsed time, and then decide whether to kill the job.

  • Call create_job():

    • Check if spare job slots exist, and if they do, create a new job based on previously tested experiments.

Returns

True to continue the optimization; False to stop immediately.

set_job_class(job_class)

Set the class to use for the helper_create_job() function.

Parameters

job_class (TrainsJob) – The Job Class type.

set_job_default_parent(job_parent_task_id)

Set the default parent for all Jobs created by the helper_create_job() method.

Parameters

job_parent_task_id (str) – The parent Task ID.

set_job_naming_scheme(naming_function)

Set the function used to name a newly created job.

Parameters

naming_function (callable) –

naming_functor(base_task_name, argument_dict) -> str

start()

Start the Optimizer controller function loop(). If the calling process is stopped, the controller will stop as well.

Important

This function returns only after the optimization is completed or stop() was called.

stop()

Stop the currently running optimization loop. Call this from a different thread than the one that called start().

class trains.automation.optimization.HyperParameterOptimizer(base_task_id, hyper_parameters, objective_metric_title, objective_metric_series, objective_metric_sign='min', optimizer_class=<class 'trains.automation.optimization.RandomSearch'>, max_number_of_concurrent_tasks=10, execution_queue='default', optimization_time_limit=None, auto_connect_task=True, always_create_task=False, **optimizer_kwargs)

Hyper-parameter search controller. Clones the base experiment, changes arguments and tries to maximize/minimize the defined objective.

Create a new hyper-parameter controller. The newly created object will launch and monitor the new experiments.

Parameters
  • base_task_id (str) – The Task ID to be used as template experiment to optimize.

  • hyper_parameters (list) – The list of Parameter objects to optimize over.

  • objective_metric_title (str) – The Objective metric title to maximize / minimize (for example, validation).

  • objective_metric_series (str) – The Objective metric series to maximize / minimize (for example, loss).

  • objective_metric_sign (str) –

    The objective to maximize / minimize.

    The values are:

    • min - Minimize the last reported value for the specified title/series scalar.

    • max - Maximize the last reported value for the specified title/series scalar.

    • min_global - Minimize the min value of all reported values for the specific title/series scalar.

    • max_global - Maximize the max value of all reported values for the specific title/series scalar.

  • optimizer_class (class.SearchStrategy) – The SearchStrategy optimizer to use for the hyper-parameter search.

  • max_number_of_concurrent_tasks (int) – The maximum number of concurrent Tasks (experiments) running at the same time.

  • execution_queue (str) – The execution queue to use for launching Tasks (experiments).

  • optimization_time_limit (float) – The maximum time (minutes) for the entire optimization process. The default is None, indicating no time limit.

  • auto_connect_task (bool) –

    Store optimization arguments and configuration in the Task?

    The values are:

    • True - The optimization arguments and configuration will be stored in the Task. All arguments will be under the hyper-parameter section as opt/<arg>, and the hyper_parameters will be stored in the Task connect_configuration (see artifacts/hyper-parameter).

    • False - Do not store with Task.

  • always_create_task (bool) –

    Always create a new Task?

    The values are:

    • True - Create a new Task named optimization in the base_task_id project (use this when no Task is currently initialized).

    • False - Use the Task.current_task() (if it exists) to report statistics.

  • optimizer_kwargs (**) –

    Arguments passed directly to the optimizer constructor.

    Example:

    from trains import Task
    from trains.automation import UniformParameterRange, DiscreteParameterRange
    from trains.automation import GridSearch, RandomSearch, HyperParameterOptimizer
    
    task = Task.init('examples', 'HyperParameterOptimizer example')
    an_optimizer = HyperParameterOptimizer(
        base_task_id='fa30fa45d95d4927b87c323b5b04dc44',
        hyper_parameters=[
            UniformParameterRange('lr', min_value=0.01, max_value=0.3, step_size=0.05),
            DiscreteParameterRange('network', values=['ResNet18', 'ResNet50', 'ResNet101']),
        ],
        objective_metric_title='title',
        objective_metric_series='series',
        objective_metric_sign='min',
        max_number_of_concurrent_tasks=5,
        optimizer_class=RandomSearch,
        execution_queue='workers', time_limit_per_job=120, pool_period_min=0.2)
    
    # This will automatically create and print the optimizer's new Task ID
    # for later use. If a Task was already created, it will use it.
    an_optimizer.set_time_limit(in_minutes=10.)
    an_optimizer.start()
    # we can create a pooling loop if we like
    while not an_optimizer.reached_time_limit():
        top_exp = an_optimizer.get_top_experiments(top_k=3)
        print(top_exp)
    # wait until optimization completed or timed-out
    an_optimizer.wait()
    # make sure we stop all jobs
    an_optimizer.stop()
    

get_num_active_experiments()

Return the number of current active experiments.

Returns

The number of active experiments.

get_active_experiments()

Return a list of Tasks of the current active experiments.

Returns

A list of Task objects, representing the current active experiments.

start(job_complete_callback=None)

Start the HyperParameterOptimizer controller. If the calling process is stopped, then the controller stops as well.

Parameters

job_complete_callback (Callable) –

Callback function, called when a job is completed.

def job_complete_callback(
    job_id,                 # type: str
    objective_value,        # type: float
    objective_iteration,    # type: int
    job_parameters,         # type: dict
    top_performance_job_id  # type: str
):
    pass

Returns

True, if the controller started. False, if the controller did not start.
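A minimal callback sketch matching the signature above, reusing the an_optimizer instance from the earlier example:

def job_complete_callback(job_id, objective_value, objective_iteration,
                          job_parameters, top_performance_job_id):
    # called once per completed job
    print('Job {} completed: objective={} at iteration {}'.format(
        job_id, objective_value, objective_iteration))
    if job_id == top_performance_job_id:
        print('New best experiment: {}'.format(job_id))

an_optimizer.start(job_complete_callback=job_complete_callback)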

stop(timeout=None)

Stop the HyperParameterOptimizer controller and the optimization thread.

Parameters

timeout (float) – Wait timeout for the optimization thread to exit (minutes). The default is None, indicating do not wait; terminate immediately.

is_active()

Is the optimization procedure active (still running)?

The values are:

  • True - The optimization procedure is active (still running).

  • False - The optimization procedure is not active (not still running).

Note

If the daemon thread has not yet started, is_active returns True.

Returns

A boolean indicating whether the optimization procedure is active (still running) or stopped.

is_running()

Is the optimization controller running?

The values are:

  • True - The optimization procedure is running.

  • False - The optimization procedure is not running.

Returns

A boolean indicating whether the optimization procedure is active (still running) or stopped.

wait(timeout=None)

Wait for the optimizer to finish.

Note

This method does not stop the optimizer. Call stop() to terminate the optimizer.

Parameters

timeout (float) – The timeout, in minutes, to wait for the optimization to complete. If None, wait until the optimization completes.

Returns

True, if the optimization finished. False, if the optimization timed out.

set_time_limit(in_minutes=None, specific_time=None)

Set a time limit for the HyperParameterOptimizer controller. If the time limit is reached, the optimization process is stopped. If specific_time is provided, use it; otherwise, use in_minutes.

Parameters
  • in_minutes (float) – The maximum processing time from current time (minutes).

  • specific_time (datetime) – The specific date/time limit.

get_time_limit()

Return the controller optimization time limit.

Returns

The absolute datetime limit of the controller optimization process.

elapsed()

Return the minutes elapsed since the controller start time.

Returns

The minutes from controller start time. A negative value means the process has not started yet.

reached_time_limit()

Did the optimizer reach the time limit?

The values are:

  • True - The time limit passed.

  • False - The time limit did not pass.

This method returns immediately; it does not wait for the optimizer.

Returns

True, if the optimizer is running and the time limit has passed. Otherwise, False.

get_top_experiments(top_k)

Return a list of Tasks of the top performing experiments, based on the controller Objective object.

Parameters

top_k (int) – The number of Tasks (experiments) to return.

Returns

A list of Task objects, ordered by performance, where index 0 is the best performing Task.

get_optimizer()

Return the currently used optimizer object.

Returns

The SearchStrategy object used.

set_default_job_class(job_class)

Set the Job class to use when the optimizer spawns new Jobs.

Parameters

job_class (TrainsJob) – The Job Class type.

set_report_period(report_period_minutes)

Set reporting period for the accumulated objective report (minutes). This report is sent on the Optimizer Task, and collects the Objective metric from all running jobs.

Parameters

report_period_minutes (float) – The reporting period (minutes). The default is once every 10 minutes.

class trains.automation.job.TrainsJob(base_task_id, parameter_override=None, task_overrides=None, tags=None, parent=None, **kwargs)

Create a new Task based on a base_task_id, with a different set of parameters.

Parameters
  • base_task_id (str) – The base Task ID to clone from.

  • parameter_override (dict) – A dictionary of parameters and values to set for the cloned Task.

  • task_overrides (dict) – Task object specific overrides.

  • tags (list) – Additional tags to add to the newly cloned Task.

  • parent (str) – Set the newly created Task's parent task field. Default: base_task_id.

  • kwargs (dict) – Additional Task creation parameters.
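A minimal usage sketch (the base Task ID, queue name, parameter, and scalar names are placeholders):

from trains.automation.job import TrainsJob

# clone the base Task with an overridden learning rate
job = TrainsJob(base_task_id='<base_task_id>', parameter_override={'lr': 0.01})
job.launch(queue_name='default')
job.wait(timeout=60.)  # wait up to 60 minutes for the Task to finish
print(job.status())
# (min value, max value, last value) of the scalar, assuming the Task reports it
print(job.get_metric(title='validation', series='loss'))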

get_metric(title, series)

Retrieve a specific scalar metric from the running Task.

Parameters
  • title (str) – Graph title (metric)

  • series (str) – Series on the specific graph (variant)

Returns

A tuple of (min value, max value, last value).

launch(queue_name=None)

Send the Job for execution on the requested execution queue.

Parameters

queue_name (str) – The name of the execution queue to launch the Job on.

abort()

Abort the currently running job (can be called multiple times).

elapsed()

Return the time, in seconds, since the job started. Return -1 if the job is still pending.

Returns

Seconds from start.

iterations()

Return the last iteration value of the current job. Return -1 if the job has not started yet.

Returns

Task last iteration.

task_id()

Return the Task id.

Returns

The Task ID.

status()

Return the Job's Task current status; see Task.TaskStatusEnum.

Returns

The Task status (Task.TaskStatusEnum) as a string.

wait(timeout=None, pool_period=30.0)

Wait until the Task is fully executed (i.e., aborted / completed / failed).

Parameters
  • timeout – The maximum time (minutes) to wait for the Task to finish.

  • pool_period – Poll the Task status every pool_period seconds.

Returns

True, if Task finished.

get_console_output(number_of_reports=1)

Return a list of console outputs reported by the Task. The most recent console outputs are returned.

Parameters

number_of_reports (int) – The number of reports to return. The default is 1, the last (most recent) console output.

Returns

A list of strings; each entry corresponds to one report.

worker()

Return the ID of the worker currently executing this Job. If the job is pending, return None.

Returns

The ID of the worker executing (or that executed) the job, or None if the job is still pending.

is_running()

Return True if the job is currently running (pending is considered False).

Returns

True, if the task is currently in progress.

is_stopped()

Return True if the job has executed and is no longer running.

Returns

True, if the task is currently in one of these states: stopped / completed / failed / published.

is_failed()

Return True if the job has executed and failed.

Returns

True, if the task is currently in the failed state.

is_pending()

Return True if the job is waiting for execution.

Returns

True, if the task is currently queued.

started()

Return True if the job already started or ended. Return False if created / pending.

Returns

False, if the task is currently in draft mode or pending.

class trains.automation.parameters.RandomSeed

The base class controlling random sampling for every optimization strategy.

static set_random_seed(seed=1337)

Set global seed for all hyper-parameter strategy random number sampling.

Parameters

seed (int) – The random seed.

static get_random_seed()

Get the global seed for all hyper-parameter strategy random number sampling.

Returns

The random seed.
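For example, to make strategy sampling reproducible across runs:

from trains.automation.parameters import RandomSeed

RandomSeed.set_random_seed(1337)
assert RandomSeed.get_random_seed() == 1337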

class trains.automation.parameters.Parameter(name)

The base hyper-parameter optimization object.

Create a new Parameter for hyper-parameter optimization

Parameters

name (str) – The new Parameter name. This is the parameter name that will be passed to a Task.

get_value()

Return a dict with the Parameter name and a sampled value for the Parameter.

Returns

For example:

{'answer': 0.42}

to_list()

Return a list of all the valid values of the Parameter.

Returns

List of dicts {name: value}

to_dict()

Return a dict representation of the Parameter object. Used for serialization of the Parameter object.

Returns

dict representation of the object (serialization).

classmethod from_dict(a_dict)

Construct Parameter object from a dict representation (deserialize from dict).

Returns

The Parameter object.

static get_random_seed()

Get the global seed for all hyper-parameter strategy random number sampling.

Returns

The random seed.

static set_random_seed(seed=1337)

Set global seed for all hyper-parameter strategy random number sampling.

Parameters

seed (int) – The random seed.

class trains.automation.parameters.UniformParameterRange(name, min_value, max_value, step_size=None, include_max_value=True)

Uniform randomly sampled hyper-parameter object.

Create a parameter to be sampled by the SearchStrategy.

Parameters
  • name (str) – The parameter name. Match the Task hyper-parameter name.

  • min_value (float) – The minimum sample to use for uniform random sampling.

  • max_value (float) – The maximum sample to use for uniform random sampling.

  • step_size (float) – If not None, set step size (quantization) for value sampling.

  • include_max_value (bool) –

    Range includes the max_value?

    The values are:

    • True - The range includes the max_value (Default)

    • False - Does not include.

get_value()

Return uniformly sampled value based on object sampling definitions.

Returns

{self.name: random value [self.min_value, self.max_value)}

to_list()

Return a list of all the valid values of the Parameter. If self.step_size is not defined, return 100 points between min/max values.

Returns

list of dicts {name: float}
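A small usage sketch (parameter name and range are illustrative):

from trains.automation import UniformParameterRange

lr_range = UniformParameterRange('lr', min_value=0.01, max_value=0.3, step_size=0.05)
print(lr_range.get_value())    # e.g. {'lr': 0.16}
print(lr_range.to_list()[:3])  # [{'lr': 0.01}, {'lr': 0.06}, {'lr': 0.11}]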

classmethod from_dict(a_dict)

Construct Parameter object from a dict representation (deserialize from dict).

Returns

The Parameter object.

static get_random_seed()

Get the global seed for all hyper-parameter strategy random number sampling.

Returns

The random seed.

static set_random_seed(seed=1337)

Set global seed for all hyper-parameter strategy random number sampling.

Parameters

seed (int) – The random seed.

to_dict()

Return a dict representation of the Parameter object. Used for serialization of the Parameter object.

Returns

dict representation of the object (serialization).

class trains.automation.parameters.UniformIntegerParameterRange(name, min_value, max_value, step_size=1, include_max_value=True)

Uniform randomly sampled integer Hyper-Parameter object.

Create a parameter to be sampled by the SearchStrategy.

Parameters
  • name (str) – The parameter name. Match the task hyper-parameter name.

  • min_value (int) – The minimum sample to use for uniform random sampling.

  • max_value (int) – The maximum sample to use for uniform random sampling.

  • step_size (int) – The default step size is 1.

  • include_max_value (bool) –

    Range includes the max_value?

    The values are:

    • True - Includes the max_value (Default)

    • False - Does not include.

get_value()

Return uniformly sampled value based on object sampling definitions.

Returns

{self.name: random value [self.min_value, self.max_value)}

to_list()

Return a list of all the valid values of the Parameter. If self.step_size is not defined, return 100 points between min/max values.

Returns

list of dicts {name: int}

classmethod from_dict(a_dict)

Construct Parameter object from a dict representation (deserialize from dict).

Returns

The Parameter object.

static get_random_seed()

Get the global seed for all hyper-parameter strategy random number sampling.

Returns

The random seed.

static set_random_seed(seed=1337)

Set global seed for all hyper-parameter strategy random number sampling.

Parameters

seed (int) – The random seed.

to_dict()

Return a dict representation of the Parameter object. Used for serialization of the Parameter object.

Returns

dict representation of the object (serialization).

class trains.automation.parameters.DiscreteParameterRange(name, values=())

Discrete randomly sampled hyper-parameter object.

Uniformly sample values from a list of discrete options.

Parameters
  • name (str) – The parameter name. Match the task hyper-parameter name.

  • values (list) – The list/tuple of valid parameter values to sample from.

get_value()

Return uniformly sampled value from the valid list of values.

Returns

{self.name: random entry from self.value}

to_list()

Return a list of all the valid values of the Parameter.

Returns

list of dicts {name: value}

classmethod from_dict(a_dict)

Construct Parameter object from a dict representation (deserialize from dict).

Returns

The Parameter object.

static get_random_seed()

Get the global seed for all hyper-parameter strategy random number sampling.

Returns

The random seed.

static set_random_seed(seed=1337)

Set global seed for all hyper-parameter strategy random number sampling.

Parameters

seed (int) – The random seed.

to_dict()

Return a dict representation of the Parameter object. Used for serialization of the Parameter object.

Returns

dict representation of the object (serialization).

class trains.automation.parameters.ParameterSet(parameter_combinations=())

Discrete randomly sampled Hyper-Parameter object.

Uniformly sample values from a list of discrete options (combinations) of parameters.

Parameters

parameter_combinations (list) –

The list/tuple of valid parameter combinations.

For example, two combinations, each with three specific parameters (the original example repeated the 'arg2' key; distinct keys are shown here):

[ {'opt1': 10, 'arg1': 20, 'arg2': 30},
  {'opt1': 11, 'arg1': 22, 'arg2': 33}, ]

Two complex combinations, each one sampled from a different range:

[ {'opt1': UniformParameterRange('arg1', 0, 1), 'arg2': 20},
  {'opt1': UniformParameterRange('arg1', 11, 12), 'arg2': 22}, ]
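A usage sketch combining fixed values and sampled ranges (parameter names are illustrative):

from trains.automation.parameters import ParameterSet, UniformParameterRange

param_set = ParameterSet(parameter_combinations=[
    {'batch_size': 64, 'lr': UniformParameterRange('lr', 0.01, 0.1)},
    {'batch_size': 128, 'lr': UniformParameterRange('lr', 0.001, 0.01)},
])
print(param_set.get_value())  # e.g. {'batch_size': 64, 'lr': 0.042}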

get_value()

Return uniformly sampled value from the valid list of values.

Returns

{self.name: random entry from self.value}

to_list()

Return a list of all the valid values of the Parameter.

Returns

list of dicts {name: value}

classmethod from_dict(a_dict)

Construct Parameter object from a dict representation (deserialize from dict).

Returns

The Parameter object.

static get_random_seed()

Get the global seed for all hyper-parameter strategy random number sampling.

Returns

The random seed.

static set_random_seed(seed=1337)

Set global seed for all hyper-parameter strategy random number sampling.

Parameters

seed (int) – The random seed.

to_dict()

Return a dict representation of the Parameter object. Used for serialization of the Parameter object.

Returns

dict representation of the object (serialization).

class trains.automation.hpbandster.bandster.OptimizerBOHB(base_task_id, hyper_parameters, objective_metric, execution_queue, num_concurrent_workers, min_iteration_per_job, max_iteration_per_job, total_max_jobs, pool_period_min=2.0, time_limit_per_job=None, local_port=9090, **bohb_kwargs)

Initialize a BOHB search strategy optimizer. BOHB performs robust and efficient hyperparameter optimization at scale by combining the speed of Hyperband searches with the guidance and guarantees of convergence of Bayesian Optimization. Instead of sampling new configurations at random, BOHB uses kernel density estimators to select promising candidates.

For reference:

@InProceedings{falkner-icml-18,
  title =        {{BOHB}: Robust and Efficient Hyperparameter Optimization at Scale},
  author =       {Falkner, Stefan and Klein, Aaron and Hutter, Frank},
  booktitle =    {Proceedings of the 35th International Conference on Machine Learning},
  pages =        {1436--1445},
  year =         {2018},
}
Parameters
  • base_task_id (str) – The base Task ID.

  • hyper_parameters (list) – The list of Parameter objects to optimize over.

  • objective_metric (Objective) – The Objective metric to maximize / minimize.

  • execution_queue (str) – The execution queue to use for launching Tasks (experiments).

  • num_concurrent_workers (int) – The maximum number of concurrent running Tasks (machines).

  • min_iteration_per_job (int) – The minimum number of iterations for a job to run. 'Iterations' are the reported iterations for the specified objective, not the maximum reported iteration of the Task.

  • max_iteration_per_job (int) – The maximum number of iterations per job. 'Iterations' are the reported iterations for the specified objective, not the maximum reported iteration of the Task.

  • total_max_jobs (int) – The total maximum number of jobs for the optimization process. Must be provided in order to calculate the total budget for the optimization process. The total budget is measured in 'iterations' (see above) and is set to max_iteration_per_job * total_max_jobs. This means more than total_max_jobs jobs may be created, as long as the cumulative iterations (summed over all created jobs) do not exceed max_iteration_per_job * total_max_jobs.

  • pool_period_min (float) – The time, in minutes, between two consecutive polls.

  • time_limit_per_job (float) – Optional. The maximum execution time per single job, in minutes. When the time limit is exceeded, the job is aborted.

  • local_port (int) – The default port is 9090 (TCP). This port is required for the BOHB workers to communicate, even locally.

  • bohb_kwargs – Arguments passed directly to the BOHB object.
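OptimizerBOHB is typically passed as the optimizer_class of a HyperParameterOptimizer, with its extra constructor arguments forwarded through optimizer_kwargs. A sketch (the Task ID, queue, and metric names are placeholders; the hpbandster package must be installed):

from trains.automation import HyperParameterOptimizer, UniformParameterRange
from trains.automation.hpbandster.bandster import OptimizerBOHB

an_optimizer = HyperParameterOptimizer(
    base_task_id='<base_task_id>',
    hyper_parameters=[UniformParameterRange('lr', min_value=0.01, max_value=0.3)],
    objective_metric_title='validation',
    objective_metric_series='loss',
    objective_metric_sign='min',
    optimizer_class=OptimizerBOHB,
    execution_queue='default',
    # forwarded to the OptimizerBOHB constructor through optimizer_kwargs:
    min_iteration_per_job=10, max_iteration_per_job=100, total_max_jobs=20)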

set_optimization_args(eta=3, min_budget=None, max_budget=None, min_points_in_model=None, top_n_percent=15, num_samples=None, random_fraction=0.3333333333333333, bandwidth_factor=3, min_bandwidth=0.001)

Defaults are copied from the BOHB constructor; see details in BOHB.__init__.

BOHB performs robust and efficient hyperparameter optimization at scale by combining the speed of Hyperband searches with the guidance and guarantees of convergence of Bayesian Optimization. Instead of sampling new configurations at random, BOHB uses kernel density estimators to select promising candidates.

For reference:

@InProceedings{falkner-icml-18,
  title =        {{BOHB}: Robust and Efficient Hyperparameter Optimization at Scale},
  author =       {Falkner, Stefan and Klein, Aaron and Hutter, Frank},
  booktitle =    {Proceedings of the 35th International Conference on Machine Learning},
  pages =        {1436--1445},
  year =         {2018},
}
eta: float (3)

In each iteration, a complete run of sequential halving is executed. In it, after evaluating each configuration on the same subset size, only a fraction of 1/eta of them 'advances' to the next round. Must be greater than or equal to 2.

min_budget: float (0.01)

The smallest budget to consider. Needs to be positive!

max_budget: float (1)

The largest budget to consider. Needs to be larger than min_budget! The budgets will be geometrically distributed

\(\sim \eta^k\) for \(k \in [0, 1, \ldots, num\_subsets - 1]\).

min_points_in_model: int (None)

The number of observations required to start building a KDE. The default, None, means dim+1, the bare minimum.

top_n_percent: int (15)

The percentage (between 1 and 99, default 15) of the observations that are considered good.

num_samples: int (64)

The number of samples used to optimize EI (default 64).

random_fraction: float (1/3.)

The fraction of purely random configurations that are sampled from the prior without the model.

bandwidth_factor: float (3.)

To encourage diversity, the points proposed to optimize EI are sampled from a 'widened' KDE whose bandwidth is multiplied by this factor (default: 3).

min_bandwidth: float (1e-3)

To keep diversity, even when all (good) samples have the same value for one of the parameters, a minimum bandwidth (default: 1e-3) is used instead of zero.
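For example, assuming strategy is an OptimizerBOHB instance, the BOHB internals can be tuned before the search starts (values are illustrative):

strategy.set_optimization_args(
    eta=3,               # keep a 1/eta fraction of configurations per halving round
    min_budget=1,        # smallest budget to consider
    max_budget=100,      # largest budget to consider
    top_n_percent=15,    # share of observations considered 'good' for the KDE
    min_bandwidth=1e-3)  # minimum KDE bandwidth, to keep diversity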

start()

Start the Optimizer controller function loop(). If the calling process is stopped, the controller will stop as well.

Important

This function returns only after optimization is completed or stop() was called.

stop()

Stop the currently running optimization loop. Call this from a different thread than the one that called start().

create_job()

Abstract helper function; implementation is not required. Used by the default process_step() implementation. Create a new job if needed, and return the newly created job. If no job needs to be created, return None.

Returns

A newly created TrainsJob object, or None if no TrainsJob was created.

get_created_jobs_ids()

Return a dict of Task IDs created by this optimizer so far, including completed and running jobs. The values of the returned dict are the parameters used in each job.

Returns

dict of task IDs (str) as keys, and their parameters dict as values.

get_objective_metric()

Return the metric title, series pair of the objective.

Returns

(title, series)

static get_random_seed()

Get the global seed for all hyper-parameter strategy random number sampling.

Returns

The random seed.

get_running_jobs()

Return the currently running TrainsJob objects.

Returns

List of TrainsJob objects.

get_top_experiments(top_k)

Return a list of Tasks of the top performing experiments, based on the controller Objective object.

Parameters

top_k (int) – The number of Tasks (experiments) to return.

Returns

A list of Task objects, ordered by performance, where index 0 is the best performing Task.

helper_create_job(base_task_id, parameter_override=None, task_overrides=None, tags=None, parent=None, **kwargs)

Create a Job using the specified arguments; see TrainsJob for details.

Returns

A newly created Job instance.

monitor_job(job)

Helper function; implementation is not required. Used by the default process_step() implementation. Check whether the job needs to be aborted or has already completed.

If this returns False, the job was aborted / completed and should be removed from the current job list.

If there is a budget limitation, this call should update self.budget.compute_time.update() / self.budget.iterations.update().

Parameters

job (TrainsJob) – A TrainsJob object to monitor.

Returns

False, if the job is no longer relevant.

process_step()

Abstract helper function; implementation is not required. Used by the default start() implementation as the main optimization loop, called from the daemon thread created by start().

  • Call monitor_job() on every TrainsJob in jobs:

    • Check the performance or elapsed time, and then decide whether to kill the job.

  • Call create_job():

    • Check if spare job slots exist, and if they do, create a new job based on previously tested experiments.

Returns

True to continue the optimization; False to stop immediately.

set_job_class(job_class)

Set the class to use for the helper_create_job() function.

Parameters

job_class (TrainsJob) – The Job Class type.

set_job_default_parent(job_parent_task_id)

Set the default parent for all Jobs created by the helper_create_job() method.

Parameters

job_parent_task_id (str) – The parent Task ID.

set_job_naming_scheme(naming_function)

Set the function used to name a newly created job.

Parameters

naming_function (callable) –

naming_functor(base_task_name, argument_dict) -> str

static set_random_seed(seed=1337)

Set global seed for all hyper-parameter strategy random number sampling.

Parameters

seed (int) – The random seed.