trains.task

class trains.task.Task(private=None, **kwargs)

A Task (experiment) object represents the currently running experiment and connects all of its parts into a fully reproducible experiment.

Common usage is calling Task.init() to initialize the main task. The main task is aware of development / remote execution mode and supports connecting various SDK objects, such as Models. In development mode, the main task supports task reuse (see Task.init() for more information on development-mode features). Any subsequent call to Task.init() returns the already-initialized main task and does not create a new main task.

Sub-tasks, meaning tasks that are not the main task and are not development / remote execution mode aware, can be created using Task.create(). These tasks do not support task reuse, and every call to Task.create() creates a new task.
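The difference between the two entry points can be illustrated with a toy model. The sketch below does not use the trains package; ToyTask and its attributes are hypothetical stand-ins that only mimic the documented semantics (init() returns one shared main task, create() always returns a new task).

```python
import itertools


class ToyTask:
    """Hypothetical stand-in for Task; models only init()/create() semantics."""

    _main_task = None            # the single main task, once initialized
    _ids = itertools.count(1)    # fake task-id generator (assumption)

    def __init__(self, name):
        self.id = next(ToyTask._ids)
        self.name = name

    @classmethod
    def init(cls, task_name=None):
        # Later calls return the already-initialized main task.
        if cls._main_task is None:
            cls._main_task = cls(task_name)
        return cls._main_task

    @classmethod
    def create(cls, task_name=None):
        # Always builds a new, independent task.
        return cls(task_name)


main = ToyTask.init("experiment")
again = ToyTask.init("ignored name")   # same object as `main`
sub_a = ToyTask.create("sub-task")
sub_b = ToyTask.create("sub-task")     # a second, distinct task
```

In this toy model, main and again are the same object, while sub_a and sub_b are distinct even though they share a name.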

You can also query existing tasks in the system by calling Task.get_task().

Usage: Task.init() or Task.get_task()

Do not construct Task manually!

Please use Task.init() or Task.get_task(id=, project=, name=)

classmethod current_task()

Return the current Task object for the main execution task (task context).

Returns

Task() object or None

classmethod init(project_name=None, task_name=None, task_type=<TaskTypes.training: 'training'>, reuse_last_task_id=True, output_uri=None, auto_connect_arg_parser=True, auto_connect_frameworks=True, auto_resource_monitoring=True)

Return the Task object for the main execution task (task context).

Parameters
  • project_name – project to create the task in (if project doesn’t exist, it will be created)

  • task_name – task name to be created (in development mode, not when running remotely)

  • task_type – task type to be created. Default: TaskTypes.training. Options are: ‘testing’, ‘training’ or ‘train’, ‘inference’

  • reuse_last_task_id – start with the previously used task id (stored in the data cache folder). If False, every call to the function creates a new task with the same name. Notice! The reused task will be reset (when running remotely, the usual behaviour applies). If reuse_last_task_id is a string, it is assumed to be the task_id to reuse. Note: A closed or published task will not be reused, and a new task will be created.

  • output_uri

    Default location for output models (currently supports folder/S3/GS/Azure). Notice: a sub-folder (task_id) is created in the destination folder for all outputs.

    Usage example: /mnt/share/folder, s3://bucket/folder, gs://bucket-name/folder, azure://company.blob.core.windows.net/folder/

    Note: When using cloud storage, make sure you install the accompanying packages. For example: trains[s3], trains[gs], trains[azure]

  • auto_connect_arg_parser – Automatically grab the ArgParser and connect it with the task. If set to False, you can manually connect the ArgParser with task.connect(parser)

  • auto_connect_frameworks

    If True, automatically patch MatplotLib, XGBoost, scikit-learn, Keras callbacks, and TensorBoard/X to serialize plots, graphs and model location to the trains backend (in addition to the original output destination). Fine-grained control is possible by passing a dictionary instead of a Boolean. Missing keys are considered True, and an empty dictionary is considered False. Full example:

    auto_connect_frameworks={'matplotlib': True, 'tensorflow': True, 'pytorch': True, 'xgboost': True, 'scikit': True}

  • auto_resource_monitoring – If True, machine vitals will be sent alongside the task scalars. Resource graphs will appear under the title ‘:resource monitor:’ in the scalars tab.

Returns

Task() object
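The rules stated for a dictionary-valued auto_connect_frameworks (missing keys count as True, an empty dictionary counts as False) can be sketched as follows. framework_enabled is a hypothetical helper written for illustration; it is not part of trains and only mirrors the behaviour described above.

```python
def framework_enabled(auto_connect_frameworks, framework):
    """Hypothetical helper: would `framework` be patched under this setting?"""
    if isinstance(auto_connect_frameworks, dict):
        if not auto_connect_frameworks:
            return False  # an empty dictionary behaves like False
        # A missing key is treated as True.
        return bool(auto_connect_frameworks.get(framework, True))
    return bool(auto_connect_frameworks)  # plain Boolean setting


# Disable matplotlib patching only; every other framework stays enabled.
setting = {"matplotlib": False, "pytorch": True}
xgboost_on = framework_enabled(setting, "xgboost")        # key missing
matplotlib_on = framework_enabled(setting, "matplotlib")  # explicitly off
```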

classmethod create(project_name=None, task_name=None, task_type=<TaskTypes.training: 'training'>)

Create a new Task object, regardless of the main execution task (Task.init).

Notice: This function will always create a new task, whether running in development or remote execution mode.

Parameters
  • project_name – Project to create the task in. If project is None, and the main execution task is initialized (Task.init), its project will be used. If project is provided but doesn’t exist, it will be created.

  • task_name – task name to be created

  • task_type – Task type to be created. (default: “training”) Optional Task types are: “training” / “testing” / “dataset_import” / “annotation” / “annotation_manual”

Returns

Task() object

classmethod get_task(task_id=None, project_name=None, task_name=None)

Return a Task object based on either task_id (system uuid) or task name.

Parameters
  • task_id (str) – unique task id string (if provided, other parameters are ignored)

  • project_name (str) – name of the project the task belongs to

  • task_name (str) – task name within the selected project

Returns

Task object

classmethod get_tasks(task_ids=None, project_name=None, task_name=None)

Returns a list of Task objects, matching requested task name (or partially matching)

Parameters
  • task_ids (list(str)) – list of unique task id strings (if provided, other parameters are ignored)

  • project_name (str) – name of the project the task belongs to (use None for all projects)

  • task_name (str) – task name within the selected project. Any partial match of task_name is returned; regular expression matching is also supported. If None is passed, all tasks within the project are returned.

Returns

list of Task objects
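The task_name matching described above (partial match, regular expressions, None for everything) can be pictured with a local sketch. The real matching is performed against the backend; match_task_names and the sample names are hypothetical and only illustrate the intended semantics, assuming the pattern is applied as a regular-expression search.

```python
import re


def match_task_names(names, task_name=None):
    """Hypothetical illustration of partial / regex task-name matching."""
    if task_name is None:
        return list(names)            # None selects every task
    pattern = re.compile(task_name)
    # search() matches anywhere in the name, so a plain substring also matches.
    return [name for name in names if pattern.search(name)]


names = ["train resnet", "train vgg", "evaluate resnet"]
partial = match_task_names(names, "resnet")      # substring match
anchored = match_task_names(names, r"^train")    # regular expression
```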

property artifacts

Read-only dictionary of Task artifacts (name, artifact).

Returns

dict

classmethod clone(source_task=None, name=None, comment=None, parent=None, project=None)

Clone a task object, creating a copy of the task.

Parameters
  • source_task (Task/str) – Source Task object (or ID) to be cloned

  • name (str) – Optional, name for the new task

  • comment (str) – Optional, comment for the new task

  • parent (str) – Optional parent Task ID of the new task. If None, parent will be set to source_task.parent, or if not available to source_task itself.

  • project (str) – Optional project ID of the new task. If None, the new task will inherit the cloned task’s project.

Returns

a new cloned Task object
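The fallback order for the parent argument can be sketched in a few lines. resolve_clone_parent is a hypothetical helper, and the id and parent attributes on the toy objects are assumptions made for this illustration.

```python
from types import SimpleNamespace


def resolve_clone_parent(source_task, parent=None):
    """Hypothetical helper: pick the parent ID for a cloned task."""
    if parent is not None:
        return parent                 # an explicit parent wins
    # Otherwise use the source task's parent, falling back to the source itself.
    return source_task.parent or source_task.id


with_parent = SimpleNamespace(id="task-b", parent="task-a")
without_parent = SimpleNamespace(id="task-c", parent=None)

inherited = resolve_clone_parent(with_parent)       # "task-a"
fallback = resolve_clone_parent(without_parent)     # "task-c"
```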

classmethod enqueue(task, queue_name=None, queue_id=None)

Enqueue (send) a task for execution, by adding it to an execution queue

Parameters
  • task (Task / str) – Task object (or Task ID) to be enqueued, None if using Task object

  • queue_name (str) – Name of the queue in which to enqueue the task.

  • queue_id (str) – ID of the queue in which to enqueue the task. If not provided use queue_name.

Returns

enqueue response

classmethod dequeue(task)

Dequeue (remove) task from execution queue.

Parameters

task (Task / str) – Task object (or Task ID) to be dequeued, None if using Task object

Returns

Dequeue response

add_tags(tags)

Add tags to this task. Old tags are not deleted.

When running remotely, this is a no-op.

Parameters

tags (str or iterable of str) – An iterable or space separated string of new tags (string) to add.
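The two accepted input forms (a space-separated string or an iterable of strings) can be normalized as in the sketch below. Both functions are hypothetical and do not call trains; skipping tags that already exist is an assumption of this sketch, since the text only states that old tags are kept.

```python
def normalize_tags(tags):
    """Hypothetical helper: turn either accepted input form into a list."""
    if isinstance(tags, str):
        return tags.split()           # space-separated string
    return [str(tag) for tag in tags]


def add_tags(existing, tags):
    """Append new tags while keeping every old tag (order preserved)."""
    merged = list(existing)
    for tag in normalize_tags(tags):
        if tag not in merged:         # de-duplication is an assumption
            merged.append(tag)
    return merged


from_string = add_tags(["baseline"], "gpu nightly")
from_iterable = add_tags(["baseline"], ("gpu", "baseline"))
```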

connect(mutable)

Connect an object to a task (see introduction to Task connect design)

Parameters

mutable – can be any object Task supports integrating with:

  • argparse : for argument passing

  • dict : for argument passing

  • TaskParameters : for argument passing

  • model : for initial model warmup or model update/snapshot uploads

Returns

connect_task() return value if supported

Raise

raise exception on unsupported objects

connect_configuration(configuration)

Connect a configuration dict / file (pathlib.Path / str) with the Task. Connecting the configuration file should be done before reading the configuration file. When an output model is created, it will include the content of the configuration dict/file.

Example local file:

config_file = task.connect_configuration(config_file)
my_params = json.load(open(config_file, 'rt'))

Example parameter dictionary:

my_params = task.connect_configuration(my_params)

Parameters

configuration (dict, Path/str) – usually the configuration file used in the model training process. configuration can be either a dict or a path to a local file. If a dict is provided, it will be stored in a JSON-like format (HOCON), editable in the UI. If a pathlib2.Path / string is provided, the content of the file will be stored. Notice: the local path must be a relative path (and in remote execution, the content of the file will be overwritten with the content brought from the UI).

Returns

configuration object. If a dict was provided, a dictionary is returned. If a pathlib2.Path / string was provided, a path to a local configuration file is returned.

connect_label_enumeration(enumeration)

Connect a label enumeration dictionary with the Task

When an output model is created it will store the model label enumeration dictionary

Parameters

enumeration (dict) – dictionary of string to integer, mapping each label name to its model output integer. Example: {‘background’: 0, ‘person’: 1}

Returns

enumeration dict

get_logger()

Get a logger object for reporting in this task context. All reports (metrics, text, etc.) related to this task are accessible in the web UI.

Returns

Logger object

mark_started()

Manually mark the task as started (this happens automatically).

mark_stopped()

Manually mark the task as stopped (also used in self._at_exit).

flush(wait_for_uploads=False)

Flush any outstanding reports or console logs.

Parameters

wait_for_uploads – if True, the flush returns only after all outstanding uploads are completed

reset(set_started_on_success=False, force=False)

Reset the task. Task will be reloaded following a successful reset.

Notice: when running remotely the task will not be reset (as a reset would clear all logs and metrics)

Parameters
  • set_started_on_success – automatically set started if reset was successful

  • force – force task reset even if running remotely

close()

Close the current Task. This allows you to shut down the task manually. It should only be called if you are absolutely sure the Task is no longer needed.

register_artifact(name, artifact, metadata=None, uniqueness_columns=True)

Add artifact for the current Task, used mostly for Data Audition. Currently supported artifacts object types: pandas.DataFrame

Parameters
  • name (str) – name of the artifacts. Notice! it will override previous artifacts if name already exists.

  • artifact (pandas.DataFrame) – artifact object, supported artifacts object types: pandas.DataFrame

  • metadata (dict) – dictionary of key value to store with the artifact (visible in the UI)

  • uniqueness_columns (Sequence) – Sequence of columns used as the artifact uniqueness comparison criteria. The default value is True, which is equivalent to all the columns (same as artifact.columns).

unregister_artifact(name)

Remove an artifact from the watch list. Notice: this will not remove the artifact from the Task. It only stops monitoring the artifact; the last snapshot of the artifact will be taken immediately in the background.

get_registered_artifacts()

Dictionary of Task registered artifacts (name, artifact object). Notice: these objects can be modified, and changes will be uploaded automatically.

Returns

dict

upload_artifact(name, artifact_object, metadata=None, delete_after_upload=False)

Add a static artifact to the Task. The artifact file/object will be uploaded in the background. Raises ValueError if artifact_object is not supported.

Parameters
  • name (str) – Artifact name. Notice! it will override previous artifact if name already exists

  • artifact_object (object) –

    Artifact object to upload. Currently supports:

    • string / pathlib2.Path are treated as a path to the artifact file to upload. If a wildcard or a folder is passed, a zip file containing the local files will be created and uploaded

    • dict will be stored as .json file and uploaded

    • pandas.DataFrame will be stored as .csv.gz (compressed CSV file) and uploaded

    • numpy.ndarray will be stored as .npz and uploaded

    • PIL.Image will be stored to .png file and uploaded

  • metadata (dict) – Simple key/value dictionary to store on the artifact

  • delete_after_upload (bool) – If True local artifact will be deleted (only applies if artifact_object is a local file)

Returns

True if artifact will be uploaded
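The mapping from artifact_object type to stored format listed above can be summarized in a small dispatch function. This is a sketch for illustration, not the library's implementation: artifact_storage_format is hypothetical, the "as-is" label is invented here, and matching third-party types by class name is a shortcut that avoids importing pandas, numpy or PIL.

```python
from pathlib import Path


def artifact_storage_format(artifact_object):
    """Hypothetical helper: name the format an artifact would be stored in."""
    if isinstance(artifact_object, (str, Path)):
        text = str(artifact_object)
        if "*" in text or Path(text).is_dir():
            return ".zip"     # wildcard or folder: local files are zipped
        return "as-is"        # a single file path is uploaded as a file
    if isinstance(artifact_object, dict):
        return ".json"
    # Name-based lookup keeps this sketch free of third-party imports.
    by_type_name = {"DataFrame": ".csv.gz", "ndarray": ".npz", "Image": ".png"}
    type_name = type(artifact_object).__name__
    if type_name in by_type_name:
        return by_type_name[type_name]
    # Mirrors the documented ValueError for unsupported objects.
    raise ValueError("unsupported artifact type: " + type_name)


dict_format = artifact_storage_format({"lr": 0.1})
wildcard_format = artifact_storage_format("data/*.csv")
```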

is_current_task()

Check if this task is the main task (returned by Task.init())

NOTE: This call is deprecated. Please use Task.is_main_task()

If Task.init() was never called, this method will not create it, making this test cheaper than Task.init() == task

Returns

True if this task is the current task

is_main_task()

Check if this task is the main task (returned by Task.init())

If Task.init() was never called, this method will not create it, making this test cheaper than Task.init() == task

Returns

True if this task is the current task

set_model_config(config_text=None, config_dict=None)

Set Task model configuration text/dict (before creating an output model). When an output model is created it will inherit these properties.

Parameters
  • config_text – model configuration (unconstrained text string). usually the content of a configuration file. If config_text is not None, config_dict must not be provided.

  • config_dict – model configuration parameters dictionary. If config_dict is not None, config_text must not be provided.

get_model_config_text()

Get Task model configuration text (before creating an output model). When an output model is created it will inherit these properties.

Returns

model config_text (unconstrained text string), usually the content of a configuration file.

get_model_config_dict()

Get Task model configuration dictionary (before creating an output model). When an output model is created it will inherit these properties.

Returns

model configuration parameters dictionary.

set_model_label_enumeration(enumeration=None)

Set Task output label enumeration (before creating an output model). When an output model is created it will inherit these properties.

Parameters

enumeration – dictionary of string to integer, mapping each label name to its model output integer. Example: {‘background’: 0, ‘person’: 1}

get_last_iteration()

Return the maximum reported iteration (i.e. the maximum iteration the task reported a metric for). Notice: this is not a cached call; it queries the backend for the answer.

Returns

last reported iteration number (integer)

set_last_iteration(last_iteration)

Forcefully set the last reported iteration (i.e. the maximum iteration the task reported a metric for)

Parameters

last_iteration (integer) – last reported iteration number

get_last_scalar_metrics()

Extract the last scalar metrics, ordered by title & series in a nested dictionary

Returns

dict. Example: {‘title’: {‘series’: {‘last’: 0.5, ‘min’: 0.1, ‘max’: 0.9}}}
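A dictionary with the nested title / series layout shown in the example can be flattened into rows for printing or comparison. The sketch below works on a hand-written sample that follows that layout; flatten_last_metrics and the sample values are hypothetical.

```python
def flatten_last_metrics(metrics):
    """Turn {title: {series: {'last','min','max'}}} into sorted row tuples."""
    rows = []
    for title in sorted(metrics):
        for series in sorted(metrics[title]):
            values = metrics[title][series]
            rows.append((title, series,
                         values["last"], values["min"], values["max"]))
    return rows


# Hand-written sample following the documented return layout.
sample = {
    "loss": {
        "val": {"last": 0.7, "min": 0.3, "max": 1.2},
        "train": {"last": 0.5, "min": 0.1, "max": 0.9},
    }
}
rows = flatten_last_metrics(sample)
```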

classmethod set_credentials(api_host=None, web_host=None, files_host=None, key=None, secret=None, host=None)

Set new default TRAINS-server host and credentials. These configurations will be overridden by either OS environment variables or the trains.conf configuration file.

Notice! Credentials need to be set prior to Task initialization.

Parameters
  • api_host (str) – Trains API server url, example: api_host=’http://localhost:8008’

  • web_host (str) – Trains WEB server url, example: web_host=’http://localhost:8080’

  • files_host (str) – Trains Files server url, example: files_host=’http://localhost:8081’

  • key (str) – user key/secret pair, example: key=’thisisakey123’

  • secret (str) – user key/secret pair, example: secret=’thisisseceret123’

  • host (str) – host url, example: host=’http://localhost:8008’ (deprecated)

property cache_dir

Cache dir used to store task related files

completed(ignore_errors=True)

Signal that this task has been completed

classmethod get_all(session=None, log=None, **kwargs)

List all tasks based on specific projection

Parameters
  • session (Session) – Session object used for sending requests to the API

  • log (logging.Logger) – Log object

  • kwargs (dict) – Keyword args passed to the GetAllRequest (see .backend_api.services.tasks.GetAllRequest). Example: status=’completed’, search_text=’specific_word’, user=’user_id’, project=’project_id’

Returns

API response

get_label_num_description()

Get a dict of label number to a string representing all labels associated with this number on the model labels
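The relation between a label enumeration (label to number) and the description returned here (number to all of its labels) can be sketched as below. label_num_description is a hypothetical re-implementation for illustration; the separator and the alphabetical ordering of labels are assumptions, not documented behaviour.

```python
def label_num_description(enumeration, separator=", "):
    """Group label names by number and join each group into one string."""
    grouped = {}
    for label, number in sorted(enumeration.items()):
        grouped.setdefault(number, []).append(label)
    return {number: separator.join(labels)
            for number, labels in grouped.items()}


# Two labels share the number 1 in this made-up enumeration.
enumeration = {"background": 0, "person": 1, "pedestrian": 1}
description = label_num_description(enumeration)
```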

get_labels_enumeration()

Return a dictionary of labels (text) to ids (integers): {str(label): integer(id)}

Returns

dict

get_model_design()

Returns the model configuration as a blob of text

get_num_of_classes()

Number of classes based on the task’s labels

get_output_destination(extra_path=None, **kwargs)

Get the task’s output destination, with an optional suffix

get_parameter(name, default=None)

Get a value for a parameter.

Parameters
  • name – Parameter name

  • default – Default value

Returns

Parameter value (or default value if parameter is not defined)

property input_model

A model manager used to handle the input model object

property labels_stats

Get accumulated label stats for the current/last frames iteration

mark_failed(ignore_errors=True, status_reason=None, status_message=None)

Signal that this task has failed

property metrics_manager

A metrics manager used to manage the metrics related to this task

property output_model

A model manager used to manage the output model object

publish(ignore_errors=True)

Signal that this task will be published

property reporter

Returns a simple metrics reporter instance

save_exec_model_design_file(filename='model_design.txt', use_cache=False)

Save execution model design to file

set_artifacts(artifacts_list=None)

List of artifacts (tasks.Artifact) to update the task

Parameters

artifacts_list (list) – list of artifacts (type tasks.Artifact)

set_comment(comment)

Set the comment text of the task.

Parameters

comment (str) – The comment of the task

set_input_model(model_id=None, model_name=None, update_task_design=True, update_task_labels=True)

Set a new input model for this task. Model must be ‘ready’ in order to be used as the Task’s input model.

Parameters
  • model_id – ID for a model that exists in the backend. Required if model_name is not provided.

  • model_name – Model name. Required if model_id is not provided. If provided, this name will be used to locate an existing model in the backend.

  • update_task_design – if True, the task’s model design will be copied from the input model

  • update_task_labels – if True, the task’s label enumeration will be copied from the input model

set_name(name)

Set the name of the task.

Parameters

name (str) – The name of the task

set_parameter(name, value, description=None)

Set a single task parameter. This overrides any previous value for this parameter.

Parameters
  • name – Parameter name

  • value – Parameter value

  • description – Parameter description (unused for now)

set_parameters(*args, **kwargs)

Set parameters for this task. This allows setting a complete set of key/value parameters, but does not support parameter descriptions (as the input is a dictionary or key/value pairs).

Parameters
  • args – Positional arguments (one or more dictionary or (key, value) iterable). These will be merged into a single key/value dictionary.

  • kwargs – Key/value pairs, merged into the parameters dictionary created from args.
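How positional and keyword arguments collapse into one parameters dictionary can be sketched without the library. merge_parameters is a hypothetical helper; letting later arguments and keyword arguments override earlier values is an assumption consistent with the description above, not confirmed behaviour.

```python
def merge_parameters(*args, **kwargs):
    """Merge dicts / (key, value) iterables, then keyword arguments."""
    merged = {}
    for arg in args:
        # dict() accepts both a mapping and an iterable of (key, value) pairs.
        merged.update(dict(arg))
    merged.update(kwargs)   # keyword arguments are merged in last
    return merged


params = merge_parameters({"lr": 0.1}, [("epochs", 10)], lr=0.01)
```

In this sketch the keyword argument lr=0.01 replaces the value from the first dictionary, and the (key, value) pair contributes epochs.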

started(ignore_errors=True)

Signal that this task has started

property status

The task’s status. In order to stay updated, we always reload the task info when this value is accessed.

stopped(ignore_errors=True)

Signal that this task has stopped

update_model_desc(new_model_desc_file=None)

Change the task’s model_desc

update_output_model(model_uri, name=None, comment=None, tags=None)

Update the task’s output model. Note that this method only updates the model’s metadata using the API and does not upload any data. Use this method to update the output model when you have a local model URI (e.g. storing the weights file locally and providing a file://path/to/file URI)

Parameters
  • model_uri (str) – URI for the updated model weights file

  • name (str) – Optional updated model name

  • comment (str) – Optional updated model description

  • tags ([str]) – Optional updated model tags

update_output_model_and_upload(model_file, name=None, comment=None, tags=None, async_enable=False, cb=None, iteration=None)

Update the task’s output model weights file. The file is first uploaded to the preconfigured output destination (see the task’s output.destination property or call setup_upload()), then the model object associated with the task is updated using an API call with the URI of the uploaded file (and other values provided by additional arguments).

Parameters
  • model_file (str) – Path to the updated model weights file

  • name (str) – Optional updated model name

  • comment (str) – Optional updated model description

  • tags ([str]) – Optional updated model tags

  • async_enable (bool) – Request asynchronous upload. If False, the call blocks until upload is completed and the API call updating the model returns. If True, the call returns immediately, while upload and update are scheduled in another thread. Default is False.

  • cb – Asynchronous callback. If async_enable=True, this callback will be invoked once the asynchronous upload and update have completed.

Returns

The URI of the uploaded weights file. If async_enable=True, this is the expected URI as the upload is probably still in progress.
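The blocking and asynchronous modes described above can be pictured with a toy upload function built on the standard threading module. Nothing here touches trains: toy_upload, the file:///outputs/ destination and the single-argument callback signature are all assumptions made for this sketch.

```python
import threading


def toy_upload(model_file, async_enable=False, cb=None):
    """Hypothetical stand-in for an upload that returns the (expected) URI."""
    uri = "file:///outputs/" + model_file   # made-up destination

    def upload():
        # A real implementation would upload the file and update the model here.
        if async_enable and cb is not None:
            cb(uri)   # callback signature is an assumption of this sketch

    if async_enable:
        threading.Thread(target=upload).start()   # returns immediately
    else:
        upload()                                  # blocks until finished
    return uri


done = threading.Event()
finished = []


def on_done(uri):
    finished.append(uri)
    done.set()


expected_uri = toy_upload("weights.pt", async_enable=True, cb=on_done)
done.wait(timeout=5)   # wait for the background "upload" to finish
```

The returned value is available immediately in both modes; in the asynchronous mode it is only the expected URI until the callback fires.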

update_parameters(*args, **kwargs)

Update parameters for this task.

This allows updating a complete set of key/value parameters, but does not support parameter descriptions (as the input is a dictionary or key/value pairs).

Parameters
  • args – Positional arguments (one or more dictionary or (key, value) iterable). These will be merged into a single key/value dictionary.

  • kwargs – Key/value pairs, merged into the parameters dictionary created from args.