ClearML Server

The ClearML Server is the backend service infrastructure for ClearML. It allows multiple users to collaborate and manage their experiments by working seamlessly with the ClearML Python Package and ClearML Agent. ClearML Server is composed of the following:

  • Web server, including the ClearML Web UI, which is the user interface for tracking, comparing, and managing experiments.

  • API server, providing a RESTful API (illustrated in the sketch after this list) for:

    • Documenting and logging experiments, including information, statistics, and results.

    • Querying experiment history, logs, and results.

  • File server, which stores media and models, making them easily accessible using the ClearML Web UI.
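For illustration, here is a minimal Python sketch of that workflow using the ClearML SDK, which communicates with the API server and file server on your behalf. The project and task names are hypothetical.

    from clearml import Task

    # Register a new experiment with the API server (names are hypothetical).
    task = Task.init(project_name="examples", task_name="demo-experiment")

    # Log a scalar result; it is recorded by the API server and shown in the Web UI.
    task.get_logger().report_scalar(
        title="accuracy", series="validation", value=0.92, iteration=1
    )

    # Artifacts (files, models, media) are uploaded to the file server by default.
    task.upload_artifact(name="notes", artifact_object={"seed": 42})
    task.close()

    # Query experiment history through the API server.
    previous = Task.get_tasks(project_name="examples")
    print([t.name for t in previous])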

The ClearML Hosted Service is the ClearML Server maintained for you. For detailed information about self-hosting the ClearML Server, see Deploying ClearML Server.


ClearML Web UI

The ClearML Web UI is the ClearML user interface and is part of ClearML Server.

Use the ClearML Web UI to:

  • track experiments

  • compare experiments

  • manage experiments

For detailed information about the ClearML Web UI, see User Interface.

ClearML Agent services container

As of ClearML Server version 0.15, the dockerized deployment includes a ClearML Agent services container, which runs as part of the Docker container collection. The ClearML Agent services container works in conjunction with ClearML Agent services mode (see services-mode on the “ClearML Agent Reference” page, and Launching ClearML Agent in services mode on the “ClearML Agent Use Case Examples” page).

ClearML Agent services mode will spin up any Task enqueued into the dedicated services queue. Each Task is launched in its own container and registered as a new node in the system, providing tracking and transparency capabilities. This makes it possible to launch long-lasting jobs that previously had to run on local or dedicated machines, and it allows a single agent to launch multiple Docker containers (Tasks) for different use cases. For example, use ClearML Agent Services for an auto-scaler service (spinning up instances as demand arises and budget permits), a controller (implementing pipelines and more sophisticated DevOps logic), an optimizer (such as hyperparameter optimization or sweeping), or an application (such as interactive Bokeh apps for increased data transparency).
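As a sketch of how a service Task reaches that queue, the following Python snippet creates a long-running controller Task and hands it over to the services queue; the project and task names are hypothetical.

    from clearml import Task

    # Create the controller Task locally (names are hypothetical).
    task = Task.init(project_name="devops", task_name="pipeline-controller")

    # Enqueue the Task on the "services" queue; an agent running in services
    # mode picks it up and launches it in its own container, and the local
    # process exits here.
    task.execute_remotely(queue_name="services", exit_process=True)

    # From this point on, the code runs inside the container launched by the agent.
    print("Controller service is running")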

Warning

Do not enqueue training or inference Tasks into the services queue. They will put an unnecessary load on the server.