Configuring Your Own ClearML Server

Important

This documentation page applies to deploying your own open source ClearML Server. It does not apply to ClearML Hosted Service users.

This page describes the ClearML Server deployment and feature configurations. Namely, it contains instructions on how to configure ClearML Server for:

For all configuration options, see the ClearML Reference page.

Important

We recommend using the latest version of ClearML Server.

ClearML Server Deployment Configuration

ClearML Server supports two deployment configurations: single IP (domain) and sub-domains.

Single IP (domain) configuration

Single IP (domain) with the following open ports:

  • Web application on port 8080

  • API service on port 8008

  • File storage service on port 8081

Sub-domain configuration

Sub-Domain configuration with default http/s ports (80 or 443):

  • Web application on sub-domain: app.*.*

  • API service on sub-domain: api.*.*

  • File storage service on sub-domain: files.*.*

When you configure sub-domains for ClearML Server, they will map to the ClearML Server internally configured ports for the Dockers. As a result, ClearML Server Dockers remain accessible if, for example, you implement some type of port forwarding.

Important

You must use app, api, and files as the sub-domain labels.

For example, if your domain is mydomain.com, and you create a sub-domain named clearml.mydomain.com, use the following:

  • app.clearml.mydomain.com (web server)

  • api.clearml.mydomain.com (API server)

  • files.clearml.mydomain.com (file server)

Accessing the ClearML Web UI with app.clearml.mydomain.com will automatically send API requests to api.clearml.mydomain.com.

ClearML Server Feature Configurations

ClearML Server features can be configured using either configuration files, or environment variables.

Configuration files

The ClearML Server uses the following configuration files:

  • apiserver.conf

  • hosts.conf

  • logging.conf

  • secure.conf

  • services.conf

The ClearML Server will look for these files when starting up, in the /opt/clearml/config directory (this path can be modified using the CLEARML_CONFIG_DIR environment variable). The default configuration files are in the clearml-server repository.

Note: Within the default structure, the services.conf file is represented by a subdirectory with service-specific .conf files. If you use services.conf to configure the server, any setting related to a file under the services subdirectory can simply be represented by a key within the services.conf file. For example, if you need to override multi_task_histogram_limit that appears in the default/services/tasks.conf, your services.conf file should contain:

tasks {
   "multi_task_histogram_limit": <new-value>
}

Environment variables

The ClearML Server supports several fixed environment variables that affect its behavior, as well as dynamic environment variable that can be used to override any configuration file setting.

Fixed environment variables

General

  • CLEARML_CONFIG_DIR allows overriding the default directory where the server looks for configuration files. Multiple directories can be specified (in the same format used for specifying the system’s PATH env var)

Database service overrides:

  • CLEARML_MONGODB_SERVICE_HOST allows overriding the hostname for the MongoDB service

  • CLEARML_MONGODB_SERVICE_PORT allows overriding the port for the MongoDB service

  • CLEARML_ELASTIC_SERVICE_HOST allows overriding the hostname for the ElasticSearch service

  • CLEARML_ELASTIC_SERVICE_PORT allows overriding the port for the ElasticSearch service

  • CLEARML_REDIS_SERVICE_HOST allows overriding the hostname for the Redis service

  • CLEARML_REDIS_SERVICE_PORT allows overriding the port for the Redis service

Dynamic environment variables

Dynamic environment variables can be used to override any configuration setting that appears in the configuration files.

The environment variable’s name should be CLEARML__<configuration-path>, where <configuration-path> represents the full path to the configuration field you’re setting, including the configuration file name. Elements of the configuration path should be separated by __ (double underscore).

For example, given the default secure.conf file contents:

    ...
    
    credentials {
        apiserver {
            role: "system"
            user_key: "defualt-key"
            user_secret: "default-secret"
        }
        
        ...
        
    }

You can override the default secret for the system’s apiserver component by setting the following environment variable: CLEARML__SECURE__CREDENTIALS__APISERVER__USER_SECRET="my-new-secret"

Note:

  • Since configuration fields may contain JSON-parsable values, make sure to always quote strings (otherwise the server might fail when trying to parse them)

  • In order to comply with environment variables standards, it is also recommended to only use upper-case characters in environment variable keys. For this reason, ClearML Server will always convert the configuration path specified in the dynamic environment variable’s key to lower-case before overriding configuration values with the environment variable value.

Configuration procedures

Sub-domains and load balancers

To illustrate this configuration, we provide the following example based on AWS load balancing:

  1. In your ClearML Server /opt/clearml/config/apiserver.conf file, add the following auth.cookies section:

     auth {
       cookies {
         httponly: true
         secure: true
         domain: ".clearml.mydomain.com"
         max_age: 99999999999
       }
     }
    
  2. Use the following load balancer configuration:

    • Listeners:

      • Optional: HTTP listener, that redirects all traffic to HTTPS.

      • HTTPS listener for app. forwarded to AppTargetGroup

      • HTTPS listener for api. forwarded to ApiTargetGroup

      • HTTPS listener for files. forwarded to FilesTargetGroup

    • Target groups:

      • AppTargetGroup: HTTP based target group, port 8080

      • ApiTargetGroup: HTTP based target group, port 8008

      • FilesTargetGroup: HTTP based target group, port 8081

    • Security and routing:

      • Load balancer: make sure the load balancers are able to receive traffic from the relevant IP addresses (Security groups and Subnets definitions).

      • Instances: make sure the load balancers are able to access the instances, using the relevant ports (Security groups definitions).

  3. Restart ClearML Server.

Opening Elasticsearch, MongoDB, and Redis for External Access

For improved security, the ports for ClearML Server Elasticsearch, MongoDB, and Redis servers are not exposed by default; they are only open internally in the docker network. If you need external access and understand the security risks, you can open these ports.

Warning

Opening the ports for Elasticsearch, MongoDB, and Redis for external access may pose a security concern and is not recommended unless you know what you’re doing. Network security measures, such as firewall configuration, should be considered when opening ports for external access.

To open external access to the Elasticsearch, MongoDB, and Redis ports:

  1. Shutdown ClearML Server. Executing the following command (which assumes the configuration file is in the environment path).

     docker-compose down
    
  2. Edit your docker-compose.yml file as follows:

    • In the elasticsearch section, add the two lines:

        ports:
        - "9200:9200"
      
    • In the mongo section, add the two lines:

        ports:
        - "27017:27017"
      
    • In the redis section, add the two lines:

        ports:
        - "6379:6379"
      
  3. Startup ClearML Server.

     docker-compose -f docker-compose.yml pull
     docker-compose -f docker-compose.yml up -d
    

Web Login Authentication

You can configure the ClearML Server for web login authentication which permits only those users who are provided with credentials to access your ClearML system. Those credentials are a user name and password.

Without web login authentication, ClearML Server does not restrict access (by default).

To add web login authentication to your ClearML Server:

  1. In your ClearML Server /opt/clearml/config/apiserver.conf, add the auth.fixed_users section and specify the users.

    For example:

     auth {
         # Fixed users login credentials
         # No other user will be able to login
         fixed_users {
             enabled: true
             pass_hashed: false
             users: [
                 {
                     username: "jane"
                     password: "12345678"
                     name: "Jane Doe"
                 },
                 {
                     username: "john"
                     password: "12345678"
                     name: "John Doe"
                 },
             ]
         }
     }
    
  2. Restart ClearML Server.

Using hashed passwords

You can also use hashed passwords instead of plain-text passwords. To do that:

  • Set pass_hashed: true

  • Use a base64-encoded hashed password in the password field instead of a plain-text password. Assuming Jane’s plain-text password is 123456, use the following bash command to generate the base64-encoded hashed password:

    > python3 -c 'import bcrypt,base64; print(base64.b64encode(bcrypt.hashpw("123456".encode(), bcrypt.gensalt())))'
    b'JDJiJDEyJDk3OHBFcHFlNEsxTkFoZDlPcGZsbC5sU1pmM3huZ1RpeHc0ay5WUjlzTzN5WE1WRXJrUmhp'
    
  • Use the command’s output as the user’s password. Resulting apiserver.conf file should look as follows:

     auth {
         # Fixed users login credentials
         # No other user will be able to login
         fixed_users {
             enabled: true
             pass_hashed: true
             users: [
                 {
                     username: "jane"
                     password: "JDJiJDEyJDk3OHBFcHFlNEsxTkFoZDlPcGZsbC5sU1pmM3huZ1RpeHc0ay5WUjlzTzN5WE1WRXJrUmhp"
                     name: "Jane Doe"
                 }
             ]
         }
     }
    

Non-responsive Task watchdog

The non-responsive experiment watchdog monitors experiments that were not updated for a specified time interval and then the watchdog marks them as aborted. The non-responsive experiment watchdog is always active.

You can modify the following settings for the watchdog:

  • The time threshold (in seconds) of experiment inactivity (default value is 7200 seconds (2 hours)).

  • The time interval (in seconds) between watchdog cycles.

To configure the non-responsive watchdog for your ClearML Server:

  1. In your ClearML Server /opt/clearml/config/services.conf file, add or edit the tasks.non_responsive_tasks_watchdog and specify the watchdog settings.

    For example:

     tasks {
         non_responsive_tasks_watchdog {
             # In-progress tasks that haven't been updated for at least 'value' seconds will be stopped by the watchdog
             threshold_sec: 7200
    
             # Watchdog will sleep for this number of seconds after each cycle
             watch_interval_sec: 900
         }
     }
    
  2. Restart ClearML Server.