Deploying ClearML Server: Kubernetes

Important

This documentation page applies to deploying your own open source ClearML Server. It does not apply to ClearML Hosted Service users.

This page describes the prerequisites and procedures for deploying ClearML Server to Kubernetes clusters using manual instructions, as well as accessing ClearML Server, upgrading it, and its port mappings.

To deploy ClearML Server to Kubernetes using Helm, see Deploying ClearML Server: Kubernetes using Helm.

Warning

If you are reinstalling ClearML Server, we recommend clearing your browser cookies for ClearML Server. For example, go to Developer Tools > Storage > Cookies (Firefox) or Developer Tools > Application > Cookies (Chrome), and delete all cookies under the ClearML Server URL.

Prerequisites

  • A Kubernetes cluster.

  • kubectl is installed and configured (see Install and Set Up kubectl in the Kubernetes documentation).

  • One node labeled app=clearml (a labeling example follows the warning below).

Warning

ClearML Server deployment uses node storage. If more than one node is labeled as app=clearml, and you redeploy or update later, then ClearML Server may not locate all your data.
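
You can label the node with kubectl, as in the following example; my-node is a placeholder for your node's name:

     # Label the node that will host ClearML Server data (replace my-node)
     kubectl label nodes my-node app=clearml

     # Confirm that exactly one node carries the label
     kubectl get nodes -l app=clearml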

Deploying

Warning

By default, ClearML Server is deployed with unrestricted access. To restrict ClearML Server access, follow the instructions in the Securing Your Own ClearML Server page.

Step 1: Modify Elasticsearch default values in the Docker configuration file

Before deploying ClearML Server in a Kubernetes cluster, you must modify several Elasticsearch settings in the Docker configuration. For more information, see Install Elasticsearch with Docker in the Elasticsearch documentation and Daemon configuration file in the Docker documentation.

To modify Elasticsearch default values in your Docker configuration file:

  1. Connect to the node in the Kubernetes cluster that you labeled app=clearml.

  2. Create or edit (if one exists) the /etc/docker/daemon.json file, and add or modify the default-ulimits section as the following example shows:

     {
         "default-ulimits": {
             "nofile": {
                 "name": "nofile",
                 "hard": 65536,
                 "soft": 1024
             },
             "memlock":
             {
                 "name": "memlock",
                 "soft": -1,
                 "hard": -1
             }
         }
     }
    
  3. Elasticsearch requires that the vm.max_map_count kernel setting, which is the maximum number of memory map areas a process can use, be set to at least 262144.

    On CentOS 7, Ubuntu 16.04, Mint 18.3, Ubuntu 18.04, and Mint 19.x, we tested the following commands to set vm.max_map_count:

     echo "vm.max_map_count=262144" > /tmp/99-clearml.conf
     sudo mv /tmp/99-clearml.conf /etc/sysctl.d/99-clearml.conf
     sudo sysctl -w vm.max_map_count=262144
    
  4. Restart Docker:

     sudo service docker restart
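
Once Docker restarts, you can optionally verify that both settings took effect; this is a quick sanity check, assuming a standard ubuntu image is available, not part of the official procedure:

     # Should print vm.max_map_count = 262144
     sysctl vm.max_map_count

     # Inside a new container, nofile and memlock should reflect the new defaults
     docker run --rm ubuntu bash -c 'ulimit -n; ulimit -l'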
    

Step 2. Deploy ClearML Server in the Kubernetes Cluster

After modifying several Elasticsearch settings in your Docker configuration (see Step 1 above), you can deploy ClearML Server.

To deploy ClearML Server in your Kubernetes cluster:

  1. Clone the clearml-server-k8s repository and change to the new clearml-server-k8s directory:

     git clone https://github.com/allegroai/clearml-server-k8s.git && cd clearml-server-k8s/clearml-server-k8s
    
  2. Create the clearml namespace and deployments:

     kubectl apply -k overlays/current_version 
    

    Note

    This installs the templates for the current clearml-server version and updates patch versions whenever the deployment is restarted (or reinstalled).

     To use the latest version, which is not recommended:
    
         kubectl apply -k base
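
Either way, once the manifests are applied you can watch the deployment come up; the pods and services land in the clearml namespace created above:

     # Watch the ClearML Server pods until they reach Running status
     kubectl get pods -n clearml -w

     # List the services and their exposed node ports
     kubectl get services -n clearml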
    

Port Mapping

After deploying ClearML Server, the services expose the following node ports:

  • API server on 30008.

  • Web server on 30080.

  • File server on 30081.
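
To confirm the node ports respond, you can query any cluster node directly; <node-ip> is a placeholder, and the debug.ping health endpoint on the API server is an assumption based on common ClearML Server probe configurations:

     # API server on its node port (replace <node-ip> with any cluster node's IP)
     curl http://<node-ip>:30008/debug.ping

     # Web server should answer on 30080
     curl -I http://<node-ip>:30080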

Accessing ClearML Server

To access your ClearML Server, do the following:

  1. Create domain records.

    • Create three records for the ClearML Server web server, file server, and API access using the following rules:

      • app.<your domain name>

      • files.<your domain name>

      • api.<your domain name>

    For example:

    • app.clearml.mydomainname.com

    • files.clearml.mydomainname.com

    • api.clearml.mydomainname.com

  2. Point the records you created to the load balancer.

  3. Configure the load balancer to redirect traffic coming from the records you created:

    • app.<your domain name> should be redirected to k8s cluster nodes on port 30080

    • files.<your domain name> should be redirected to k8s cluster nodes on port 30081

    • api.<your domain name> should be redirected to k8s cluster nodes on port 30008
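
Once the records point at the load balancer and the rules above are in place, you can verify resolution and routing from any machine; the hostnames below follow the earlier example:

     # Each record should resolve to the load balancer
     dig +short app.clearml.mydomainname.com
     dig +short files.clearml.mydomainname.com
     dig +short api.clearml.mydomainname.com

     # The web server should answer through the load balancer
     curl -I http://app.clearml.mydomainname.com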

Upgrading

Note

We strongly encourage you to keep your ClearML Server up to date by upgrading to the current release.

To upgrade your ClearML Server in Kubernetes clusters:

  1. If you previously deployed a ClearML Server, you must first delete old deployments using the following command:

     kubectl delete -f .
    
  2. If you are upgrading from Trains Server version 0.15 or older to ClearML Server, a data migration is required before you upgrade. First follow these data migration instructions, and then continue this upgrade.

  3. Edit the YAML file you want to update, and then run the following command:

     kubectl apply -f <file you edited>.yaml
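
After re-applying, you can confirm that the upgraded pods come back up; this is an optional check, not part of the official procedure:

     # Pods in the clearml namespace should return to Running status
     kubectl get pods -n clearml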