Deploying Trains Server: Kubernetes Using Helm
Trains is now ClearML
This documentation applies to the legacy Trains versions. For the latest documentation, see ClearML.
Prerequisites
- A Kubernetes cluster.
- kubectl is installed and configured (see Install and Set Up kubectl in the Kubernetes documentation).
- helm is installed (see Installing Helm in the Helm documentation).
- One node labeled app=trains.
Trains Server deployment uses node storage
If more than one node is labeled as app=trains, and you redeploy or update later, then Trains Server may not locate all your data.
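To confirm that exactly one node carries the app=trains label, a check along the following lines may help. This is a minimal sketch: the function names are hypothetical helpers (not part of Trains), and it assumes kubectl is already configured for your cluster.

```shell
#!/bin/sh
# Count the nodes that carry the app=trains label.
# awk counts non-empty lines and always prints a number (0 if there are none).
count_trains_nodes() {
    kubectl get nodes -l app=trains --no-headers 2>/dev/null |
        awk 'NF { c++ } END { print c + 0 }'
}

# Succeed only when exactly one node is labeled, as the deployment expects.
check_single_trains_node() {
    n="$(count_trains_nodes)"
    if [ "$n" -eq 1 ]; then
        echo "ok: exactly one node labeled app=trains"
    else
        echo "warning: $n nodes labeled app=trains (expected 1)" >&2
        return 1
    fi
}
```

To apply the label in the first place, run kubectl label nodes <node-name> app=trains, where <node-name> is a placeholder for a name listed by kubectl get nodes. Then call check_single_trains_node before deploying.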
Deploying
Securing deployment
By default, Trains Server deploys as an open network. To restrict Trains Server access, follow the instructions in the Network and security section, on the "Configuring Trains Server" page.
Step 1: Modify Elasticsearch default values in the Docker configuration file
Before deploying Trains Server in a Kubernetes cluster, you must modify several Elasticsearch settings in the Docker configuration. For more information, see Install Elasticsearch with Docker in the Elasticsearch documentation and Daemon configuration file in the Docker documentation.
To modify Elasticsearch default values in your Docker configuration file:
- Connect to the node in the Kubernetes cluster that you labeled app=trains.
- If your system contains a /etc/sysconfig/docker Docker configuration file, then edit it and add the options in quotes to the available arguments in the OPTIONS section:
  OPTIONS="--default-ulimit nofile=1024:65536 --default-ulimit memlock=-1:-1"
- If your system does not contain a /etc/sysconfig/docker Docker configuration file, then create or edit a /etc/docker/daemon.json file and add or modify the default-ulimits section as the following example shows:
  {
    "default-ulimits": {
      "nofile": { "name": "nofile", "hard": 65536, "soft": 1024 },
      "memlock": { "name": "memlock", "soft": -1, "hard": -1 }
    }
  }
- Elasticsearch requires that the vm.max_map_count kernel setting, which is the maximum number of memory map areas a process can use, is set to at least 262144.
  For CentOS 7, Ubuntu 16.04, Mint 18.3, Ubuntu 18.04 and Mint 19.x, we tested the following commands to set vm.max_map_count:
  echo "vm.max_map_count=262144" > /tmp/99-trains.conf
  sudo mv /tmp/99-trains.conf /etc/sysctl.d/99-trains.conf
  sudo sysctl -w vm.max_map_count=262144
- Restart docker:
  sudo service docker restart
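After completing Step 1, you may want to verify that the kernel setting took effect. The following is a minimal sketch, assuming a Linux node where sysctl is available. The function name and the variable REQUIRED_MAX_MAP_COUNT are hypothetical and only illustrate the check.

```shell
#!/bin/sh
# Minimum vm.max_map_count that Elasticsearch requires (value from Step 1).
REQUIRED_MAX_MAP_COUNT=262144

# Succeed when the given value is high enough.
# With no argument, read the live kernel setting through sysctl.
max_map_count_ok() {
    current="${1:-$(sysctl -n vm.max_map_count 2>/dev/null)}"
    [ -n "$current" ] && [ "$current" -ge "$REQUIRED_MAX_MAP_COUNT" ]
}
```

For example, max_map_count_ok && echo "vm.max_map_count is sufficient" checks the node you are connected to, while max_map_count_ok 65530 tests a specific value.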
Step 2: Deploy Trains Server in Kubernetes using Helm
After modifying several Elasticsearch settings in your Docker configuration (see Step 1), you can deploy Trains Server.
To deploy Trains Server in Kubernetes using Helm:
- Add the trains-server repository to your Helm repositories:
  helm repo add allegroai https://allegroai.github.io/trains-server-helm/
- Confirm the trains-server repository is now in Helm:
  helm search trains
  The helm search results must include allegroai/trains-server-chart.
- Install trains-server-chart on your cluster:
  helm install allegroai/trains-server-chart --namespace=trains --name trains-server
  A trains namespace is created in your cluster and trains-server is deployed in it.
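The commands in this step use Helm 2 syntax (helm search with a bare keyword, and the --name flag). If your client is Helm 3, the release name is passed as a positional argument and the namespace is not created automatically. The sketch below prints (it does not run) an install command for either major version. The function name is hypothetical, the Helm 3 form is an assumed equivalent that was not tested against this chart, and --create-namespace needs Helm 3.2 or later.

```shell
#!/bin/sh
# Print the install command for a given Helm major version so that it can be
# reviewed before running. Release, chart, and namespace names come from Step 2.
trains_install_cmd() {
    helm_major="$1"
    case "$helm_major" in
        2)
            echo "helm install allegroai/trains-server-chart --namespace=trains --name trains-server"
            ;;
        3)
            # Helm 3: release name is positional; the namespace must be created explicitly.
            echo "helm install trains-server allegroai/trains-server-chart --namespace trains --create-namespace"
            ;;
        *)
            echo "unsupported Helm major version: $helm_major" >&2
            return 1
            ;;
    esac
}
```

Similarly, with Helm 3 the repository check becomes helm search repo trains. Run helm version to see which client you have.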
Port Mapping
After Trains Server is deployed, the services expose the following node ports:
- API server on 30008.
- Web server on 30080.
- File server on 30081.
The node ports map to the following container ports:
- 30080 maps to the trains-webserver container on port 8080
- 30008 maps to the trains-apiserver container on port 8008
- 30081 maps to the trains-fileserver container on port 8081
We recommend using the container ports (8080, 8008, and 8081), or a load balancer (see the next section, Accessing Trains Server).
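For a quick connectivity check before a load balancer is in place, the node port endpoints can be listed for a given node address. This is a minimal sketch: the function name is hypothetical, and it assumes the services answer over plain HTTP on the node ports listed above.

```shell
#!/bin/sh
# Print the node port endpoint of each Trains service for a node address.
# The address is whatever reaches your app=trains node (IP or hostname).
trains_node_urls() {
    node="$1"
    echo "web   http://${node}:30080"
    echo "api   http://${node}:30008"
    echo "files http://${node}:30081"
}
```

For example, trains_node_urls 10.0.0.5 prints three lines, one per service, which you can then open in a browser or pass to a tool such as curl.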
Accessing Trains Server
To access your Trains Server:
- Create a load balancer and domain with records pointing to Trains Server using the following rules which Trains uses to translate domain names:
  - The record to access the Trains Web-App (UI): *app.<your domain name>.*
    For example, trainsapp.mydomainname.com points to your node on port 30080.
  - The record to access the Trains API: *api.<your domain name>.*
    For example, trainsapi.mydomainname.com points to your node on port 30008.
  - The record to access the Trains file server: *files.<your domain name>.*
    For example, trainsfiles.mydomainname.com points to your node on port 30081.
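The naming rules above can be sketched as a small lookup from hostname to service and node port. This is only an illustration: the function name is hypothetical, and it assumes the rule means "the first DNS label ends in app, api, or files", which is how the examples read.

```shell
#!/bin/sh
# Map a hostname to the Trains service and node port implied by its first DNS label.
trains_target_for_host() {
    label="${1%%.*}"    # first DNS label, for example "trainsapp"
    case "$label" in
        *app)   echo "web 30080" ;;
        *api)   echo "api 30008" ;;
        *files) echo "files 30081" ;;
        *)
            echo "no Trains naming rule matches $1" >&2
            return 1
            ;;
    esac
}
```

For example, trains_target_for_host trainsapi.mydomainname.com reports the API service on node port 30008, matching the second rule.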
Upgrading
Use the current release
We strongly encourage you to keep your Trains Server up to date, by upgrading to the current release.
- Upgrade using a new or upgraded values.yaml file:
  helm upgrade trains-server allegroai/trains-server-chart -f new-values.yaml
- If you previously deployed a Trains Server, you must first delete old deployments using the following command:
  helm delete --purge trains-server
- If you are upgrading from Trains Server version 0.15 or older, a data migration is required before you upgrade. First follow these data migration instructions, and then continue this upgrade.
- Upgrade your deployment to match the repository version:
  helm upgrade trains-server allegroai/trains-server-chart
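The "version 0.15 or older" condition can be expressed as a small guard to run before upgrading. This is a minimal sketch: the function name is hypothetical, you supply the currently deployed version yourself, and treating patch releases such as 0.15.1 as "0.15" is an assumption.

```shell
#!/bin/sh
# Succeed when the given Trains Server version is 0.15 or older,
# meaning the data migration must be done before the upgrade.
needs_data_migration() {
    version="$1"
    major="${version%%.*}"   # text before the first dot
    rest="${version#*.}"     # text after the first dot
    minor="${rest%%.*}"      # second component, ignoring any patch level
    [ "$major" -eq 0 ] && [ "$minor" -le 15 ]
}
```

For example, needs_data_migration 0.14.2 && echo "migrate data first" prints the reminder, while the same call with 0.16.0 prints nothing.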