Upgrading Your Trains Server from v0.15 or Older to ClearML Server¶
Important
This documentation page applies to deploying your own open source ClearML Server. It does not apply to ClearML Hosted Service users.
In v0.16, the Elasticsearch subsystem of Trains Server was upgraded from version 5.6 to version 7.6. This change necessitates the migration of the database contents to accommodate the change in index structure across the different versions.
This page provides the instructions to carry out the migration process. Follow this process if you are using Trains Server version 0.15 or older and are upgrading to ClearML Server.
The migration process makes use of a script that automatically performs the following:
Back up the existing Trains Server Elasticsearch data.
Launch a pair of Elasticsearch 5 and Elasticsearch 7 migration containers.
Copy the Elasticsearch indices using the migration containers.
Terminate the migration containers.
Rename the original data directory to avoid accidental reuse.
Warning
Once the migration process completes successfully, the data is no longer accessible to the older version of Trains Server, and ClearML Server needs to be installed.
Prerequisites¶
Read/write permissions for the default Trains Server data directory /opt/trains/data and its subdirectories, or, if you do not use this default directory, for the directory and subdirectories you do use.
A minimum of 8GB system RAM.
Free disk space of at least 30% plus two times the size of your data.
Python version >=2.7 or >=3.6, accessible from the command line as python.
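As a rough pre-flight check, the disk space prerequisite can be estimated with a short script. The following is a minimal Python 3 sketch and is not part of the migration package. The function names are hypothetical, and it assumes that "30%" means 30% of the total capacity of the disk that holds the data directory, which the prerequisite does not state explicitly.

```python
import os
import shutil


def directory_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symbolic links so linked files are not counted.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def required_free_bytes(total_disk_bytes, data_bytes):
    """Assumption: 30% of the total disk capacity plus twice the data size."""
    return total_disk_bytes * 30 // 100 + 2 * data_bytes


def has_enough_space(data_dir):
    """Compare the free space on the disk holding data_dir to the estimate."""
    usage = shutil.disk_usage(data_dir)
    needed = required_free_bytes(usage.total, directory_size(data_dir))
    return usage.free >= needed
```

For example, calling has_enough_space("/opt/trains/data") before starting the migration gives an early indication of whether the disk is likely to fill up. Treat the result as an estimate only.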
Migrating the data¶
To migrate the data:
Shut down Trains Server, if it is up.
Linux and macOS
docker-compose -f /opt/trains/docker-compose.yml down
Windows
docker-compose -f c:\opt\trains\docker-compose-win10.yml down
Kubernetes
kubectl delete -k overlays/current_version
Kubernetes using Helm
helm del --purge trains-server
kubectl delete namespace trains
For Kubernetes and Kubernetes using Helm, connect to the node in the Kubernetes cluster that you labeled app=trains.
Download the migration package archive.
curl -L -O https://github.com/allegroai/trains-server/releases/download/0.16.0/trains-server-0.16.0-migration.zip
If you need to download the file manually, use this direct link: trains-server-0.16.0-migration.zip.
Extract the archive.
unzip trains-server-0.16.0-migration.zip -d /opt/trains
Migrate the data.
Linux, macOS, and Windows if you manage your own containers.
Run the migration script. If you use elevated privileges to run Docker (sudo in Linux, or admin in Windows), then use elevated privileges to run the migration script.
python elastic_upgrade.py [-s|--source <source_path>] [-t|--target <target_path>] [-n|--no-backup] [-p|--parallel]
The optional command line parameters can be used to control the execution of the migration script:
<source_path> - The path to the Elasticsearch data directory in your current Trains Server deployment. If not specified, uses the default value of /opt/trains/data/elastic (or c:\opt\trains\data\elastic in Windows).
<target_path> - The path to the directory where the migrated Elasticsearch 7 data is written. If not specified, uses the default value of /opt/trains/data/elastic_7 (or c:\opt\trains\data\elastic_7 in Windows).
no-backup - Skip creating a backup of the existing Elasticsearch data directory before performing the migration. If not specified, takes the default value of False (perform the backup).
parallel - Copy several indices in parallel to utilize more CPU cores. If not specified, parallel indexing is turned off.
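If you launch the migration from another script, the command line can be assembled from the documented options. This is a minimal sketch: the helper name build_migration_command and its keyword arguments are hypothetical, and only the flags listed above are used.

```python
def build_migration_command(source=None, target=None, no_backup=False,
                            parallel=False, python="python"):
    """Build the argument list for elastic_upgrade.py from optional settings."""
    cmd = [python, "elastic_upgrade.py"]
    if source:
        # Omitting --source lets the script use its default source path.
        cmd += ["--source", source]
    if target:
        # Omitting --target lets the script use its default target path.
        cmd += ["--target", target]
    if no_backup:
        cmd.append("--no-backup")
    if parallel:
        cmd.append("--parallel")
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the directory that contains elastic_upgrade.py. Leaving every argument at its default produces the plain python elastic_upgrade.py invocation, which keeps the backup enabled.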
Kubernetes
Clone the clearml-server-k8s repository and change to the new clearml-server-k8s/upgrade-elastic directory:
git clone https://github.com/allegroai/clearml-server-k8s.git && cd clearml-server-k8s/upgrade-elastic
Create the upgrade-elastic namespace and deployments:
kubectl apply -k overlays/current_version
Wait for the job to complete. To check whether it has completed, run:
kubectl get jobs -n upgrade-elastic
Kubernetes using Helm
Add the clearml-server repository to your Helm client.
helm repo add allegroai https://allegroai.github.io/clearml-server-helm/
Confirm the clearml-server repository is now in your Helm client.
helm search clearml
The helm search results must include allegroai/upgrade-elastic-helm.
Install upgrade-elastic-helm on your cluster:
helm install allegroai/upgrade-elastic-helm --namespace=upgrade-elastic --name upgrade
An upgrade-elastic namespace is created in your cluster, and the upgrade is deployed in it.
Wait for the job to complete. To check whether it has completed, run:
kubectl get jobs -n upgrade-elastic
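For either Kubernetes path, the completion check can be automated by reading the JSON form of the same command (kubectl get jobs -n upgrade-elastic -o json). The sketch below only parses that output. The function name is hypothetical, and it assumes that a finished job reports a condition of type Complete with status True, as the Kubernetes Job API does.

```python
import json


def job_completion(jobs_json):
    """Map each job name to True if it reports a Complete condition."""
    data = json.loads(jobs_json)
    result = {}
    for job in data.get("items", []):
        name = job["metadata"]["name"]
        # A job that has not finished may have no conditions at all.
        conditions = job.get("status", {}).get("conditions") or []
        result[name] = any(
            c.get("type") == "Complete" and c.get("status") == "True"
            for c in conditions
        )
    return result
```

The JSON text can be captured with subprocess.run(["kubectl", "get", "jobs", "-n", "upgrade-elastic", "-o", "json"], capture_output=True, text=True).stdout and passed to job_completion. A job that maps to False has either not finished or has failed, so inspect it with kubectl before continuing.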
Finishing up¶
To finish up, first verify the data migration, and then conclude the upgrade.
Step 1. Verifying the data migration¶
Upon successful completion, the migration script renames the original Trains Server directory which contains the now migrated data, and prints a completion message:
Renaming the source directory /opt/trains/data/elastic to /opt/trains/data/elastic_migrated_<date_time>.
Upgrade completed.
All console output during the execution of the migration script is saved to a log file in the directory where the migration script executes:
<path_to_script>/upgrade_to_7_<date_time>.log
If the migration script does not complete successfully, the migration script prints the error.
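The two signs of success described above, the renamed source directory and the completion message in the log, can be checked with a short script. This is a minimal sketch with hypothetical function names. It assumes the default data directory /opt/trains/data and that the "Upgrade completed." message appears in the log file, which follows from the statement that all console output is saved there.

```python
import glob
import os


def find_migration_artifacts(data_dir="/opt/trains/data", script_dir="."):
    """List renamed source directories and migration log files, if any."""
    return {
        "renamed_source": sorted(
            glob.glob(os.path.join(data_dir, "elastic_migrated_*"))
        ),
        "logs": sorted(
            glob.glob(os.path.join(script_dir, "upgrade_to_7_*.log"))
        ),
    }


def log_reports_completion(log_text):
    """Return True if the log text contains the completion message."""
    return "Upgrade completed." in log_text
```

An empty renamed_source list, or a log without the completion message, suggests that the migration did not finish and that the log should be read for the printed error.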
Important
For help in resolving migration issues, check the allegro-clearml Slack Channel, GitHub Issues, and the ClearML Server FAQ.
Step 2. Completing the installation¶
After verifying the data migration completed successfully, you must conclude the ClearML Server installation process.
Linux or macOS¶
For Linux or macOS, conclude with the steps in this section. For other deployment formats, see below.
Important: Upgrading from v0.14 or older
For Linux only, if you are upgrading from Trains Server v0.14 or older, configure the ClearML Agent Services.
If CLEARML_HOST_IP is not provided, then ClearML Agent Services will use the external public address of the ClearML Server.
If CLEARML_AGENT_GIT_USER / CLEARML_AGENT_GIT_PASS are not provided, then ClearML Agent Services will not be able to access any private repositories for running service tasks.
export CLEARML_HOST_IP=server_host_ip_here
export CLEARML_AGENT_GIT_USER=git_username_here
export CLEARML_AGENT_GIT_PASS=git_password_here
Note
For backwards compatibility, the environment variables TRAINS_HOST_IP, TRAINS_AGENT_GIT_USER, and TRAINS_AGENT_GIT_PASS are supported.
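Before starting the server, it can help to confirm which of these variables are actually set in your shell. The following is a minimal sketch with hypothetical names. It prefers the CLEARML_ name over the TRAINS_ name when both are set, which is an assumption of this sketch and not a documented statement about how ClearML Agent Services resolves them.

```python
import os

# Each new variable name paired with its legacy Trains equivalent.
VARIABLE_PAIRS = [
    ("CLEARML_HOST_IP", "TRAINS_HOST_IP"),
    ("CLEARML_AGENT_GIT_USER", "TRAINS_AGENT_GIT_USER"),
    ("CLEARML_AGENT_GIT_PASS", "TRAINS_AGENT_GIT_PASS"),
]


def resolve_agent_settings(environ=None):
    """Return the value found for each setting, or None if neither name is set."""
    environ = os.environ if environ is None else environ
    settings = {}
    for new_name, legacy_name in VARIABLE_PAIRS:
        # Assumption: the CLEARML_ name wins when both are present.
        settings[new_name] = environ.get(new_name) or environ.get(legacy_name)
    return settings
```

Any entry that comes back as None corresponds to one of the fallback behaviours described above, for example no access to private repositories when the Git credentials are missing.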
We recommend backing up your data and, if your configuration folder is not empty, backing up your configuration.
For example, if your data and configuration folders are in /opt/trains, then archive all data into ~/trains_backup_data.tgz, and your configuration into ~/trains_backup_config.tgz:
sudo tar czvf ~/trains_backup_data.tgz -C /opt/trains/data .
sudo tar czvf ~/trains_backup_config.tgz -C /opt/trains/config .
Rename /opt/trains and its subdirectories to /opt/clearml.
sudo mv /opt/trains /opt/clearml
Download the latest docker-compose.yml file.
curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
Start up ClearML Server. This automatically pulls the latest ClearML Server build.
docker-compose -f /opt/clearml/docker-compose.yml pull
docker-compose -f /opt/clearml/docker-compose.yml up -d
If issues arise during your upgrade, see the FAQ page, How do I fix Docker upgrade errors?.
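If you prefer to script the recommended backup rather than run tar by hand, the same archive layout can be produced with the Python standard library. This is a minimal sketch: the function name is hypothetical, and it must run with enough privileges to read the data and configuration directories.

```python
import os
import tarfile


def archive_directory(source_dir, archive_path):
    """Create a gzip-compressed tar of source_dir with paths relative to it."""
    archive_path = os.path.expanduser(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname="." mirrors the effect of: tar czvf <archive> -C <source_dir> .
        tar.add(source_dir, arcname=".")
    return archive_path
```

For example, archive_directory("/opt/trains/data", "~/trains_backup_data.tgz") corresponds to the first tar command in the backup step. Run it before renaming /opt/trains.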
Other deployment formats¶
To conclude the upgrade for deployment formats other than Linux, follow their upgrade instructions: