Explicit Reporting Tutorial

In this tutorial, you learn how to extend ClearML automagical capturing of inputs and outputs with explicit reporting. We add the following to one of the example scripts from the ClearML repository, pytorch_mnist.py:

  • Setting an output destination for model checkpoints (snapshots).

  • Explicitly logging a scalar, other (not scalar) data, and logging text.

  • Registering an artifact, which uploads it to ClearML Server and logs any changes made to it.

  • Uploading an artifact, which stores it on ClearML Server without logging later changes.


Prerequisites

  • The clearml repository is cloned.

  • The clearml package is installed.

Before you begin

Make a copy of pytorch_mnist.py so that you can add explicit reporting to it.

  • In the example directory of your local clearml repository, run:

      cp pytorch_mnist.py pytorch_mnist_tutorial.py

Step 1. Setting an output destination for model checkpoints

A default output location specifies where model checkpoints (snapshots) and artifacts are stored when the experiment runs. You can use a local destination, a shared folder, or cloud storage such as Amazon S3, Google Cloud Storage, and Azure Storage. Specify the output location in the output_uri parameter of the Task.init method. In this tutorial, we specify a local folder destination.

In pytorch_mnist_tutorial.py, change the code from:

task = Task.init(project_name='examples', task_name='pytorch mnist train')


to:

model_snapshots_path = '/mnt/clearml'
if not os.path.exists(model_snapshots_path):
    os.makedirs(model_snapshots_path)

task = Task.init(project_name='examples', 
    task_name='extending automagical ClearML example', 
    output_uri=model_snapshots_path)

When the script runs, ClearML creates the following directory structure:

+ - <output destination name>
|   +-- <project name>
|       +-- <task name>.<Task Id>
|           +-- models
|           +-- artifacts

and puts the model checkpoints (snapshots) and artifacts in that folder.

For example, if the Task ID is 9ed78536b91a44fbb3cc7a006128c1b0, then the directory structure will be:

+ - model_snapshots
|   +-- examples
|       +-- extending automagical ClearML example.9ed78536b91a44fbb3cc7a006128c1b0
|           +-- models
|           +-- artifacts
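
The layout above can be sketched with the standard library. This is only an illustration of how the folder names compose, assuming the <task name>.<Task Id> pattern shown above. The helper expected_output_dirs is hypothetical and not part of ClearML, and model_snapshots stands in for the output destination name.

```python
from pathlib import PurePosixPath


def expected_output_dirs(destination, project_name, task_name, task_id):
    """Return the (models, artifacts) folders implied by the layout above.

    Hypothetical helper for illustration only; ClearML creates these
    folders itself when the script runs.
    """
    task_dir = PurePosixPath(destination) / project_name / f"{task_name}.{task_id}"
    return task_dir / "models", task_dir / "artifacts"


models_dir, artifacts_dir = expected_output_dirs(
    destination="model_snapshots",
    project_name="examples",
    task_name="extending automagical ClearML example",
    task_id="9ed78536b91a44fbb3cc7a006128c1b0",
)
print(models_dir)
print(artifacts_dir)
```

Both paths share the same task folder, so checkpoints and artifacts of one run stay together.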

Step 2. Logger class reporting methods

In addition to ClearML automagical logging, the ClearML Python package contains methods for explicit reporting of plots, log text, media, and tables. The following subsections show some of them.

Get a logger

First, create a logger for the Task using the Task.get_logger method.

logger = task.get_logger()

Plot scalar metrics

Add scalar metrics using the Logger.report_scalar method to report loss metrics.

def train(args, model, device, train_loader, optimizer, epoch):

    save_loss = []

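    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        # Keep the per-batch loss value
        save_loss.append(loss.item())
        optimizer.step()
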
        if batch_idx % args.log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                       100. * batch_idx / len(train_loader), loss.item()))
            # Add manual scalar reporting for loss metrics
            logger.report_scalar(title='Scalar example {} - epoch'.format(epoch), 
                series='Loss', value=loss.item(), iteration=batch_idx)
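
To see how the title, series, value, and iteration arguments group the reported points without running a training job, the sketch below uses a stand-in recorder. RecordingLogger is hypothetical and only mimics the keyword arguments used above; the loss values and the log interval are made up.

```python
from collections import defaultdict


class RecordingLogger:
    """Hypothetical stand-in that stores points instead of sending them to ClearML."""

    def __init__(self):
        # plots[title][series] -> list of (iteration, value) points
        self.plots = defaultdict(lambda: defaultdict(list))

    def report_scalar(self, title, series, value, iteration):
        self.plots[title][series].append((iteration, value))


logger = RecordingLogger()
log_interval = 2                               # assumed value of args.log_interval
fake_losses = [2.30, 1.90, 1.40, 1.10, 0.80]   # made-up per-batch losses

for epoch in (1, 2):
    for batch_idx, loss in enumerate(fake_losses):
        # Same condition as in train(): report every log_interval batches
        if batch_idx % log_interval == 0:
            logger.report_scalar(
                title='Scalar example {} - epoch'.format(epoch),
                series='Loss', value=loss, iteration=batch_idx)

for title, series in logger.plots.items():
    print(title, dict(series))
```

Because the epoch number is part of the title, the points of each epoch are collected under a separate title, with the batch index as the iteration.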

Plot other (not scalar) data

Our script contains a function named test, which computes the loss and the number of correct predictions for the trained model. We add a histogram and a confusion matrix to report them.

def test(args, model, device, test_loader):

    save_test_loss = []
    save_correct = []

    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            # sum up batch loss
            test_loss += F.nll_loss(output, target, reduction='sum').item()
            # get the index of the max log-probability
            pred = output.argmax(dim=1, keepdim=True)
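            correct += pred.eq(target.view_as(pred)).sum().item()

            # Keep the running totals so they can be reported below
            save_test_loss.append(test_loss)
            save_correct.append(correct)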

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))

    logger.report_histogram(title='Histogram example', series='correct',
        iteration=1, values=save_correct, xaxis='Test', yaxis='Correct')

    # Manually report test loss and correct as a confusion matrix
    matrix = np.array([save_test_loss, save_correct])
    logger.report_confusion_matrix(title='Confusion matrix example', 
        series='Test loss / correct', matrix=matrix, iteration=1)
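
Assuming one value is appended to save_test_loss and save_correct per batch, the collected values are running totals, because test_loss and correct accumulate across batches. The sketch below uses made-up per-batch numbers and plain lists in place of the numpy array to show the two-row layout that is passed as matrix.

```python
# Made-up per-batch results: (summed batch loss, correct predictions in batch)
fake_batches = [(120.0, 950), (110.0, 960), (100.0, 970)]

save_test_loss = []
save_correct = []
test_loss = 0.0
correct = 0

for batch_loss, batch_correct in fake_batches:
    test_loss += batch_loss
    correct += batch_correct
    # Assumption: one running total is stored per batch
    save_test_loss.append(test_loss)
    save_correct.append(correct)

# Two rows (loss, correct), one column per batch
matrix = [save_test_loss, save_correct]
print(len(matrix), len(matrix[0]))
print(matrix)
```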

Log text

You can extend ClearML by explicitly logging text, including errors, warnings, and debugging statements. We use the Logger.report_text method and its level argument to report a debugging message.

logger.report_text('The default output destination for model snapshots and artifacts is: {}'.format(model_snapshots_path), level=logging.DEBUG)
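
The level argument above is given as a constant from the standard logging module, so the script needs import logging. A minimal sketch of building the same message and inspecting the level, using only the standard library (no ClearML call is made here):

```python
import logging

model_snapshots_path = '/mnt/clearml'

message = 'The default output destination for model snapshots and artifacts is: {}'.format(
    model_snapshots_path)
level = logging.DEBUG

# Level constants are plain integers with a readable name
print(logging.getLevelName(level), level)
print(message)
```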

Step 3. Registering artifacts

Registering an artifact uploads it to ClearML Server, and if it changes, the change is logged in ClearML Server. Currently, ClearML supports Pandas DataFrames as registered artifacts.

Register the artifact

In the test function of the tutorial script, we assign the test loss and correct data to a Pandas DataFrame and register that DataFrame using the Task.register_artifact method.

# Create the Pandas DataFrame
test_loss_correct = {
        'test lost': save_test_loss,
        'correct': save_correct
    }
df = pd.DataFrame(test_loss_correct, columns=['test lost','correct'])

# Register the test loss and correct as a Pandas DataFrame artifact
task.register_artifact('Test_Loss_Correct', df, metadata={'metadata string': 'apple', 
    'metadata int': 100, 'metadata dict': {'dict string': 'pear', 'dict int': 200}})

Reference the registered artifact

Once an artifact is registered, you can reference it in your Python experiment script and work with it.

In the tutorial script, we call the Task.current_task and Task.get_registered_artifacts methods to retrieve the registered DataFrame and take a sample of it.

# Once the artifact is registered, we can get it and work with it. Here, we sample it.
sample = Task.current_task().get_registered_artifacts()['Test_Loss_Correct'].sample(frac=0.5, 
    replace=True, random_state=1)
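
The sample(frac=0.5, replace=True, random_state=1) call draws half as many rows as the DataFrame holds, with replacement, using a fixed seed. The sketch below shows the same idea with the standard library's random module on a list of made-up rows. It is an analogy rather than pandas, so the rows it picks will not match what pandas would pick.

```python
import random

# Made-up (test loss, correct) rows standing in for the DataFrame
rows = [(120.0, 950), (230.0, 1910), (330.0, 2880), (425.0, 3860)]

frac = 0.5
k = round(frac * len(rows))          # number of rows to draw

rng = random.Random(1)               # fixed seed for repeatable draws
sample = rng.choices(rows, k=k)      # choices() samples with replacement

print(sample)
```

Sampling with replacement means the same row can appear more than once in the result.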

Step 4. Uploading artifacts

Uploading an artifact uploads it to ClearML Server, but changes are not logged. Supported artifact types include:

  • Pandas DataFrames

  • Files of any type, including image files

  • Folders - stored as ZIP files

  • Images - stored as PNG files

  • Dictionaries - stored as JSONs

  • Numpy arrays - stored as NPZ files

In the tutorial script, we upload the loss data as an artifact using the Task.upload_artifact method with metadata specified in the metadata parameter.

# Upload test loss as an artifact. Here, the artifact is a numpy array
task.upload_artifact('Loss', artifact_object=np.array(save_test_loss),
    metadata={'metadata string': 'banana', 'metadata integer': 300,
    'metadata dictionary': {'dict string': 'orange', 'dict int': 400}})
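
The list above notes that dictionaries are stored as JSON. As a rough illustration (how ClearML serializes objects internally is not shown here), the sketch below checks that a nested dictionary such as the one passed as metadata round-trips through the standard json module.

```python
import json

payload = {
    'metadata string': 'banana',
    'metadata integer': 300,
    'metadata dictionary': {'dict string': 'orange', 'dict int': 400},
}

encoded = json.dumps(payload, indent=2)   # text form of the dictionary
decoded = json.loads(encoded)             # back to a Python dict

print(encoded)
print(decoded == payload)
```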

Additional information

After extending the Python experiment script, we can run it and view the results in the ClearML Web UI.

python pytorch_mnist_tutorial.py

To view the experiment results, do the following:

  1. In the ClearML Web UI, on the Projects page, click the examples project.

  2. In the experiments table, click the extending automagical ClearML example experiment.

  3. In the ARTIFACTS tab, DATA AUDIT section, click Test_Loss_Correct. The registered Pandas DataFrame appears, including the file path, size, hash, metadata, and a preview.

  4. In the OTHER section, click Loss. The uploaded numpy array appears, including its related information.

  5. Click the RESULTS tab.

  6. Click the LOG sub-tab. You can see the debugging message showing the Pandas DataFrame sample.

  7. Click the SCALARS sub-tab. You can see the scalar plots of the loss reported for each epoch.

  8. Click the PLOTS sub-tab. You can see the confusion matrix and histogram.

Next Steps