PyTorch Ignite

Integrate Trains with Ignite by using the Ignite TrainsLogger and the handlers you can attach to it. See Ignite's trains_logger.py.

To install Trains:

pip install trains

By default, Trains works with our demo Trains Server (https://demoapp.trains.allegro.ai/dashboard). To deploy a self-hosted Trains Server, see the Deploying Trains Overview; to configure Trains to meet your requirements, see the Trains Configuration Reference page.

Ignite TrainsLogger

Integrate Trains by creating an Ignite TrainsLogger object. When the code runs, it connects to the Trains backend and creates a Task (experiment) in Trains.

from ignite.contrib.handlers.trains_logger import *

trains_logger = TrainsLogger(project_name="examples", task_name="ignite")

Later in the code, attach any of the Trains handlers to the TrainsLogger object.

For example, attach the OutputHandler and log training loss at each iteration:

trains_logger.attach(trainer,
    log_handler=OutputHandler(tag="training",
    output_transform=lambda loss: {"loss": loss}),
    event_name=Events.ITERATION_COMPLETED)

TrainsLogger parameters

The TrainsLogger parameters are the following:

  • project_name (optional[str]) – The name of the project in which the experiment will be created. If the project does not exist, it is created. If project_name is None, the repository name becomes the project name.
  • task_name (optional[str]) – The name of Task (experiment). If task_name is None, the Python experiment script’s file name becomes the Task name.
  • task_type (optional[str]) – The type of the experiment. The default is training.

    The task_type values include:

    • TaskTypes.training (default)
    • TaskTypes.train
    • TaskTypes.testing
    • TaskTypes.inference
  • report_freq (optional[int]) – The histogram processing frequency (handles histogram values every X calls to the handler). Affects GradsHistHandler and WeightsHistHandler. Default value is 100.
  • histogram_update_freq_multiplier (optional[int]) – The histogram report frequency (report first X histograms and once every X reports afterwards). Default value is 10.
  • histogram_granularity (optional[int]) – The histogram sampling granularity. Default value is 50.
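The parameters above can be collected into a keyword-argument sketch. This is a minimal illustration, not run against a server; the values mirror the documented defaults:

```python
# A minimal sketch of the TrainsLogger arguments listed above.
# The values shown follow the documented defaults; constructing the real
# TrainsLogger requires a reachable Trains backend, so the call is commented out.
logger_kwargs = dict(
    project_name="examples",              # created if it does not already exist
    task_name="ignite",                   # falls back to the script file name if None
    task_type="training",                 # the default task type
    report_freq=100,                      # histogram processing frequency
    histogram_update_freq_multiplier=10,  # histogram report frequency
    histogram_granularity=50,             # histogram sampling granularity
)
# trains_logger = TrainsLogger(**logger_kwargs)
print(logger_kwargs["report_freq"])
```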

Visualizing experiment results

After creating an Ignite TrainsLogger object and attaching handlers from trains_logger.py, run the code; you can then visualize the experiment results in the Trains Web-App (UI).

Scalars

For example, run the Ignite MNIST example for TrainsLogger, mnist_with_trains_logger.py.

To log scalars, use OutputHandler.

trains_logger.attach(
    trainer,
    log_handler=OutputHandler(
        tag="training", output_transform=lambda loss: {"batchloss": loss}, metric_names="all"
    ),
    event_name=Events.ITERATION_COMPLETED(every=100),
)

trains_logger.attach(
    train_evaluator,
    log_handler=OutputHandler(tag="training", metric_names=["loss", "accuracy"], 
        another_engine=trainer),
    event_name=Events.EPOCH_COMPLETED,
)

View the training and validation metrics in the Trains Web-App (UI), on the RESULTS tab, SCALARS sub-tab.


trains_logger.attach(
    validation_evaluator,
    log_handler=OutputHandler(tag="validation", metric_names=["loss", "accuracy"], 
        another_engine=trainer),
    event_name=Events.EPOCH_COMPLETED,
)


Model snapshots

To save model snapshots, use TrainsSaver.

handler = Checkpoint(
        {"model": model},
        TrainsSaver(trains_logger, dirname="~/.trains/cache/"),
        n_saved=1,
        score_function=lambda e: 123,  # placeholder score for illustration; use a real metric, e.g. e.state.metrics["accuracy"]
        score_name="acc",
        filename_prefix="best",
        global_step_transform=global_step_from_engine(trainer),
    )

View saved snapshots in the Trains Web-App (UI), ARTIFACTS tab.


To view the model, in the ARTIFACTS tab, click the model name (or download it).


Logging

Ignite engine output and/or metrics

To log the Ignite engine's output and/or metrics, use the OutputHandler handler.

For example, log training loss at each iteration.

# Attach the logger to the trainer to log training loss at each iteration
trains_logger.attach(trainer,
    log_handler=OutputHandler(tag="training",
    output_transform=lambda loss: {"loss": loss}),
    event_name=Events.ITERATION_COMPLETED)
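The output_transform argument is a plain callable that maps the engine's raw output to a dict of named scalars. A standalone sketch of that mapping, runnable without an engine:

```python
# Standalone illustration: output_transform turns the engine's raw output
# (here, a bare loss value) into a dict of named scalars for logging.
output_transform = lambda loss: {"loss": loss}
result = output_transform(0.25)
print(result)  # {'loss': 0.25}
```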

Log metrics for training.

# Attach the logger to the evaluator on the training dataset and log NLL, Accuracy metrics after each epoch
# We setup `global_step_transform=global_step_from_engine(trainer)` to take the epoch
# of the `trainer` instead of `train_evaluator`.
trains_logger.attach(train_evaluator,
    log_handler=OutputHandler(tag="training",
        metric_names=["nll", "accuracy"],
        global_step_transform=global_step_from_engine(trainer)),
    event_name=Events.EPOCH_COMPLETED)

Log metrics for validation.

# Attach the logger to the evaluator on the validation dataset and log NLL, Accuracy metrics after
# each epoch. We setup `global_step_transform=global_step_from_engine(trainer)` to take the epoch of the
# `trainer` instead of `evaluator`.
trains_logger.attach(evaluator,
    log_handler=OutputHandler(tag="validation",
        metric_names=["nll", "accuracy"],
        global_step_transform=global_step_from_engine(trainer)),
    event_name=Events.EPOCH_COMPLETED)
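As the comments above note, global_step_from_engine(trainer) makes the evaluator's metrics use the trainer's step counter as their x-axis. A hypothetical pure-Python sketch of that idea (the stub classes are illustrative stand-ins, not the Ignite API):

```python
# Hypothetical stand-ins for an Ignite engine and its state; illustrative only.
class _State:
    def __init__(self, epoch):
        self.epoch = epoch

class _Engine:
    def __init__(self, epoch):
        self.state = _State(epoch)

# Sketch of the idea behind global_step_from_engine: return a callable that
# reads the *trainer's* epoch, so evaluator metrics share the trainer's x-axis.
def global_step_from(engine):
    return lambda *args: engine.state.epoch

trainer = _Engine(epoch=5)
step_fn = global_step_from(trainer)
print(step_fn())  # 5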

Optimizer parameters

To log optimizer parameters, use the OptimizerParamsHandler handler.

# Attach the logger to the trainer to log optimizer's parameters, e.g. learning rate at each iteration
trains_logger.attach(trainer, 
    log_handler=OptimizerParamsHandler(optimizer),
    event_name=Events.ITERATION_STARTED)

Model weights

To log model weights as scalars, use the WeightsScalarHandler handler.

# Attach the logger to the trainer to log model's weights norm after each iteration
trains_logger.attach(trainer,
    log_handler=WeightsScalarHandler(model, reduction=torch.norm),
    event_name=Events.ITERATION_COMPLETED)
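The reduction argument (here torch.norm) collapses each weight tensor to one scalar before it is logged. A pure-Python analogue of an L2-norm reduction, for illustration only:

```python
import math

# Illustrative pure-Python analogue of reduction=torch.norm: each weight
# tensor is collapsed to a single scalar, here its L2 norm.
def l2_norm(values):
    return math.sqrt(sum(v * v for v in values))

print(l2_norm([3.0, 4.0]))  # 5.0
```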

To log model weights as histograms, use the WeightsHistHandler handler.

# Attach the logger to the trainer to log model's weights as histograms after each iteration
trains_logger.attach(trainer,
    log_handler=WeightsHistHandler(model),
    event_name=Events.ITERATION_COMPLETED)

Model snapshots

To save model snapshots as Trains artifacts, use TrainsSaver.

to_save = {"model": model}

handler = Checkpoint(to_save, TrainsSaver(trains_logger), n_saved=1,
    score_function=lambda e: 123, score_name="acc",
    filename_prefix="best",
    global_step_transform=global_step_from_engine(trainer))

validation_evaluator.add_event_handler(Events.EPOCH_COMPLETED, handler)

MNIST example

The Ignite repository contains an MNIST TrainsLogger example, mnist_with_trains_logger.py.

When you run this code, you can visualize the experiment results in the Trains Web-App (UI); see Visualizing experiment results.