Version 0.13

Version 0.13.3

Trains

Features and Bug Fixes

  • Add a binding for the TensorBoard SummaryWriter.add_scalars() method.
  • Add the tensorboard_single_series_per_graph() method which supports separate plots for each TensorBoard scalar.
  • Add the Task.set_base_docker() and Task.get_base_docker() methods for the base Docker image used by Trains Agent.
  • Add support for the standard OS environment variables to obtain default credentials for:
    • AWS: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION.
    • Azure Storage: AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY.
    • Google Cloud Storage: GOOGLE_APPLICATION_CREDENTIALS.
  • Add the Task.get_parameters_as_dict() and Task.set_parameters_as_dict() methods, which get and set the parameters of referenced Tasks (use Task.get_task() to get a reference).
  • Make sure Task.connect() always returns the connected instance passed to it.
  • Give tensorflow_gpu precedence over tensorflow when Trains detects installed packages to record experiment dependencies.
  • Remove title and series naming restrictions (allow $ and .) when reporting metrics.
  • Fix incorrect printouts in the initialization wizard and upgrade notifications.
  • Fix debug image URLs for uploaded files with % in their name.
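
The default-credential variables listed above can be checked before running an experiment. The following is a minimal sketch that uses only the Python standard library and the variable names from this list. The helper name missing_credential_vars is hypothetical and is not part of Trains, and treating every listed variable as required is an assumption made for illustration.

```python
import os

# Environment variable names exactly as listed in these release notes.
CREDENTIAL_VARS = {
    "AWS": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION"),
    "Azure Storage": ("AZURE_STORAGE_ACCOUNT", "AZURE_STORAGE_KEY"),
    "Google Cloud Storage": ("GOOGLE_APPLICATION_CREDENTIALS",),
}


def missing_credential_vars(environ=None):
    """Return {provider: [unset variable names]} for incompletely configured providers.

    Hypothetical helper for illustration only; it does not call Trains.
    """
    environ = os.environ if environ is None else environ
    missing = {}
    for provider, names in CREDENTIAL_VARS.items():
        # A variable counts as unset if it is absent or empty.
        unset = [name for name in names if not environ.get(name)]
        if unset:
            missing[provider] = unset
    return missing


if __name__ == "__main__":
    for provider, unset in missing_credential_vars().items():
        print("{}: not set: {}".format(provider, ", ".join(unset)))
```

A provider that does not appear in the result has all of its listed variables set.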

Trains Agent

Features and Bug Fixes

  • Allow providing queue names instead of queue IDs in daemon mode.
  • Docker mode improvements:
    • Support running as a specific user inside a Docker container using the TRAINS_AGENT_EXEC_USER environment variable.
    • Pass the correct GPU limit when the gpus flag is omitted.
    • Add the --force-current-version daemon command-line flag.
  • Add a K8s/trains glue service example.
  • Add K8s support in daemon mode:
    • Run inside a K8s pod.
    • Mount dockerized experiment folders to the host.
    • Allow a specific network for the Docker container.
  • Add default storage environment variables (for AWS, GS and Azure) to the generated agent configuration.
  • Improve Unicode/UTF stdout handling.

Version 0.13.2

Trains

Features and Bug Fixes

  • Allow reporting a pre-uploaded image URL in Logger.report_image() using the url parameter.
  • Add support for Git repositories without a .git suffix, for example Azure Repos.
  • Improve conda support.
  • Improve hyper-parameters argparser integration.
  • Fix savefig() patching in matplotlib binding.
  • Fix logs, events and Jupyter Notebook flushing on exit.
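
To illustrate what supporting repositories without a .git suffix involves, here is a minimal sketch that derives a repository name from an HTTPS remote URL whether or not the suffix is present. It is not Trains' actual implementation. The function name repo_name_from_url and the example URLs are assumptions for illustration, and SSH-style remotes are not covered.

```python
from urllib.parse import urlparse


def repo_name_from_url(url):
    """Return the last path segment of an HTTPS remote URL, without a '.git' suffix.

    Hypothetical helper for illustration only.
    """
    path = urlparse(url).path.rstrip("/")
    name = path.rsplit("/", 1)[-1]
    # Strip the suffix only when it is actually there.
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return name


if __name__ == "__main__":
    # With a .git suffix (GitHub-style) and without one (Azure Repos-style).
    print(repo_name_from_url("https://github.com/allegroai/trains.git"))
    print(repo_name_from_url("https://dev.azure.com/org/project/_git/myrepo"))
```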

Version 0.13.1

Trains

Features and Bug Fixes

  • Add support for pyplot.savefig and pylab.savefig in matplotlib binding.
  • Add support for SageMaker.
  • Improve the configuration wizard.
  • Try to make sure TensorBoard is available when using torch.
  • Do not store the Keras model network design if it cannot be serialized (Issue #72).
  • Fix matplotlib binding support.

Version 0.13.0

Trains

Features and Bug Fixes

  • Add support for trains-server v0.13.0.
  • Add support for nested (non-main) tasks.
  • Add warning when automatic argument parser binding cannot be turned off.
  • Add Task.upload_artifact() support for external URLs (pre-uploaded).
  • Add support for special characters in hyper-parameter keys (white-spaces, . and $) (Issue #69).
  • Add support for PyTorch .pt model files.
  • Calculate data-audit artifact uniqueness by user-criteria (Issue #45).
  • Use an environment variable for setting a default docker image (Issue #58).
  • Improve trains-init configuration wizard.
  • Update examples for new joblib versions.
  • Update the Jupyter example to TensorFlow 2.
  • Fix task clone to copy only input artifacts.
  • Fix matplotlib import binding when using Agg backend.
  • Fix ProxyDictPreWrite and ProxyDictPostWrite so they can be pickled correctly (Issue #72).
  • Fix requests issue in Python 2.7 that can cause a deadlock when importing netrc.
  • Fix argparser binding sub-parser and type casting support (Issue #74).
  • Fix argparser binding Python 2.7 unicode handling.
  • Fix unsynced connected hyper-parameters being overridden during remote execution.

Trains Server

Features and Bug Fixes

  • Add parallel coordinates hyper-parameter comparison, available under Compare Experiments -> Hyper Parameters -> Parallel Coordinates (in the drop-down) (Issue #53).
  • Add encoding of experiment table view settings in URL to allow sharing using browser URL copy/paste.
  • Add loguru (ANSI color) support (Issue #29).
  • Add support for special characters in hyper-parameter keys (white-spaces, . and $) (Issue #69).
  • Add optional anonymous daily usage statistics (help us improve Trains Server):
    • Disabled by default.
    • Requires user opt-in.
    • Single averages report per day.
    • Reports average load metrics per day (CPU/memory).
    • Reports average workload per day (amount and average duration of queues, agents and experiments).
  • Improve experiment table filtering indication.
  • Improve model view to allow navigating to its generating experiment.
  • Fix experiment comparison to distinguish between experiments with the same name (Issue #52).
  • Fix Web UI compare plots bug (Issue #55, Issue #73).
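
The idea behind encoding table view settings in the URL can be sketched as a round trip between a settings mapping and a query string. This is an illustration only and not the Web UI's code. The setting names (order, columns), the base URL, and both function names are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode


def encode_view_settings(base_url, settings):
    """Append the settings to base_url as a query string (hypothetical helper)."""
    # Sort the keys so the same settings always produce the same URL.
    query = urlencode(sorted(settings.items()))
    return "{}?{}".format(base_url, query)


def decode_view_settings(url):
    """Recover the settings mapping from a URL built by encode_view_settings()."""
    _, _, query = url.partition("?")
    return dict(parse_qsl(query))


if __name__ == "__main__":
    # Hypothetical settings and address, for illustration only.
    settings = {"order": "-last_update", "columns": "name,status"}
    url = encode_view_settings("http://localhost:8080/experiments", settings)
    print(url)
    print(decode_view_settings(url) == settings)
```

Because the whole view state lives in the URL, copying the address from the browser is enough to share the same table view.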

Pull the Docker image:

docker pull allegroai/trains:0.13.0

Trains Agent

Features

  • Add support for Docker pre-installed pytorch versions that do not exist on PyPI/PyTorch.org.
  • Add AWS dynamic cluster management service.
  • Add support for various event query endpoints in APIClient.
  • Improve the configuration wizard.