Monitoring Service Posting Slack Alerts

The slack_alerts.py example demonstrates a Trains monitoring service that posts alert messages to a Slack channel. The alerts contain information about Task completions and failures. The example creates a custom monitor, the SlackMonitor class, which inherits from the Trains Monitor class.

The custom monitor overrides the following Trains Monitor class methods:

  • get_query_parameters - The query parameters for monitoring, for example:
    {'status': ['failed'], 'order_by': ['-last_update']}
  • process_task - Get the information for a Task, post a Slack message, and output to console.
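The override pattern can be sketched roughly as follows. This is a minimal illustration, not the example's actual code: MonitorBase is a hypothetical stand-in for the Trains Monitor class (so the snippet runs without Trains installed), post is an assumed callable that delivers text to Slack, the include_completed switch is an assumption about how the status list might be built, and the Task is represented as a plain dictionary.

```python
class MonitorBase:
    """Hypothetical stand-in for the Trains Monitor class (assumption)."""

    def get_query_parameters(self):
        return {}

    def process_task(self, task):
        pass


class SlackMonitorSketch(MonitorBase):
    def __init__(self, post, include_completed=False):
        # `post` is an assumed callable that delivers a message string to Slack.
        self.post = post
        # Always alert on failures; optionally alert on completions too.
        self.status_alerts = ['failed'] + (['completed'] if include_completed else [])

    def get_query_parameters(self):
        # Same shape as the query shown above: filter by status, newest update first.
        return {'status': self.status_alerts, 'order_by': ['-last_update']}

    def process_task(self, task):
        # `task` is assumed here to be a dict with 'name' and 'status' keys.
        message = 'Experiment "{}" {}'.format(task['name'], task['status'])
        self.post(message)  # post the Slack message
        print(message)      # output to console
        return message
```

Keeping the query in get_query_parameters and the side effects in process_task means the selection criteria can change without touching the alerting code.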

The example provides the option to run locally or execute remotely, by calling the Task.execute_remotely method.
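The choice between running locally and executing remotely might look something like the sketch below. The function name launch_monitor and the run_monitor callable are hypothetical; task is assumed to be a Trains Task object, and the queue_name and exit_process keyword arguments are assumptions about how Task.execute_remotely is called.

```python
def launch_monitor(task, run_monitor, local=False, service_queue='services'):
    """Start the monitor locally, or enqueue the Task for an agent first."""
    if not local:
        # Enqueue the Task on the services queue. With a real Trains Task and
        # exit_process=True, the local process is assumed to stop here, and the
        # agent later re-runs the script to start the service.
        task.execute_remotely(queue_name=service_queue, exit_process=True)
    # Reached when running locally, or when the script is executed by the agent.
    run_monitor()
```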

Because the example uses argparse, Trains automatically logs its command line arguments.

To interface with Slack, the example uses slack.WebClient and slack.errors.SlackApiError.
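As a rough sketch of how these two pieces fit together, the helper below posts a message and handles a Slack API error. The function name post_alert is hypothetical, and client is assumed to be a slack.WebClient instance (or any object with a compatible chat_postMessage method). The fallback exception class exists only so that the sketch runs where the Slack package is not installed.

```python
try:
    from slack.errors import SlackApiError
except ImportError:
    # Fallback (assumption) so the sketch still runs without the Slack package.
    class SlackApiError(Exception):
        pass


def post_alert(client, channel, text):
    """Post `text` to `channel`; return True on success, False on a Slack API error."""
    try:
        client.chat_postMessage(channel=channel, text=text)
        return True
    except SlackApiError as err:
        # Report the failure on the console instead of stopping the monitor.
        print('Slack API error: {}'.format(err))
        return False
```

With the real client, the call would look like post_alert(slack.WebClient(token="<api_token_here>"), "<channel_name>", "some alert text").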

Before running the script

Before executing the slack_alerts.py script, create a new Slack bot (for example, Allegro Trains Bot). This provides the Slack API token that the example script requires.

  1. Log in to your Slack account.
  2. Go to https://api.slack.com/apps/new.
  3. In App Name, enter your app name; for example, "Allegro Trains Bot".
  4. In Development Slack Workspace, select your workspace.
  5. Click Create App.
  6. In Basic Information, under Display Information, complete the following:
    • In Short description, enter "Allegro Trains Bot".
    • In Background color, enter "#202432".
  7. Click Save Changes.
  8. In OAuth & Permissions, under Scopes, click Add an OAuth Scope, and then select the following permissions from the list:
    • channels:join
    • channels:read
    • chat:write
  9. In OAuth Tokens & Redirect URLs:
    1. Click Install App to Workspace.
    2. In the confirmation dialog, click Allow.
    3. Click Copy to copy the Bot User OAuth Access Token.

To use the Slack API token in the Allegro Trains Slack service, execute slack_alerts.py with the --slack_api command line argument, enclosing the API token in double quotes:

--slack_api "<api_token_here>"

Running the monitoring service

  1. Trains Agent must be running in services mode and listening to the services queue, so that when the script calls the Task.execute_remotely method to enqueue the monitoring Task, Trains Agent can begin the service.

    For example:

    trains-agent daemon --services-mode --detached --queue services --create-queue --docker --cpu-only

  2. Run the example script. The arguments are:

    • channel - The name of the Slack channel to post alerts to. (MANDATORY)
    • slack_api - The Slack API token. The default value can be set in the environment variable SLACK_API_TOKEN. (MANDATORY)
    • include_completed_experiments - Include completed experiments. The default value is False.
    • include_manual_experiments - Include experiments that are running locally, not only those Trains Agent is executing. The default value is True.
    • local - Run the monitor locally, instead of as a service. The default is False.
    • message_prefix - A message prefix. For example, to alert all channel members use: "Hey <!here>,"
    • min_num_iterations - The minimum number of iterations a failed or completed experiment must have run to trigger an alert. The default is 0, which alerts on all experiments.
    • project - The name (or partial name) of the project to monitor. Leave empty to monitor all projects.
    • refresh_rate - How often to run the monitoring service (seconds). The default value is 10.0.
    • service_queue - The queue that trains-agent is listening to for Tasks to execute as a service. The default is services.

    python slack_alerts.py --slack_api <key> --channel <channel_name>
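The argument list above could be declared with argparse along the following lines. This is a sketch under stated assumptions, not the script's actual parser: the str_to_bool helper, the empty-string defaults for message_prefix and project, and the manual check for the two mandatory arguments are all illustrative choices.

```python
import argparse
import os


def str_to_bool(value):
    # Hypothetical helper: accept common spellings of "true" on the command line.
    return str(value).strip().lower() in ('1', 'true', 'yes', 'y')


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description='Slack alerts monitoring service (sketch)')
    parser.add_argument('--channel', default=None)
    # The token falls back to the SLACK_API_TOKEN environment variable.
    parser.add_argument('--slack_api', default=os.environ.get('SLACK_API_TOKEN'))
    parser.add_argument('--include_completed_experiments', type=str_to_bool, default=False)
    parser.add_argument('--include_manual_experiments', type=str_to_bool, default=True)
    parser.add_argument('--local', type=str_to_bool, default=False)
    parser.add_argument('--message_prefix', default='')  # assumed default
    parser.add_argument('--min_num_iterations', type=int, default=0)
    parser.add_argument('--project', default='')  # assumed default: all projects
    parser.add_argument('--refresh_rate', type=float, default=10.0)
    parser.add_argument('--service_queue', default='services')
    args = parser.parse_args(argv)
    # Checked after parsing so the environment variable can supply the token.
    if not args.channel or not args.slack_api:
        parser.error('--channel and --slack_api are mandatory')
    return args
```

Validating the mandatory arguments after parsing, rather than marking them as required, is what allows SLACK_API_TOKEN to stand in for --slack_api.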

Hyperparameters

Command line arguments, which are automatically logged when argparse is used, appear in the HYPER PARAMETERS tab.