Roll back a service during an incident

You can use a Cortex Workflow to roll back services, allowing quick remediation in the event of incidents.

How to roll back a service with a Cortex Workflow

Step 1: Start creating the Workflow

Follow the steps in the documentation to create a Workflow and configure its basic settings.

This Workflow is scoped to the "service" entity type.

Step 2: Add blocks to the Workflow

The instructions on this page describe how to create this Workflow in the Cortex UI. You can also copy the Workflow YAML below and add it to your workspace with the Cortex CLI, which lets you set up the example configuration quickly and then iterate on it for your own use case.

Workflow YAML instructions

To upload the Workflow example YAML into your workspace:

  1. Save the Workflow example YAML file below:

name: Rollback Service
tag: workflow-0e405a5d-6a1c-41a6-84a8-d3287941c000
description: null
isDraft: true
filter:
  entityFilter:
    typeFilter:
      types:
      - service
    entityGroupFilter: null
  ownershipScope: ALL
  type: ENTITY
runResponseTemplate: null
failedRunResponseTemplate: null
restrictActionCompletionToRunnerUser: false
actions:
- name: Get incident info
  slug: get-incident-info
  schema:
    inputs:
    - name: Incident Name
      description: null
      key: incident-name
      required: false
      defaultValue: null
      placeholder: null
      validationRegex: null
      type: INPUT_FIELD
    - name: Incident Severity
      description: null
      key: incident-severity
      required: false
      options:
      - SEV0
      - SEV1
      - SEV2
      optionsLabels: null
      defaultValue: null
      placeholder: ""
      allowAdditionalOptions: false
      type: SELECT_FIELD
    - name: Create Incident Slack Channel?
      description: null
      key: create-incident-slack-channel
      required: true
      defaultValue: false
      type: TOGGLE_FIELD
    inputOverrides: []
    type: USER_INPUT
  outgoingActions:
  - list-deployments-for-entity
  isRootAction: true
- name: List deployments for entity
  slug: list-deployments-for-entity
  schema:
    inputs:
      page: 0
      entityId: "{{context.entity.tag}}"
      pageSize: 2
    integrationAlias: null
    actionIdentifier: cortex.getDeploysForEntity
    type: ADVANCED_HTTP_REQUEST
  outgoingActions:
  - javascript
  isRootAction: false
- name: Format deployments
  slug: javascript
  schema:
    script: "// Assume \"deployments\" is the JSON from the previous step\nconst deployments\
      \ = actions['list-deployments-for-entity'].outputs.response.deployments\n\n\
      // Create a flat array of strings formatted for a dropdown\nconst shasFormatted\
      \ = deployments.map(dep => \n  `${dep.title} (${dep.sha.slice(0,7)}) - ${dep.timestamp}`\n\
      );\n\n// Also return the raw SHAs if needed for later steps\nconst shas = deployments.map(dep\
      \ => dep.sha);\n\nreturn {\n  totalDeployments: deployments.length,\n  deployments:\
      \ deployments,       // full objects if you need them\n  shas: shas,       \
      \              // raw SHA values\n  shasFormatted: shasFormatted    // flat\
      \ string array for dropdowns\n};"
    type: JAVASCRIPT
  outgoingActions:
  - user-input
  isRootAction: false
- name: User input
  slug: user-input
  schema:
    inputs:
    - name: Commit ID to rollback to
      description: null
      key: commit-id-to-rollback-to
      required: false
      options: []
      optionsLabels: null
      defaultValue: null
      placeholder: null
      allowAdditionalOptions: false
      type: SELECT_FIELD
    inputOverrides:
    - inputKey: commit-id-to-rollback-to
      outputVariable: actions.javascript.outputs.result.shasFormatted
      type: OPTION
    type: USER_INPUT
  outgoingActions:
  - extract-sha
  isRootAction: false
- name: Extract SHA
  slug: extract-sha
  schema:
    script: |-
      // The selected option from the User Input block
      const selectedOption = actions['user-input'].outputs['commit-id-to-rollback-to']

      // Extract the text between parentheses
      const match = selectedOption.match(/\((.*?)\)/);
      const sha = match ? match[1] : null;

      return {
        selectedOption: selectedOption, // full string
        sha: sha                        // extracted commit SHA
      };
    type: JAVASCRIPT
  outgoingActions:
  - trigger-workflow
  isRootAction: false
- name: Trigger workflow
  slug: trigger-workflow
  schema:
    inputs:
      ref: main
      repo: cortex-workshops/deployment
      inputs: |-
        {
        "service": "{{context.entity.tag}}",
        "rollback_to": "{{actions.extract-sha.outputs.result.sha}}"
        }
      workflow_id: rollback.yaml
    integrationAlias: cortex
    actionIdentifier: github.createWorkflowDispatchEvent
    type: ADVANCED_HTTP_REQUEST
  outgoingActions:
  - branch
  isRootAction: false
- name: Branch
  slug: branch
  schema:
    branches:
    - name: Create incident Slack channel
      slug: create-incident-slack-channel-path
      outgoingAction: create-incident-slack-channel
      expression: "actions[\"get-incident-info\"].outputs[\"create-incident-slack-channel\"\
        ] == true"
      type: CONDITIONAL
    - name: Do not create Slack channel
      slug: do-not-create-slack-channel
      outgoingAction: send-fyi-message
      expression: "actions[\"get-incident-info\"].outputs[\"create-incident-slack-channel\"\
        ] == false"
      type: CONDITIONAL
    fallbackBranch: null
    joiningAction: null
    type: CONDITIONAL_BRANCH
  outgoingActions:
  - create-incident-slack-channel
  - send-fyi-message
  isRootAction: false
- name: Create Incident Slack Channel
  slug: create-incident-slack-channel
  schema:
    headers:
      Content-Type: application/json
      Authorization: "Bearer {{context.secrets.apiKey}}"
    httpMethod: POST
    payload: |-
      {
        "name": "incident-{{actions.get-incident-info.outputs.incident-name}}"
      }
    url: https://slack.com/api/conversations.create
    type: HTTP_REQUEST
  outgoingActions:
  - send-slack-message
  isRootAction: false
- name: Send FYI message
  slug: send-fyi-message
  schema:
    channel: public-channel
    message: "Service {{context.entity.tag}} with incident does not have a separate Slack channel."
    type: SLACK
  outgoingActions: []
  isRootAction: false
- name: Send Slack Message
  slug: send-slack-message
  schema:
    headers:
      Content-Type: application/json
      Authorization: "Bearer {{context.secrets.apiKey}}"
    httpMethod: POST
    payload: |-
      {
        "channel": "incident-{{actions.get-incident-info.outputs.incident-name}}",
        "text": "Rolled back service {{context.entity.tag}} to commit ID {{actions.extract-sha.outputs.result.sha}}."
      }
    url: https://slack.com/api/chat.postMessage
    type: HTTP_REQUEST
  outgoingActions: []
  isRootAction: false
runRestrictionPolicies: []
iconTag: null
variables: []
  2. Use the Cortex CLI to run this command, using the path to your Workflow YAML file: cortex workflows create -f <path-to-your-workflow.yaml>
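Before uploading an edited copy of the YAML, it can help to confirm that every slug listed under outgoingActions still points at an action that exists, since renaming a block is an easy way to break the chain. The sketch below is a hypothetical local check and not part of the Cortex CLI: the `actions` array is a hand-copied summary of the slugs in the example above, and `findDanglingReferences` is a name made up for illustration.

```javascript
// Hand-copied summary of the example Workflow: each action's slug and the
// slugs it lists under outgoingActions.
const actions = [
  { slug: 'get-incident-info', outgoingActions: ['list-deployments-for-entity'] },
  { slug: 'list-deployments-for-entity', outgoingActions: ['javascript'] },
  { slug: 'javascript', outgoingActions: ['user-input'] },
  { slug: 'user-input', outgoingActions: ['extract-sha'] },
  { slug: 'extract-sha', outgoingActions: ['trigger-workflow'] },
  { slug: 'trigger-workflow', outgoingActions: ['branch'] },
  { slug: 'branch', outgoingActions: ['create-incident-slack-channel', 'send-fyi-message'] },
  { slug: 'create-incident-slack-channel', outgoingActions: ['send-slack-message'] },
  { slug: 'send-fyi-message', outgoingActions: [] },
  { slug: 'send-slack-message', outgoingActions: [] },
];

// Hypothetical helper: returns a "from -> to" string for every outgoing
// reference that does not match a defined slug.
function findDanglingReferences(actionList) {
  const known = new Set(actionList.map((action) => action.slug));
  const dangling = [];
  for (const action of actionList) {
    for (const target of action.outgoingActions) {
      if (!known.has(target)) {
        dangling.push(`${action.slug} -> ${target}`);
      }
    }
  }
  return dangling;
}

// An empty array means every reference resolves.
console.log(findDanglingReferences(actions));
```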

The sections below describe each block in this Workflow and how to configure it in the Cortex UI:

User input

In this example, we add a User Input block to obtain information about an incident.

  1. Click + in the center of the page. In the block library modal, choose User input.

  2. In the block configuration side panel, enter a name and unique slug for this block.

    • In this example, we use the name Get incident info and the slug get-incident-info.

  3. Click +Add user input. Add the following:

    • Name: Incident Name

    • Key: incident-name

    • Type: Text

    • Click Add input.

  4. Click +Add user input. Add the following:

    • Name: Incident Severity

    • Key: incident-severity

    • Type: Select

    • Options: Click +Add option to add options for the severity labels. In our example, we use SEV0, SEV1, and SEV2.

    • Click Add input.

  5. Click +Add user input. Add the following:

    • Name: Create incident Slack channel?

    • Key: create-incident-slack-channel

    • Type: Toggle

    • Default value: False

    • Click Add input.

  6. At the bottom of the side panel, click Save.

List deployments for entity

This block lists the deployments for an entity.

  1. Click + in the center of the page. In the block library modal, select the Cortex > List deployments for entity block.

  2. In the side panel, enter a name and unique slug for the block.

    • In this example, we use the name List deployments for entity and the slug list-deployments-for-entity.

  3. Configure the block:

    • Page size: In our example, we configured 2.

  4. At the bottom of the side panel, click Save.

JavaScript

This block uses JavaScript to format the deployments.

  1. Click + in the center of the page. In the block library modal, select the JavaScript block.

  2. In the side panel, enter a name and unique slug for the block.

    • In this example, we use the name Format Deployments and the slug javascript.

  3. In the JavaScript text editor box, enter an expression. Our example uses the following:

// Assume "deployments" is the JSON from the previous step
const deployments = actions['list-deployments-for-entity'].outputs.response.deployments;

// Create a flat array of strings formatted for a dropdown
const shasFormatted = deployments.map(dep =>
  `${dep.title} (${dep.sha.slice(0,7)}) - ${dep.timestamp}`
);

// Also return the raw SHAs if needed for later steps
const shas = deployments.map(dep => dep.sha);

return {
  totalDeployments: deployments.length,
  deployments: deployments,       // full objects if you need them
  shas: shas,                     // raw SHA values
  shasFormatted: shasFormatted    // flat string array for dropdowns
};
  4. Save the block.
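To try the formatting logic outside Cortex, the script can be wrapped in a plain function and run with Node.js against mock data. This is only a sketch: `formatDeployments` and the sample deployments are hypothetical, the `title`, `sha`, and `timestamp` fields are the ones the script above reads, and the fallback to an empty list is an added assumption for services that have no deployments yet.

```javascript
// Hypothetical wrapper around the block's script so it can run locally.
function formatDeployments(actions) {
  const response = actions['list-deployments-for-entity'].outputs.response;
  // Assumption: treat a missing deployments list as empty instead of throwing.
  const deployments = (response && response.deployments) || [];

  // One dropdown label per deployment: "<title> (<short sha>) - <timestamp>"
  const shasFormatted = deployments.map(
    (dep) => `${dep.title} (${dep.sha.slice(0, 7)}) - ${dep.timestamp}`
  );

  return {
    totalDeployments: deployments.length,
    deployments,
    shas: deployments.map((dep) => dep.sha),
    shasFormatted,
  };
}

// Mock of the previous block's output; all values are made up.
const mockActions = {
  'list-deployments-for-entity': {
    outputs: {
      response: {
        deployments: [
          {
            title: 'Release 42',
            sha: '0123456789abcdef0123456789abcdef01234567',
            timestamp: '2024-01-02T03:04:05Z',
          },
          {
            title: 'Release 41',
            sha: 'fedcba9876543210fedcba9876543210fedcba98',
            timestamp: '2024-01-01T03:04:05Z',
          },
        ],
      },
    },
  },
};

const formatted = formatDeployments(mockActions);
console.log(formatted.shasFormatted[0]); // Release 42 (0123456) - 2024-01-02T03:04:05Z
```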

User input

This block asks the user for the commit ID to roll back to, pulling the commit IDs from the previous step.

  1. Click + in the center of the page. In the block library modal, select the User input block.

  2. In the side panel, enter a name and unique slug for the block.

    • In this example we use the name User input and the slug user-input.

  3. Click +Add user input. Add the following:

    • Name: Commit ID to rollback to

    • Key: commit-id-to-rollback-to

    • Type: Select

    • Data source: Manual

    • Click Add input.

    • Path to override value: Enter actions.javascript.outputs.result.shasFormatted.

  4. Save the block.

JavaScript

This block uses JavaScript to extract the SHA that was selected in the previous step.

  1. Click + in the center of the page. In the block library modal, select the JavaScript block.

  2. In the side panel, enter a name and unique slug for the block.

    • In this example, we use the name Extract SHA and the slug extract-sha.

  3. In the JavaScript text editor box, enter an expression. Our example uses the following:

// The selected option from the User Input block
const selectedOption = actions['user-input'].outputs['commit-id-to-rollback-to'];

// Extract the text between parentheses
const match = selectedOption.match(/\((.*?)\)/);
const sha = match ? match[1] : null;

return {
  selectedOption: selectedOption, // full string
  sha: sha                        // extracted commit SHA
};
  4. Save the block.
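Two properties of the regular expression above are worth knowing. It captures the first parenthesized group, so a deployment title that itself contains parentheses would yield the wrong text, and it returns only the seven-character abbreviation rather than the full SHA. One possible alternative, sketched below, looks the selection up by position in the arrays that the Format deployments block already returns. `resolveSha` and the sample data are hypothetical; inside Cortex the two arguments would presumably be `actions['user-input'].outputs['commit-id-to-rollback-to']` and `actions.javascript.outputs.result`.

```javascript
// Hypothetical helper: map the selected dropdown label back to its full SHA.
// Relies on shas and shasFormatted being built from the same deployments
// array in the same order, as in the Format deployments script.
function resolveSha(selectedOption, formatOutputs) {
  const index = formatOutputs.shasFormatted.indexOf(selectedOption);
  return index === -1 ? null : formatOutputs.shas[index];
}

// Made-up output of the Format deployments block.
const formatOutputs = {
  shas: [
    '0123456789abcdef0123456789abcdef01234567',
    'fedcba9876543210fedcba9876543210fedcba98',
  ],
  shasFormatted: [
    'Fix login (hotfix) (0123456) - 2024-01-02T03:04:05Z',
    'Release 41 (fedcba9) - 2024-01-01T03:04:05Z',
  ],
};

const selected = formatOutputs.shasFormatted[0];

// The first-parentheses regex picks up part of the title here.
const regexResult = selected.match(/\((.*?)\)/)[1];
console.log(regexResult); // hotfix

// The index lookup returns the full SHA for the same selection.
console.log(resolveSha(selected, formatOutputs)); // 0123456789abcdef0123456789abcdef01234567
```

If the full SHA is used, the GitHub workflow that receives `rollback_to` needs to accept it in that form.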

Trigger GitHub workflow

This block triggers a GitHub workflow to roll back the deployment.

  1. Click + in the center of the page. In the block library modal, select the GitHub > Trigger workflow block.

  2. In the side panel, enter a name and unique slug for the block. In this example, we use the name Trigger workflow and the slug trigger-workflow.

  3. Configure the block:

    • Repository: Enter your repository name.

    • Ref: Enter the Git reference to trigger the workflow on, e.g. main.

    • Workflow ID or file name: Enter the ID of the GitHub workflow or the file name of the workflow file.

      • In our example, we added rollback.yaml.

    • Inputs: Optionally enter key-value pairs to pass to the workflow.

      • In our example, we added: { "service": "{{context.entity.tag}}", "rollback_to": "{{actions.extract-sha.outputs.result.sha}}" }

  4. At the bottom of the side panel, click Save.
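For context, GitHub exposes workflow dispatch through its REST API at POST /repos/{owner}/{repo}/actions/workflows/{workflow_id}/dispatches, with `ref` and `inputs` in the JSON body. The sketch below builds, but does not send, an equivalent request. It assumes the Cortex block maps onto that endpoint, which the action identifier github.createWorkflowDispatchEvent suggests but this page does not confirm. `buildDispatchRequest`, the service tag, the short SHA, and the token placeholder are all hypothetical.

```javascript
// Hypothetical helper: describe the HTTP request for a workflow dispatch.
function buildDispatchRequest({ repo, workflowId, ref, inputs, token }) {
  return {
    url: `https://api.github.com/repos/${repo}/actions/workflows/${encodeURIComponent(workflowId)}/dispatches`,
    method: 'POST',
    headers: {
      Accept: 'application/vnd.github+json',
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json',
    },
    // ref is the branch or tag to run on; inputs are passed to the workflow.
    body: JSON.stringify({ ref, inputs }),
  };
}

const dispatchRequest = buildDispatchRequest({
  repo: 'cortex-workshops/deployment', // repository from the example YAML
  workflowId: 'rollback.yaml',
  ref: 'main',
  inputs: {
    service: 'payments-api', // stands in for {{context.entity.tag}}
    rollback_to: '0123456',  // stands in for the extracted SHA
  },
  token: '<github-token>',   // placeholder, not a real credential
});

console.log(dispatchRequest.url);
console.log(dispatchRequest.body);
```

The target workflow file needs a workflow_dispatch trigger that declares the `service` and `rollback_to` inputs, because GitHub rejects dispatches that carry undeclared inputs.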

Branch

In the first User input block, the user chooses whether to create a Slack channel for the incident. Their choice determines which path the Workflow follows after the Branch block.

To configure the Branch block:

  1. Click + in the center of the page. In the block library modal, choose Branch.

  2. In the block configuration side panel, enter a name and unique slug for this block.

    • In this example, we use the name Branch and the slug branch.

  3. Click +Add path. Configure the conditional path:

    • Name: Create incident Slack channel

    • Slug: create-incident-slack-channel-path

    • Path expression: actions["get-incident-info"].outputs["create-incident-slack-channel"] == true

    • Save the path.

  4. Click +Add path. Configure the conditional path:

    • Name: Do not create Slack channel

    • Slug: do-not-create-slack-channel

    • Path expression: actions["get-incident-info"].outputs["create-incident-slack-channel"] == false

    • Save the path.

  5. Save the block.
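The two path expressions are complementary, so a boolean toggle value always selects exactly one path. The sketch below models them as plain JavaScript predicates to make that visible. It is an illustration only, not how Cortex evaluates expressions: `paths`, `pickPath`, and `mockRun` are hypothetical names, and strict equality is an assumption of the sketch.

```javascript
// The two conditional paths from the Branch block, modelled as predicates.
const paths = [
  {
    slug: 'create-incident-slack-channel-path',
    next: 'create-incident-slack-channel',
    matches: (actions) =>
      actions['get-incident-info'].outputs['create-incident-slack-channel'] === true,
  },
  {
    slug: 'do-not-create-slack-channel',
    next: 'send-fyi-message',
    matches: (actions) =>
      actions['get-incident-info'].outputs['create-incident-slack-channel'] === false,
  },
];

// Hypothetical helper: return the slug of the next action, or null when no
// path matches (the example sets no fallback branch).
function pickPath(actions) {
  const match = paths.find((path) => path.matches(actions));
  return match ? match.next : null;
}

// Build a mock run context with a given toggle value.
const mockRun = (toggle) => ({
  'get-incident-info': { outputs: { 'create-incident-slack-channel': toggle } },
});

console.log(pickPath(mockRun(true)));      // create-incident-slack-channel
console.log(pickPath(mockRun(false)));     // send-fyi-message
console.log(pickPath(mockRun(undefined))); // null
```

Because the toggle input is required and defaults to false, the non-boolean case should not arise in this example.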

Add blocks to the "Create incident Slack channel" path

HTTP requests

  1. Click + under the new path. In the block library modal, choose HTTP request.

  2. In the block configuration side panel, enter a name and unique slug for this block.

    • In this example, we use the name Create incident Slack channel and the slug create-incident-slack-channel.

  3. Configure the block:

    • HTTP method: POST

    • URL: https://slack.com/api/conversations.create

    • Headers:

      • Content-Type: application/json

      • Authorization: Bearer {{context.secrets.apiKey}}

    • Payload: In our example, we enter the following:

{
  "name": "incident-{{actions.get-incident-info.outputs.incident-name}}"
}
  4. Save the block.

  5. Click + under the path. In the block library modal, choose HTTP request.

  6. In the block configuration side panel, enter a name and unique slug for this block.

    • In this example, we use the name Send Slack message and the slug send-slack-message.

  7. Configure the block:

    • HTTP method: POST

    • URL: https://slack.com/api/chat.postMessage

    • Headers:

      • Content-Type: application/json

      • Authorization: Bearer {{context.secrets.apiKey}}

    • Payload: In our example, we enter the following:

{
  "channel": "incident-{{actions.get-incident-info.outputs.incident-name}}",
  "text": "Rolled back service {{context.entity.tag}} to commit ID {{actions.extract-sha.outputs.result.sha}}."
}
  8. Save the block.
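Slack channel names may contain only lowercase letters, numbers, hyphens, and underscores, and are limited to 80 characters, so an incident name typed with spaces or capitals would be rejected by conversations.create. One option is to normalize the name before it reaches the two HTTP request blocks, for example in an extra JavaScript block. The sketch below shows the idea; `toSlackChannelName`, the sample incident name, the service tag, and the short SHA are hypothetical and not part of the example Workflow.

```javascript
// Hypothetical helper: turn a free-text incident name into a channel name
// that satisfies Slack's naming rules.
function toSlackChannelName(incidentName) {
  const cleaned = String(incidentName)
    .toLowerCase()
    .replace(/[^a-z0-9_-]+/g, '-') // collapse runs of disallowed characters
    .replace(/^-+|-+$/g, '');      // drop leading and trailing hyphens
  // Slack limits channel names to 80 characters.
  return `incident-${cleaned}`.slice(0, 80);
}

const channelName = toSlackChannelName('Checkout API Outage!');
console.log(channelName); // incident-checkout-api-outage

// Payloads for the two requests, using the normalized name in both so the
// message goes to the channel that was created.
const createChannelPayload = JSON.stringify({ name: channelName });
const postMessagePayload = JSON.stringify({
  channel: channelName,
  text: 'Rolled back service payments-api to commit ID 0123456.',
});

console.log(createChannelPayload);
console.log(postMessagePayload);
```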

Add block to the "Do not create Slack channel" path

  1. Click + under the new path. In the block library modal, choose Slack > Send message.

  2. In the block configuration side panel, enter a name and unique slug for this block.

    • In this example, we use the name Send FYI message and the slug send-fyi-message.

  3. Configure the block:

    • Slack channel name: Select the channel where you want to send a message informing the team that you did not create a separate Slack channel for the incident.

    • Message text: In our example, we set this to: Service {{context.entity.tag}} with incident does not have a separate Slack channel.

  4. Save the block.

Step 3: Run the Workflow

  • At the top of the page, click Run Workflow.

When you run the Workflow, the following events happen:

  1. The Workflow pauses to collect a response from the user during the User Input block. The user enters a name and severity for the incident, and chooses whether to create a Slack channel.

    • Some incidents require additional teamwork and collaboration, while others can be easily mitigated and may not require a dedicated channel.

  2. The "List deployments for entity" block runs, fetching a list of deployments associated with the entity.

  3. The JavaScript block runs, which formats the deployment data.

  4. The second User Input block runs, presenting the formatted deployments from the previous block so the user can select the commit to roll back to.

  5. The second JavaScript block runs, which extracts the SHA from the output of the previous block.

  6. The GitHub workflow is triggered to roll back the affected service.

  7. The Branch block runs:

    • If the user chose to create a Slack channel: An HTTP request block creates the incident Slack channel, and a second HTTP request block posts a message to that channel naming the service that was rolled back and the commit ID.

    • If the user chose not to create a Slack channel: The "Do not create Slack channel" path runs. A Slack block sends a message to an existing team channel to let them know a separate incident channel was not created.
