Roll back a service during an incident
You can use a Cortex Workflow to roll back services, allowing quick remediation in the event of incidents.
How to roll back a service with a Cortex Workflow
Step 1: Start creating the Workflow
Follow the steps in the documentation to create a Workflow and configure its basic settings.
This Workflow is scoped to the "service" entity type.
Step 2: Add blocks to the Workflow
The instructions on this page describe how to create this Workflow in the Cortex UI, but you can also copy the Workflow YAML and add it to your workspace via the Cortex CLI. This lets you set up the example configuration quickly and then iterate on it for your own use case. Expand the tile below to learn more.
Workflow YAML instructions
To upload the Workflow example YAML into your workspace:
Save the Workflow example YAML file below:
name: Rollback Service
tag: workflow-0e405a5d-6a1c-41a6-84a8-d3287941c000
description: null
isDraft: true
filter:
  entityFilter:
    typeFilter:
      types:
      - service
    entityGroupFilter: null
    ownershipScope: ALL
  type: ENTITY
runResponseTemplate: null
failedRunResponseTemplate: null
restrictActionCompletionToRunnerUser: false
actions:
- name: Get incident info
  slug: get-incident-info
  schema:
    inputs:
    - name: Incident Name
      description: null
      key: incident-name
      required: false
      defaultValue: null
      placeholder: null
      validationRegex: null
      type: INPUT_FIELD
    - name: Incident Severity
      description: null
      key: incident-severity
      required: false
      options:
      - SEV0
      - SEV1
      - SEV2
      optionsLabels: null
      defaultValue: null
      placeholder: ""
      allowAdditionalOptions: false
      type: SELECT_FIELD
    - name: Create Incident Slack Channel?
      description: null
      key: create-incident-slack-channel
      required: true
      defaultValue: false
      type: TOGGLE_FIELD
    inputOverrides: []
    type: USER_INPUT
  outgoingActions:
  - list-deployments-for-entity
  isRootAction: true
- name: List deployments for entity
  slug: list-deployments-for-entity
  schema:
    inputs:
      page: 0
      entityId: "{{context.entity.tag}}"
      pageSize: 2
    integrationAlias: null
    actionIdentifier: cortex.getDeploysForEntity
    type: ADVANCED_HTTP_REQUEST
  outgoingActions:
  - javascript
  isRootAction: false
- name: Format deployments
  slug: javascript
  schema:
    script: |-
      // Assume "deployments" is the JSON from the previous step
      const deployments = actions['list-deployments-for-entity'].outputs.response.deployments

      // Create a flat array of strings formatted for a dropdown
      const shasFormatted = deployments.map(dep =>
        `${dep.title} (${dep.sha.slice(0,7)}) - ${dep.timestamp}`
      );

      // Also return the raw SHAs if needed for later steps
      const shas = deployments.map(dep => dep.sha);

      return {
        totalDeployments: deployments.length,
        deployments: deployments, // full objects if you need them
        shas: shas, // raw SHA values
        shasFormatted: shasFormatted // flat string array for dropdowns
      };
    type: JAVASCRIPT
  outgoingActions:
  - user-input
  isRootAction: false
- name: User input
  slug: user-input
  schema:
    inputs:
    - name: Commit ID to rollback to
      description: null
      key: commit-id-to-rollback-to
      required: false
      options: []
      optionsLabels: null
      defaultValue: null
      placeholder: null
      allowAdditionalOptions: false
      type: SELECT_FIELD
    inputOverrides:
    - inputKey: commit-id-to-rollback-to
      outputVariable: actions.javascript.outputs.result.shasFormatted
      type: OPTION
    type: USER_INPUT
  outgoingActions:
  - extract-sha
  isRootAction: false
- name: Extract SHA
  slug: extract-sha
  schema:
    script: |-
      // The selected option from the User Input block
      const selectedOption = actions['user-input'].outputs['commit-id-to-rollback-to']
      // Extract the text between parentheses
      const match = selectedOption.match(/\((.*?)\)/);
      const sha = match ? match[1] : null;
      return {
        selectedOption: selectedOption, // full string
        sha: sha // extracted commit SHA
      };
    type: JAVASCRIPT
  outgoingActions:
  - trigger-workflow
  isRootAction: false
- name: Trigger workflow
  slug: trigger-workflow
  schema:
    inputs:
      ref: main
      repo: cortex-workshops/deployment
      inputs: |-
        {
          "service": "{{context.entity.tag}}",
          "rollback_to": "{{actions.extract-sha.outputs.result.sha}}"
        }
      workflow_id: rollback.yaml
    integrationAlias: cortex
    actionIdentifier: github.createWorkflowDispatchEvent
    type: ADVANCED_HTTP_REQUEST
  outgoingActions:
  - branch
  isRootAction: false
- name: Branch
  slug: branch
  schema:
    branches:
    - name: Create incident Slack channel
      slug: create-incident-slack-channel-path
      outgoingAction: create-incident-slack-channel
      expression: "actions[\"get-incident-info\"].outputs[\"create-incident-slack-channel\"] == true"
      type: CONDITIONAL
    - name: Do not create Slack channel
      slug: do-not-create-slack-channel
      outgoingAction: send-fyi-message
      expression: "actions[\"get-incident-info\"].outputs[\"create-incident-slack-channel\"] == false"
      type: CONDITIONAL
    fallbackBranch: null
    joiningAction: null
    type: CONDITIONAL_BRANCH
  outgoingActions:
  - create-incident-slack-channel
  - send-fyi-message
  isRootAction: false
- name: Create Incident Slack Channel
  slug: create-incident-slack-channel
  schema:
    headers:
      Content-Type: application/json
      Authorization: "Bearer {{context.secrets.apiKey}}"
    httpMethod: POST
    payload: |-
      {
        "name": "incident-{{actions.get-incident-info.outputs.incident-name}}"
      }
    url: https://slack.com/api/conversations.create
    type: HTTP_REQUEST
  outgoingActions:
  - send-slack-message
  isRootAction: false
- name: Send FYI message
  slug: send-fyi-message
  schema:
    channel: public-channel
    message: "Service {{context.entity.tag}} with incident does not have a separate Slack channel."
    type: SLACK
  outgoingActions: []
  isRootAction: false
- name: Send Slack Message
  slug: send-slack-message
  schema:
    headers:
      Content-Type: application/json
      Authorization: "Bearer {{context.secrets.apiKey}}"
    httpMethod: POST
    payload: |-
      {
        "channel": "incident-{{actions.get-incident-info.outputs.incident-name}}",
        "text": "Rolled back service {{context.entity.tag}} to commit ID {{actions.extract-sha.outputs.result.sha}}."
      }
    url: https://slack.com/api/chat.postMessage
    type: HTTP_REQUEST
  outgoingActions: []
  isRootAction: false
runRestrictionPolicies: []
iconTag: null
variables: []
Use the Cortex CLI to run this command, using the path to your Workflow YAML file:
cortex workflows create -f <path-to-your-workflow.yaml>
Expand the tiles below to learn about each block in this Workflow and how to configure them in the Cortex UI:
User input
In this example, we add a User Input block to obtain information about an incident.
Click + in the center of the page. In the block library modal, choose User input.
In the block configuration side panel, enter a name and unique slug for this block.
In this example, we use the name Get incident info and the slug get-incident-info.
Click +Add user input. Add the following:
Name: Incident Name
Key: incident-name
Type: Text
Click Add input.
Click +Add user input. Add the following:
Name: Incident Severity
Key: incident-severity
Type: Select
Options: Click +Add option to add options for the severity labels. In our example, we use SEV0, SEV1, and SEV2.
Click Add input.
Click +Add user input. Add the following:
Name: Create incident Slack channel?
Key: create-incident-slack-channel
Type: Toggle
Default value: False
Click Add input.
At the bottom of the side panel, click Save.
List deployments for entity
This block lists the deployments for an entity.
Click + in the center of the page. In the block library modal, select the Cortex > List deployments for entity block.
In the side panel, enter a name and unique slug for the block.
In this example, we use the name List deployments for entity and the slug list-deployments-for-entity.
Configure the block:
Page size: In our example, we configured 2.
At the bottom of the side panel, click Save.
JavaScript
This block uses JavaScript to format the deployments.
Click + in the center of the page. In the block library modal, select the JavaScript block.
In the side panel, enter a name and unique slug for the block.
In this example, we use the name Format deployments and the slug javascript.
In the JavaScript text editor box, enter an expression. Our example uses the following:
// Assume "deployments" is the JSON from the previous step
const deployments = actions['list-deployments-for-entity'].outputs.response.deployments

// Create a flat array of strings formatted for a dropdown
const shasFormatted = deployments.map(dep =>
  `${dep.title} (${dep.sha.slice(0,7)}) - ${dep.timestamp}`
);

// Also return the raw SHAs if needed for later steps
const shas = deployments.map(dep => dep.sha);

return {
  totalDeployments: deployments.length,
  deployments: deployments, // full objects if you need them
  shas: shas, // raw SHA values
  shasFormatted: shasFormatted // flat string array for dropdowns
};
Save the block.
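To check the formatting logic outside Cortex, the same mapping can be run in Node.js against sample data. This is a minimal sketch, assuming each deployment object exposes title, sha, and timestamp as the script above expects. The actions stub, the formatDeployments wrapper, and all sample values are hypothetical stand-ins for what Cortex provides at run time.

```javascript
// Hypothetical stand-in for the `actions` object that Cortex provides at run time.
const actions = {
  "list-deployments-for-entity": {
    outputs: {
      response: {
        deployments: [
          {
            title: "Deploy 42",
            sha: "0123456789abcdef0123456789abcdef01234567",
            timestamp: "2024-01-02T10:00:00Z",
          },
          {
            title: "Deploy 41",
            sha: "fedcba9876543210fedcba9876543210fedcba98",
            timestamp: "2024-01-01T09:00:00Z",
          },
        ],
      },
    },
  },
};

// Same logic as the "Format deployments" block, wrapped in a function for local testing.
function formatDeployments(actions) {
  const deployments =
    actions["list-deployments-for-entity"].outputs.response.deployments;
  // One dropdown label per deployment: title, abbreviated SHA, timestamp.
  const shasFormatted = deployments.map(
    (dep) => `${dep.title} (${dep.sha.slice(0, 7)}) - ${dep.timestamp}`
  );
  // Full SHAs, in the same order as the labels.
  const shas = deployments.map((dep) => dep.sha);
  return { totalDeployments: deployments.length, deployments, shas, shasFormatted };
}

const formatted = formatDeployments(actions);
console.log(formatted.shasFormatted);
// First label: "Deploy 42 (0123456) - 2024-01-02T10:00:00Z"
```

Because the labels and the raw SHAs are built from the same array, the entry at a given index in shasFormatted always corresponds to the entry at the same index in shas.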
User input
This block asks the user for the commit ID to roll back to, pulling the commit IDs from the previous step.
Click + in the center of the page. In the block library modal, select the User input block.
In the side panel, enter a name and unique slug for the block.
In this example, we use the name User input and the slug user-input.
Click +Add user input. Add the following:
Name: Commit ID to rollback to
Key: commit-id-to-rollback-to
Type: Select
Data source: Manual
Click Add input.
Path to override value: Enter actions.javascript.outputs.result.shasFormatted.
Save the block.
JavaScript
This block uses JavaScript to extract the SHA that was selected in the previous step.
Click + in the center of the page. In the block library modal, select the JavaScript block.
In the side panel, enter a name and unique slug for the block.
In this example, we use the name Extract SHA and the slug extract-sha.
In the JavaScript text editor box, enter an expression. Our example uses the following:
// The selected option from the User Input block
const selectedOption = actions['user-input'].outputs['commit-id-to-rollback-to']
// Extract the text between parentheses
const match = selectedOption.match(/\((.*?)\)/);
const sha = match ? match[1] : null;
return {
  selectedOption: selectedOption, // full string
  sha: sha // extracted commit SHA
};
Save the block.
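Two properties of this extraction are worth knowing before relying on it. The regular expression returns the first parenthesized text in the label, so a deployment title that itself contains parentheses would be matched instead of the SHA, and the value it returns is the 7-character abbreviated SHA rather than the full one. The sketch below demonstrates both and shows one possible alternative: looking up the full SHA by the position of the selected label. The resolveFullSha helper and the sample data are hypothetical; in a Workflow, the two arrays would presumably come from the shas and shasFormatted outputs of the Format deployments block.

```javascript
// Extraction as in the "Extract SHA" block: first text found between parentheses.
function extractSha(selectedOption) {
  const match = selectedOption.match(/\((.*?)\)/);
  return match ? match[1] : null;
}

// Hypothetical alternative: map the selected label back to its full SHA by index.
function resolveFullSha(selectedOption, shasFormatted, shas) {
  const index = shasFormatted.indexOf(selectedOption);
  return index === -1 ? null : shas[index];
}

// Sample data (made up for illustration).
const shas = ["0123456789abcdef0123456789abcdef01234567"];
const shasFormatted = ["Fix login (hotfix) (0123456) - 2024-01-02T10:00:00Z"];

console.log(extractSha("Deploy 42 (0123456) - 2024-01-02T10:00:00Z")); // "0123456"
console.log(extractSha(shasFormatted[0])); // "hotfix": the title's parentheses match first
console.log(resolveFullSha(shasFormatted[0], shasFormatted, shas)); // full 40-character SHA
```

Whether the abbreviated SHA is enough depends on what the rollback job does with it; Git usually resolves a unique 7-character prefix, but passing the full SHA removes the ambiguity.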
Trigger GitHub workflow
This block triggers a GitHub workflow to roll back the deployment.
Click + in the center of the page. In the block library modal, select the GitHub > Trigger workflow block.
In the side panel, enter a name and unique slug for the block. In this example, we use the name Trigger workflow and the slug trigger-workflow.
Configure the block:
Repository: Enter your repository name.
Ref: Enter the Git reference to trigger the workflow on, e.g. main.
Workflow ID or file name: Enter the ID of the GitHub workflow or the file name of the workflow file. In our example, we added rollback.yaml.
Inputs: Optionally enter key-value pairs to pass to the workflow. In our example, we added:
{
  "service": "{{context.entity.tag}}",
  "rollback_to": "{{actions.extract-sha.outputs.result.sha}}"
}
At the bottom of the side panel, click Save.
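For orientation, the block's fields line up with GitHub's REST endpoint for creating a workflow dispatch event, which takes a ref and an inputs object and only works for workflows that declare a workflow_dispatch trigger. The sketch below builds that request without sending it. That the Cortex block calls this endpoint is an assumption based on its action identifier, and buildDispatchRequest, the service name, and the token placeholder are hypothetical.

```javascript
// Build (but do not send) the GitHub REST request for a workflow dispatch event.
// Endpoint: POST /repos/{owner}/{repo}/actions/workflows/{workflow_id}/dispatches
function buildDispatchRequest({ repo, workflowId, ref, inputs, token }) {
  return {
    url: `https://api.github.com/repos/${repo}/actions/workflows/${encodeURIComponent(workflowId)}/dispatches`,
    method: "POST",
    headers: {
      Accept: "application/vnd.github+json",
      Authorization: `Bearer ${token}`,
    },
    // GitHub expects the Git ref plus the inputs declared by the workflow.
    body: JSON.stringify({ ref, inputs }),
  };
}

const request = buildDispatchRequest({
  repo: "cortex-workshops/deployment", // repository from the example YAML
  workflowId: "rollback.yaml",
  ref: "main",
  inputs: { service: "my-service", rollback_to: "0123456" }, // placeholder values
  token: "<your-token>", // placeholder; never hard-code real credentials
});

console.log(request.url);
console.log(request.body);
```

The input keys (service and rollback_to here) need to match the inputs that the target GitHub workflow declares under its workflow_dispatch trigger.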
Branch
During the first User input step, the user chooses whether to create a Slack channel for the incident. Their choice determines which path the Workflow follows after the Branch block.
To add the Branch block and its two paths:
Click + in the center of the page. In the block library modal, choose Branch.
In the block configuration side panel, enter a name and unique slug for this block.
In this example, we use the name Branch and the slug branch.
Click +Add path. Configure the conditional path:
Name: Create incident Slack channel
Slug: create-incident-slack-channel-path
Path expression:
actions["get-incident-info"].outputs["create-incident-slack-channel"] == true
Save the path.
Click +Add path. Configure the conditional path:
Name: Do not create Slack channel
Slug: do-not-create-slack-channel
Path expression:
actions["get-incident-info"].outputs["create-incident-slack-channel"] == false
Save the path.
Save the block.
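The two path expressions are complementary: for a boolean toggle value, exactly one of them is true. The sketch below mirrors them in plain JavaScript, assuming the expressions behave like JavaScript loose equality (how Cortex evaluates them is not confirmed here). The choosePath helper and the sample outputs are hypothetical.

```javascript
// Hypothetical helper that mirrors the two path expressions of the Branch block.
function choosePath(actions) {
  const createChannel =
    actions["get-incident-info"].outputs["create-incident-slack-channel"];
  if (createChannel == true) return "create-incident-slack-channel-path";
  if (createChannel == false) return "do-not-create-slack-channel";
  return null; // neither expression matches, e.g. if the value is missing
}

// Build a minimal stand-in for the outputs of the first User input block.
const withToggle = (value) => ({
  "get-incident-info": { outputs: { "create-incident-slack-channel": value } },
});

console.log(choosePath(withToggle(true)));  // "create-incident-slack-channel-path"
console.log(choosePath(withToggle(false))); // "do-not-create-slack-channel"
console.log(choosePath(withToggle(undefined))); // null
```

In the example Workflow the toggle is required and defaults to false, so a missing value should not occur and one of the two paths should always match.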
Add blocks to the "Create incident Slack channel" path
HTTP requests
Click + under the new path. In the block library modal, choose HTTP request.
In the block configuration side panel, enter a name and unique slug for this block.
In this example, we use the name Create incident Slack channel and the slug create-incident-slack-channel.
Configure the block:
HTTP method: POST
URL:
https://slack.com/api/conversations.create
Headers:
Content-Type: application/json
Authorization: Bearer {{context.secrets.apiKey}}
Payload: In our example, we enter the following:
{
"name": "incident-{{actions.get-incident-info.outputs.incident-name}}"
}
Save the block.
Click + under the path. In the block library modal, choose HTTP request.
In the block configuration side panel, enter a name and unique slug for this block. In this example, we use the name Send Slack message and the slug send-slack-message.
Configure the block:
HTTP method: POST
URL:
https://slack.com/api/chat.postMessage
Headers:
Content-Type: application/json
Authorization: Bearer {{context.secrets.apiKey}}
Payload: In our example, we enter the following:
{
"channel": "incident-{{actions.get-incident-info.outputs.incident-name}}",
"text": "Rolled back service {{context.entity.tag}} to commit ID {{actions.extract-sha.outputs.result.sha}}."
}
Save the block.
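Both payloads build the channel name directly from the free-text incident name. Slack channel names may contain only lowercase letters, numbers, hyphens, and underscores, and may be at most 80 characters long, so an incident name such as "API Outage" would be rejected as written. One option is to normalize the name first, for example in a JavaScript block placed before these HTTP requests. The following is a sketch of such a normalization; toSlackChannelName is a hypothetical helper, not part of the example Workflow.

```javascript
// Hypothetical helper: turn a free-text incident name into a valid Slack channel name.
function toSlackChannelName(incidentName) {
  const slug = incidentName
    .toLowerCase()
    .replace(/[^a-z0-9_-]+/g, "-") // replace runs of disallowed characters with a hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
  // Slack limits channel names to 80 characters.
  return `incident-${slug}`.slice(0, 80);
}

console.log(toSlackChannelName("API Outage: EU region")); // "incident-api-outage-eu-region"
console.log(toSlackChannelName("checkout_latency")); // "incident-checkout_latency"
```

If you adopt something like this, use the normalized value in both payloads so the message is sent to the channel that was actually created.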
Add block to the "Do not create Slack channel" path
Click + under the new path. In the block library modal, choose Slack > Send message.
In the block configuration side panel, enter a name and unique slug for this block.
In this example, we use the name Send FYI message and the slug send-fyi-message.
Configure the block:
Slack channel name: Select the channel where you want to send a message informing the team that you did not create a separate Slack channel for the incident.
Message text: In our example, we set this to:
Service {{context.entity.tag}} with incident does not have a separate Slack channel.
Save the block.
Step 3: Run the Workflow
At the top of the page, click Run Workflow.
When you run the Workflow, the following events happen:
The Workflow pauses at the first User input block to collect a response. The user enters a name and severity for the incident and chooses whether to create a Slack channel.
Some incidents require additional teamwork and collaboration, while others can be easily mitigated and may not require a dedicated channel.
The "List deployments for entity" block runs, fetching a list of deployments associated with the entity.
The JavaScript block runs, which formats the deployment data.
The next User input block runs, which uses the formatted data from the previous block to let the user select a commit ID to roll back to.
The second JavaScript block runs, which extracts the SHA from the output of the previous block.
The GitHub workflow is triggered to roll back the affected service.
The Branch block runs:
If the user chose to create a Slack channel: An HTTP request block creates an incident Slack channel named after the incident, and a second HTTP request block sends a message to that channel naming the service being rolled back and the commit ID.
If the user chose not to create a Slack channel: The "Do not create Slack channel" path runs. A Slack block sends a message to an existing team channel to let them know a separate incident channel was not created.