Incident Response in action
You've prepared your Cortex workspace for incidents. What can you do now to stay ahead of them, and what can you do when one does occur?
While Scorecards help you prevent incidents by ensuring standards are met, incidents can still occur in other ways. Read on to learn how you can promote the health of your data, trigger and handle incidents, and work through Root Cause Analysis with Cortex.

Incident prevention in action
Gain visibility with Engineering Intelligence
View metrics in Eng Intelligence to understand how well teams are performing during and after incidents.
Incident Response in action
Cortex provides full context of your services, allowing you to take action, quickly mitigate incidents, and work through Root Cause Analysis.
Trigger an incident
After integrating with an incident management tool, you can trigger an incident directly from Cortex while viewing an entity's details page:

This is supported for PagerDuty, incident.io, FireHydrant, Rootly, and xMatters.
View entities with active incidents
While viewing a catalog, quickly see which entities have active incidents:

Example Incident Response approaches with Cortex
The following examples demonstrate how Cortex can help you navigate an efficient incident response.
Working through Root Cause Analysis (RCA) with Cortex
The examples above demonstrate how engineering teams can use Cortex to quickly investigate and mitigate an incident. Teams can also leverage Cortex's unified view of service metadata while working through the Root Cause Analysis (RCA) after an incident:
Reconstruct what happened without digging across fragmented tools:
Review the event timeline on the entity details page to see which deploys and other changes occurred before and during the incident.
Review dependencies for the affected entity to gain insight into the impact and potential causes of the incident.
Review the service's reliability posture:
Is there a Scorecard in place to measure and enforce reliability? If not, you can implement one across all of your services to help prevent similar incidents in the future.
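The timeline review described above amounts to narrowing a list of deploys down to those that landed shortly before or during the incident. The following is a minimal sketch of that filtering logic in Python, not a Cortex API: the `Deploy` record, the `deploys_near_incident` helper, the two-hour lookback, and the sample data are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Deploy:
    # Hypothetical record shape for illustration; not a Cortex type.
    entity: str
    sha: str
    deployed_at: datetime


def deploys_near_incident(deploys, started_at, resolved_at,
                          lookback=timedelta(hours=2)):
    """Return deploys between (incident start - lookback) and incident
    resolution, oldest first. The lookback default is an assumption."""
    window_start = started_at - lookback
    return sorted(
        (d for d in deploys if window_start <= d.deployed_at <= resolved_at),
        key=lambda d: d.deployed_at,
    )


# Made-up sample data standing in for an entity's event timeline.
deploys = [
    Deploy("checkout-api", "a1b2c3", datetime(2024, 5, 1, 9, 0)),
    Deploy("checkout-api", "d4e5f6", datetime(2024, 5, 1, 13, 30)),
    Deploy("checkout-api", "0718ab", datetime(2024, 5, 1, 14, 20)),
    Deploy("checkout-api", "9c8d7e", datetime(2024, 5, 1, 18, 0)),
]

suspects = deploys_near_incident(
    deploys,
    started_at=datetime(2024, 5, 1, 14, 0),
    resolved_at=datetime(2024, 5, 1, 15, 0),
)

for d in suspects:
    print(d.deployed_at.isoformat(), d.sha)
```

With this sample data, only the 13:30 and 14:20 deploys fall inside the window, so those are the two changes worth examining first during the RCA.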