# Reduce MTTA

Mean Time to Acknowledge (MTTA) measures how quickly teams acknowledge an incident after it is triggered.
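To make the definition concrete, here is a minimal sketch of how MTTA could be computed from raw incident timestamps. The incident records and the helper names (`seconds_to_ack`, `mtta_seconds`) are hypothetical and exist only for illustration; they are not part of Cortex or any on-call provider's API.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (triggered_at, acknowledged_at) in ISO 8601.
incidents = [
    ("2024-05-01T10:00:00", "2024-05-01T10:02:00"),  # acknowledged after 120 s
    ("2024-05-01T14:30:00", "2024-05-01T14:34:00"),  # acknowledged after 240 s
    ("2024-05-02T03:15:00", "2024-05-02T03:24:00"),  # acknowledged after 540 s
]


def seconds_to_ack(triggered_at: str, acknowledged_at: str) -> float:
    """Return the seconds elapsed between trigger and first acknowledgment."""
    triggered = datetime.fromisoformat(triggered_at)
    acknowledged = datetime.fromisoformat(acknowledged_at)
    return (acknowledged - triggered).total_seconds()


def mtta_seconds(records) -> float:
    """MTTA is the average of the per-incident acknowledgment delays."""
    return mean(seconds_to_ack(t, a) for t, a in records)


print(mtta_seconds(incidents))  # 300.0
```

In this sample, the three delays (120, 240, and 540 seconds) average to 300 seconds, so a single slow acknowledgment noticeably raises the overall figure.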

Reducing MTTA depends on:

* Operational readiness: Ensuring the right people are reachable at all times.
* Response behavior: Tracking and improving how fast incidents are acknowledged once triggered.

A well-designed Scorecard shouldn't just track MTTA as a number; **it should validate the conditions that enable low MTTA**, such as on-call setup, contact methods, and escalation policy depth.

## Best practices

When creating a Scorecard aimed at reducing MTTA, follow these best practices:

* Group rules by functional area (e.g., Incident Response, Monitoring, Reliability) to simplify assessment.
* Keep evaluation windows aligned so that related signals trend together.
* Enable Cortex notifications for drops in overall Scorecard scores, so that teams are prompted to review and act.

## Rules that focus on reducing MTTA

<table><thead><tr><th width="194.90106201171875">Category</th><th>Purpose</th><th>Example CQL expression</th></tr></thead><tbody><tr><td>On-call configuration</td><td>Ensure service has an active PagerDuty schedule</td><td><code>oncall != null</code></td></tr><tr><td>Contact reliability</td><td>Verify responders can be reached through multiple channels</td><td><code>oncall.usersWithoutContactMethods(allowed=["EMAIL","PHONE","PUSH_NOTIFICATION","SMS"]) == 0</code></td></tr><tr><td>Escalation depth</td><td>Require at least two escalation tiers</td><td><code>oncall.numOfEscalations() >= 2</code></td></tr><tr><td>Acknowledgment time</td><td>Track MTTA against defined thresholds</td><td><code>jq(oncall.analysis(...), '.meanSecondsToFirstAck &#x3C;= 300')</code></td></tr><tr><td>Monitoring coverage</td><td>Ensure critical services have active alerting rules</td><td><code>datadog.monitors().length > 0</code></td></tr><tr><td>SLO tracking</td><td>Validate that a service has defined SLOs and error budgets</td><td><code>slos().any((slo) => slo.name.matchesIn(".*Uptime.*"))</code></td></tr><tr><td>Ownership coverage</td><td>Require that every service has a defined owning team</td><td><code>ownership != null</code></td></tr><tr><td>Communication channel is set</td><td>Require that every service has a defined communication channel</td><td><code>slack != null and slack.numOfMembers() > 0</code></td></tr><tr><td>Entity was verified in the last 90 days</td><td>Require entity information, including ownership, on-call, and Slack to be verified in the last 90 days</td><td><code>verifications().lastVerifiedAt() != null and verifications().lastVerifiedAt().fromNow() > duration("P-90D")</code></td></tr><tr><td>Entity does not have pending verifications</td><td>Ensure entity does not have any pending verifications</td><td><code>verifications().verifications().any(verification => verification.status == "PENDING") == 0</code></td></tr></tbody></table>

## Examples from real Cortex users

The following anonymized examples come from real use cases that our customers are solving with Cortex.

### Event Readiness Scorecard

Companies with a busy season (e.g., the period around Black Friday) might create a [seasonal readiness Scorecard](/guides/production-readiness/seasonal-readiness-scorecard.md) in Cortex. The following strategy ties performance metrics directly to readiness controls:

* Track MTTA < 120 seconds for P1 and P2 incidents
* Require two-tier escalation policies in PagerDuty: \
  `oncall.numOfEscalations() >= 2`
* Validate that on-call users have valid contact methods configured: \
  `oncall.usersWithoutContactMethods(...) == 0`
* Combine outcome metrics (MTTA) with configuration checks to ensure teams can meet targets consistently.
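The idea of pairing an outcome metric with configuration checks can be sketched outside of Cortex as follows. This is an illustration only: `ServiceSnapshot`, `readiness_report`, and the sample data are hypothetical and do not correspond to Cortex types or CQL functions. The 120-second target and the P1/P2 filter come from the strategy above.

```python
from dataclasses import dataclass
from statistics import mean

MTTA_TARGET_SECONDS = 120          # threshold from the strategy above
TRACKED_PRIORITIES = {"P1", "P2"}  # only high-priority incidents count


@dataclass
class ServiceSnapshot:
    """Hypothetical export of one service's on-call data (not a Cortex type)."""
    name: str
    escalation_tiers: int
    users_without_contact_methods: int
    ack_seconds_by_priority: list  # list of (priority, seconds_to_ack) pairs


def readiness_report(svc: ServiceSnapshot) -> dict:
    """Evaluate the outcome metric and the configuration checks side by side."""
    tracked = [s for p, s in svc.ack_seconds_by_priority if p in TRACKED_PRIORITIES]
    return {
        # With no tracked incidents there is nothing to fail on yet.
        "mtta_within_target": not tracked or mean(tracked) < MTTA_TARGET_SECONDS,
        "two_tier_escalation": svc.escalation_tiers >= 2,
        "all_users_reachable": svc.users_without_contact_methods == 0,
    }


checkout = ServiceSnapshot(
    name="checkout",
    escalation_tiers=1,
    users_without_contact_methods=0,
    ack_seconds_by_priority=[("P1", 90), ("P2", 110), ("P3", 900)],
)
report = readiness_report(checkout)
print(report)
print("ready:", all(report.values()))
```

In this sample, the service meets the MTTA target today but has only one escalation tier, so it is not considered ready. That is the kind of gap a number-only Scorecard would miss.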

### On-call Configuration Scorecard

The following strategy ensures every service can be reached before an incident occurs, eliminating the common MTTA outliers caused by misconfigured alerts:

* Verify that on-call rotations exist: \
  `oncall != null`
* Validate that on-call users have valid contact methods configured:\
  `oncall.usersWithoutContactMethods(...) == 0`
* Flag services without assigned responders

This CQL applies to all on-call integrations: [Opsgenie](/ingesting-data-into-cortex/integrations/opsgenie.md), [PagerDuty](/ingesting-data-into-cortex/integrations/pagerduty.md), [Splunk On-Call (formerly VictorOps)](/ingesting-data-into-cortex/integrations/splunk-oncall.md), and [xMatters](/ingesting-data-into-cortex/integrations/xmatters.md).

### Operational Maturity Scorecard

The previous examples apply to services and other entities. Some organizations prefer to track team-level operational maturity, with incident management as one area of assessment.

The following strategy treats low MTTA as part of broader operational maturity, not as an isolated performance goal:

* Measure operational behaviors like post-incident reviews, ownership clarity, and alert hygiene.
* Focus on consistent response patterns rather than single-point metrics.
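To illustrate why a single-point metric can hide inconsistent response patterns, the sketch below compares two hypothetical teams whose mean acknowledgment time is identical. The data, the `summarize` helper, and the 300-second target are assumptions made for this example; none of them comes from a Cortex API.

```python
from statistics import mean, median

ACK_TARGET_SECONDS = 300  # assumed target, chosen only for this example

# Hypothetical acknowledgment delays in seconds for two teams.
team_a = [60, 70, 80, 90, 100, 110, 120, 130, 140, 1100]       # one severe outlier
team_b = [180, 190, 195, 200, 200, 200, 205, 210, 210, 210]    # steady responses


def summarize(delays: list) -> dict:
    """Report the mean alongside measures that expose response consistency."""
    return {
        "mean": mean(delays),
        "median": median(delays),
        "worst": max(delays),
        "share_within_target": sum(d <= ACK_TARGET_SECONDS for d in delays) / len(delays),
    }


for name, delays in (("team_a", team_a), ("team_b", team_b)):
    print(name, summarize(delays))
```

Both teams average 200 seconds, yet one team has an acknowledgment that took 1,100 seconds while the other never exceeds 210. Looking at the median, the worst case, and the share of incidents within target gives a fuller picture than the mean alone.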

