Prometheus

Overview

Prometheus is an open-source monitoring and analytics platform that allows customers to analyze, visualize, automate, and alert on metrics data.

Integrating Cortex with Prometheus allows you to:

  • View SLO information from Prometheus on entity pages in Cortex

  • Create Scorecards that track progress and drive alignment on projects involving Prometheus SLOs

How to configure Prometheus with Cortex

There are two options for integrating Prometheus: the default configuration method and Cortex Axon Relay, a relay broker that lets you securely connect your on-premises Prometheus data.

Prerequisite

Before getting started, set up basic authentication credentials in Prometheus.
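One way to confirm the credentials work before entering them in Cortex is to build a basic-auth request against the Prometheus HTTP API. The sketch below is a minimal illustration, not part of the Cortex setup: the host, username, and password are placeholders, and the helper name `build_prometheus_check` is hypothetical. It only constructs the request; sending it is left as a commented-out line.

```python
import base64
import urllib.parse
import urllib.request


def build_prometheus_check(host: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a basic-auth request to the Prometheus query API."""
    # /api/v1/query is the Prometheus instant-query endpoint; `up` is a built-in metric.
    query = urllib.parse.urlencode({"query": "up"})
    url = f"{host.rstrip('/')}/api/v1/query?{query}"
    # HTTP basic auth: base64("username:password") in the Authorization header.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


# Placeholder values (assumptions); replace them with your own host and credentials.
req = build_prometheus_check("https://prometheus.example.com", "cortex", "s3cret")

# To actually send the request:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```

If the request is rejected with an authentication error, recheck the credentials before saving them in the integration form.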

Configure the integration in Cortex

  1. In Cortex, navigate to the Prometheus settings page:

    1. In Cortex, click your avatar in the lower left corner, then click Settings.

    2. Under "Integrations", click Prometheus.

  2. Click Add integration.

  3. Configure the Prometheus integration form:

    • Account alias: Enter your account alias.

    • Username and Password: Enter your Prometheus basic auth credentials.

    • Host: Enter your self-managed Prometheus hostname.

    • Tenant ID: Optionally, enter your tenant ID.

      • If you have multiple tenants, you can enter an ID here to monitor a specific tenant.

  4. Click Save.

Configure the integration for multiple Prometheus accounts

The Prometheus integration has multi-account support. You can add a configuration for each additional account by repeating the process above.

Each configuration requires an alias, which Cortex uses to correlate the designated instance with registrations for various entities. Registrations can also use a default configuration without a listed alias. You can edit aliases and default configurations from the Prometheus page in your Cortex settings: select the edit icon next to a given configuration and toggle Set as default on. If you only have one configuration, it is automatically set as the default.
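The alias and default rules can be modeled in a few lines. This is an illustrative sketch, not Cortex's implementation: the `PrometheusConfig` class and `resolve_config` function are hypothetical names, and raising an error for an unknown alias or a missing default is an assumption about behavior the text above does not describe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrometheusConfig:
    alias: str
    host: str
    is_default: bool = False


def resolve_config(configs: list[PrometheusConfig], alias: Optional[str] = None) -> PrometheusConfig:
    """Pick the configuration an SLO registration would use."""
    if alias is not None:
        # A registration that names an alias is tied to that configuration.
        for cfg in configs:
            if cfg.alias == alias:
                return cfg
        # Assumption: an unknown alias is treated as an error.
        raise LookupError(f"no Prometheus configuration with alias {alias!r}")
    # A single configuration is automatically the default.
    if len(configs) == 1:
        return configs[0]
    # Otherwise use the configuration marked "Set as default".
    for cfg in configs:
        if cfg.is_default:
            return cfg
    raise LookupError("no default Prometheus configuration is set")


configs = [
    PrometheusConfig("prod", "https://prom-prod.example.com", is_default=True),
    PrometheusConfig("staging", "https://prom-staging.example.com"),
]
staging = resolve_config(configs, alias="staging")  # explicit alias
fallback = resolve_config(configs)                  # no alias: default is used
```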

Connecting Cortex entities to Prometheus

Linking SLOs in Cortex

You can create and manage SLOs by listing the relevant SLI queries in the entity descriptor under x-cortex-slos:

x-cortex-slos:
  prometheus:
    - errorQuery: sum(rate(http_server_requests_seconds_count{code=~"(5..|429)"}[5m]))
      totalQuery: sum(rate(http_server_requests_seconds_count[5m]))
      slo: 99.95
      alias: my-prometheus-instance # alias is optional and only relevant if you have opted into multi account support
      name: my-slo-name

  • errorQuery: Query that indicates error events for your metric.

  • totalQuery: Query that indicates all events to be considered for your metric.

  • slo: Target number for SLO.

  • alias: Ties the SLO registration to a Prometheus instance listed under Settings → Prometheus. The alias parameter is optional, but if not provided the SLO will use the default configuration under Settings → Prometheus.

  • name: The SLO's name in Prometheus. The name parameter is optional.

How Cortex calculates Prometheus SLOs

When Cortex gets an SLO from Prometheus, it evaluates the following query for it:

(1 - ({errorQuery}) / ({totalQuery}))

This value is resolved on a 1-hour window with a 7-day lookback. Cortex averages the value for each 1-hour window, then averages those hourly averages across the lookback period, before displaying the result in your Cortex workspace.

The value is updated when the entity page is loaded and when Scorecards are evaluated.

View Prometheus data in entity pages

When an SLO is defined in an entity's descriptor, you'll see detailed data about SLOs in the Monitoring page in the sidebar of an entity details page. The page shows the SLO query, the target, the current value for each SLO, and the period of time over which the SLO is calculated. For example, if the time listed is "7 days ago," then the SLO covers the time range from 7 days ago to now.

Scorecards and CQL

With the Prometheus integration, you can create Scorecard rules and write CQL queries based on Prometheus SLOs.

See more examples in the CQL Explorer in Cortex.

SLOs

SLOs associated with the entity via ID or tags. You can use this data to check whether an entity has SLOs associated with it, and if those SLOs are passing.

Definition: slos: List<SLO>

Examples

In a Scorecard, you can use this expression to make sure an entity is passing its SLOs:

slos().all((slo) => slo.passing) == true

Use this expression to make sure the latency Service Level Indicator (SLI) value is at least 99.99%:

slos().filter((slo) => slo.name.matchesIn("latency") and slo.sliValue >= 0.9999).length > 0
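To make the logic of the two rules concrete, here is a rough Python model of what they check. It is only an analogy under stated assumptions: the `SLO` class and function names are hypothetical, `matchesIn` is assumed to behave like a regular-expression search within the name, and Python's `all()` returns True for an empty list, which may not match how CQL treats an entity with no SLOs.

```python
import re
from dataclasses import dataclass


@dataclass
class SLO:
    name: str
    sli_value: float  # a fraction such as 0.9999, not a percentage
    passing: bool


def all_passing(slos: list[SLO]) -> bool:
    """Model of: slos().all((slo) => slo.passing) == true"""
    return all(slo.passing for slo in slos)


def has_latency_slo_at_or_above(slos: list[SLO], threshold: float = 0.9999) -> bool:
    """Model of the filter rule: at least one latency SLO meets the threshold."""
    matches = [
        slo for slo in slos
        if re.search("latency", slo.name) and slo.sli_value >= threshold
    ]
    return len(matches) > 0


# Made-up SLOs for illustration.
slos = [
    SLO("api-latency", 0.99995, passing=True),
    SLO("api-availability", 0.9990, passing=False),
]
everything_passes = all_passing(slos)           # one SLO is failing
latency_ok = has_latency_slo_at_or_above(slos)  # api-latency meets the threshold
```

Note that the second rule passes as soon as one matching SLO meets the threshold; it does not require every latency SLO to do so.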

Still need help?

The following options are available to get assistance from the Cortex Customer Engineering team:

  • Email: [email protected], or open a support ticket in the in-app Resource Center

  • Chat: Available in the Resource Center

  • Slack: Users with a connected Slack channel will have a workflow added to their account. From here, you can either @CortexTechnicalSupport or add a :ticket: reaction to a question in Slack, and the team will respond directly.

Don’t have a Slack channel? Talk with your Customer Success Manager.
