
Datadog


Datadog is an application performance monitoring platform that provides real-time observability into entities, servers, databases, and tools, giving developers a comprehensive view of their infrastructure and helping them identify areas for improvement.

Cortex is uniquely equipped to augment Datadog's tools, providing greater visibility into your entities. In this guide, you'll learn how to set up the Datadog integration to pull in the following data for your entities:

  • Monitors
  • SLOs
  • Dependencies

Setup and configuration

Getting started

In order to connect Cortex to your Datadog instance, you’ll need to create both a Datadog application key and an API key.

You can create both keys from the Organization Settings page in Datadog. You can add an application key from the Application Keys tab, and you can create an API key from the API Keys or Client Tokens tab.

Configuration

Once you've created both keys, you can create a configuration from the Datadog page in Settings.

From the Settings page, select Add Datadog configuration. This will open a modal window where you can enter details about the config.

  • Account alias: The name that Cortex will associate a given configuration with.
  • App key: The application key you generated in Datadog.
  • API key: The API key you created in Datadog.
  • Region: Dropdown with available Datadog regions.
  • Custom subdomain: Corresponds to a custom subdomain for your Datadog instance; this only takes the subdomain, not the entire URL. For example, this field would take cortex-docs from https://cortex-docs.datadoghq.com.
  • Environments: Optional environment tags for Datadog entities.

caution

If you do not see the settings page you're looking for, you likely don't have the proper permissions and need to contact your admin.
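The Custom subdomain field described above takes only the first label of the hostname, not the full URL. As a minimal sketch of that extraction (the helper name `custom_subdomain` is hypothetical and is not part of Cortex or Datadog), something like the following could be used to derive the value from a URL you already have:

```python
from urllib.parse import urlparse


def custom_subdomain(url: str) -> str:
    """Return the first label of the URL's hostname (hypothetical helper)."""
    host = urlparse(url).hostname
    if not host:
        raise ValueError(f"no hostname found in {url!r}")
    # "cortex-docs.datadoghq.com" -> "cortex-docs"
    return host.split(".")[0]


print(custom_subdomain("https://cortex-docs.datadoghq.com"))  # cortex-docs
```

The sketch does not check whether the label is actually a custom subdomain, so a standard host such as app.datadoghq.com would simply yield "app".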

The Datadog integration has multi-account support, so you can add a configuration for each additional account by repeating the above process.

Each configuration requires an alias, which Cortex uses to correlate the designated configuration with registrations for various entities. Registrations can also use a default configuration without a listed alias.

You can edit aliases and default configurations from the Datadog page in settings. Select the edit icon next to a given configuration and toggle "Set as default" on. If you only have one configuration, it will automatically be set as the default.
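The alias and default rules above can be modeled roughly as follows. This is an illustrative sketch of the described behavior, not Cortex's implementation; the `DatadogConfig` class and `resolve_config` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatadogConfig:
    alias: str
    is_default: bool = False


def resolve_config(configs: list[DatadogConfig], alias: Optional[str]) -> DatadogConfig:
    """Pick the configuration for a registration (illustrative model only)."""
    if alias is not None:
        for cfg in configs:
            if cfg.alias == alias:
                return cfg
        raise KeyError(f"no Datadog configuration with alias {alias!r}")
    # A single configuration is treated as the default automatically.
    if len(configs) == 1:
        return configs[0]
    for cfg in configs:
        if cfg.is_default:
            return cfg
    raise LookupError("no default Datadog configuration is set")


configs = [DatadogConfig("my-default-alias", is_default=True), DatadogConfig("my-other-alias")]
print(resolve_config(configs, None).alias)              # my-default-alias
print(resolve_config(configs, "my-other-alias").alias)  # my-other-alias
```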

Registration

Discovery

By default, Cortex will use the entity tag (e.g. my-entity) as the "best guess" for the Datadog tag. For example, if your entity tag is my-entity, then the corresponding tag in Datadog should also be my-entity.

If your Datadog tags don’t cleanly match the Cortex entity tag, you can override this in the Cortex entity descriptor.

Entity descriptor

If you need to override automatic discovery, you can define the following block in your Cortex entity descriptor.

x-cortex-apm:
  datadog:
    serviceTags:
      - tag: entity
        value: brain
        alias: my-default-alias
      - tag: entity
        value: cerebrum
        alias: my-other-alias

| Field | Description | Required |
| --- | --- | --- |
| tag | Tag for the project in Datadog | |
| value | Value for the project; Cortex will find monitors and SLOs by querying tag:value OR tag:value2 ... | |
| alias | Alias for the Datadog configuration in Cortex (only relevant if you have opted into multi-account support) | |

tip

These tags are used to "discover" your monitors and SLOs. Cortex will find monitors and SLOs by querying tag:value OR tag:value2 ...
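To make the "tag:value OR tag:value2" form concrete, here is a small sketch that joins serviceTags entries into that kind of string. The function name `discovery_query` is hypothetical, the alias field is ignored for simplicity, and the exact query Cortex sends to Datadog is not specified beyond the form shown above.

```python
def discovery_query(service_tags: list[dict[str, str]]) -> str:
    """Join tag/value pairs into a 'tag:value OR tag:value2' style string."""
    return " OR ".join(f"{t['tag']}:{t['value']}" for t in service_tags)


service_tags = [
    {"tag": "entity", "value": "brain"},
    {"tag": "entity", "value": "cerebrum"},
]
print(discovery_query(service_tags))  # entity:brain OR entity:cerebrum
```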

If you want to hard code and/or override discovery, you can define a monitors or SLOs block in the entity descriptor.

Monitors and SLOs

Adding monitors lets you see information about their current status directly from a catalog - via the Monitors column - and under the Operations section of an entity page. You can find your monitors from Datadog's Manage Monitors page.

The ID of a monitor is found in the URL when you click on a monitor in your Datadog dashboard, for example https://app.datadoghq.com/monitors/<MONITOR_ID>.

info:
  x-cortex-apm:
    datadog:
      monitors:
        - id: 12345
          alias: my-default-alias
        - id: 67890
          alias: my-other-alias

Like monitors, Datadog SLOs can be found in the Operations section of an entity page. You can find the SLOs for your instance on Datadog's SLO status page.

The ID for the SLO can be found in the URL when you click on an SLO in the Datadog dashboard. For example, https://app.datadoghq.com/slo?slo_id=<SLO_ID>&timeframe=7d&tab=status_and_history.

info:
  x-cortex-slos:
    datadog:
      - id: 0b73859a3e2504bf09ad23a161702654
        alias: my-default-alias
      - id: 228499184a9efe34d4e4e9df838c7fa1
        alias: my-other-alias

Monitors and SLOs have the same field definitions.

| Field | Description | Required |
| --- | --- | --- |
| id | Datadog ID for the monitor or SLO | |
| alias | Alias for the configuration in Cortex (only needed if you have opted into multi-account support) | |
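If you are copying IDs out of Datadog URLs by hand, a short script can do the same thing. The sketch below assumes the two URL shapes described in this section (the monitor ID as the last path segment, and the SLO ID in the slo_id query parameter); the function names are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def monitor_id_from_url(url: str) -> int:
    """Read the monitor ID from the last path segment of a monitor URL."""
    path = urlparse(url).path.rstrip("/")
    return int(path.rsplit("/", 1)[-1])


def slo_id_from_url(url: str) -> str:
    """Read the slo_id query parameter from an SLO URL."""
    params = parse_qs(urlparse(url).query)
    return params["slo_id"][0]


print(monitor_id_from_url("https://app.datadoghq.com/monitors/12345"))  # 12345
print(slo_id_from_url(
    "https://app.datadoghq.com/slo?slo_id=0b73859a3e2504bf09ad23a161702654"
    "&timeframe=7d&tab=status_and_history"
))  # 0b73859a3e2504bf09ad23a161702654
```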

Dependency mapping

Cortex automatically syncs dependencies from Datadog's Service Map, using the entity identifier (x-cortex-tag) to map entities found in the Service Map.

The relationships Cortex discovers through the integration will feed directly into the Relationships graph, so you can easily visualize the connections between your entities.

If you have two entities - for example, entity-one and entity-two - that have a dependency edge in Datadog's Service Map, both entities should exist in Cortex with the same entity identifiers.

If the identifiers differ, you can override this by defining entity tags in the entity descriptor, where tag is entity and value is the entity's name in Datadog's Service Map.

caution

If the entity tag in Cortex does not exactly match the entity identifier in Datadog, the dependencies will not automatically sync. You can override automatic discovery by defining values in the entity descriptor.
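The matching rule can be pictured with a small model: an edge from the Service Map is kept only when both ends resolve to a Cortex entity, either by an exact name match or through an explicit override. This is a sketch of the behavior described above under those assumptions, not Cortex's actual sync logic; `map_edges` and its parameters are hypothetical.

```python
def map_edges(edges, entity_tags, overrides=None):
    """Translate Service Map edges into Cortex entity edges (illustrative).

    edges: iterable of (caller, callee) names from the Service Map.
    entity_tags: set of Cortex entity tags.
    overrides: optional {Datadog name: Cortex entity tag} mapping.
    """
    overrides = overrides or {}

    def to_entity(name):
        # An explicit override wins; otherwise the names must match exactly.
        if name in overrides:
            return overrides[name]
        return name if name in entity_tags else None

    mapped = []
    for caller, callee in edges:
        source, target = to_entity(caller), to_entity(callee)
        if source is not None and target is not None:
            mapped.append((source, target))
    return mapped


edges = [("entity-one", "entity-two"), ("dd-billing", "entity-two")]
tags = {"entity-one", "entity-two", "billing"}
print(map_edges(edges, tags))                             # [('entity-one', 'entity-two')]
print(map_edges(edges, tags, {"dd-billing": "billing"}))  # adds ('billing', 'entity-two')
```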

Expected results

Entity pages

With the Datadog integration, you'll be able to find monitors and SLOs on an entity's home page. High-level information about monitors and SLOs will appear in the Overview tab.

In the Operations tab, you can find more detailed data about both monitors and SLOs. Both sections include Pass and Fail blocks; you can also see Warning and No Data blocks for monitors.

Clicking any block with a nonzero value will open a modal with more detailed information. The monitor modals will list all monitors with the applicable status. The SLO modals will also display targets for each SLO that is passing or failing.

From the Integrations tab in the sidebar, you can open the Datadog page to find all SLOs and monitors. The SLOs column will show each SLO, its target(s), and the current value for that entity. The Monitors column will show the title for each monitor, its query (if available), and a tag that indicates whether the entity is passing, failing, has a warning, or has no data.

Scorecards and CQL

With the Datadog integration, you can create Scorecard rules and write CQL queries based on Datadog metrics, monitors, and SLOs.

You can read more about Datadog's metrics and custom metrics in their docs.

Metrics

Timeseries data from Datadog.

  • Metric

  • Timestamp

    Definition: datadog.metrics(query: Text, lookback: Duration, alias: Text | Null)

Example

You can use the datadog.metrics() expression to evaluate the health of your entities in a Scorecard:

datadog.metrics(query="system.cpu.usage{service:" + datadog.serviceNames().join(" OR service:") + "}",lookback=duration("P2D")).averageBy((point) => point.metricValue) < 0.10

This rule makes sure that a given entity's average CPU usage is less than 10% over the last two days.
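To see what the CQL expression assembles, the following Python sketch mirrors its two steps: building the query string by joining service names, then averaging the returned points and comparing against the threshold. It only models the logic; the function names and sample values are made up for illustration, and no Datadog API is called.

```python
from statistics import mean


def cpu_query(service_names: list[str]) -> str:
    """Mirror the string concatenation used in the CQL example."""
    return "system.cpu.usage{service:" + " OR service:".join(service_names) + "}"


def passes_cpu_rule(metric_values: list[float], threshold: float = 0.10) -> bool:
    """Average the points and compare against the threshold."""
    return mean(metric_values) < threshold


print(cpu_query(["brain", "cerebrum"]))
# system.cpu.usage{service:brain OR service:cerebrum}
print(passes_cpu_rule([0.04, 0.08, 0.12]))  # True
```

Note that `statistics.mean` raises an error on an empty list, so a real check would need to decide how to treat a query that returns no points.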

Monitors

Monitors associated with a given entity via ID or tags. You can use this data to check whether an entity has monitors associated with it, or whether an entity has the right types of monitors.

  • Created at

  • Creator email

  • Creator name

  • Name

  • Overall state

  • Query

  • Tags

  • URL

    Definition: datadog.monitors()

Example

For a Scorecard focused on operational maturity, this expression can be used to make sure an entity has at least one Datadog monitor set up:

datadog.monitors().length >= 1

SLOs

SLOs associated with a given entity via ID or tags. You can use this data to check whether an entity has SLOs associated with it and if those SLOs are passing.

  • History

  • ID

  • Name

  • Operation

  • Remaining budget

  • SLI value

    • Datum
    • Timeseries
  • SLO target

  • Source

  • Thresholds

    • Name
    • Threshold

    Definition: slos()

Examples

For a Scorecard focused on operational maturity, this expression can be used to make sure an entity has associated SLOs in Datadog:

slos().length > 0

This rule checks that at least one SLO is set up. While this rule makes sense in a Scorecard's first level, a rule checking the status of the SLO would make sense in a higher level:

slos().all((slo) => slo.passing)

An entity will pass this rule if all SLOs associated with it have "passing" status.
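The two rule levels can be modeled in Python as below. The `Slo` class is a hypothetical stand-in with only the fields needed here, and the sketch is not a statement about how CQL evaluates these expressions.

```python
from dataclasses import dataclass


@dataclass
class Slo:
    name: str
    passing: bool


def has_slos(slos: list[Slo]) -> bool:
    """Model of the first-level rule: at least one SLO exists."""
    return len(slos) > 0


def all_passing(slos: list[Slo]) -> bool:
    """Model of the higher-level rule: every SLO is passing."""
    return all(slo.passing for slo in slos)


slos = [Slo("availability", True), Slo("latency", False)]
print(has_slos(slos))     # True
print(all_passing(slos))  # False
```

In this Python model, `all()` over an empty list returns True, so the second check alone would not catch an entity with no SLOs. Whether CQL behaves the same way is not confirmed here, but keeping the existence rule in an earlier level avoids depending on it.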

Discovery audit

Cortex will pull recent changes from your Datadog environment into the Discovery audit. Here, you can find new entities in Datadog that have not been imported into the catalog - these will have the tag New APM Resource - as well as entities in the catalog that no longer exist in Datadog - these will have the tag APM Resource Not Detected.
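Conceptually, the two audit tags correspond to the two directions of a set difference. The sketch below is a simplified model with hypothetical names; in particular, it ignores history, so it cannot tell an entity that "no longer" exists in Datadog from one that was never there.

```python
def discovery_audit(datadog_services: set[str], catalog_entities: set[str]) -> dict[str, set[str]]:
    """Classify names by which side they are missing from (simplified model)."""
    return {
        "New APM Resource": datadog_services - catalog_entities,
        "APM Resource Not Detected": catalog_entities - datadog_services,
    }


result = discovery_audit({"brain", "cerebrum"}, {"brain", "cortex"})
print(result["New APM Resource"])           # {'cerebrum'}
print(result["APM Resource Not Detected"])  # {'cortex'}
```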

Background sync

The dependency sync runs automatically each day at 12 a.m. UTC, and can be run manually via the Sync dependencies button.
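If you want to know how long a change in Datadog may wait before the next automatic sync, the next 12 a.m. UTC can be computed as in this sketch. The function name is hypothetical, and the input must be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def next_dependency_sync(now: datetime) -> datetime:
    """Return the next 00:00 UTC strictly after `now` (timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    midnight = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)


now = datetime(2024, 1, 15, 18, 30, tzinfo=timezone.utc)
print(next_dependency_sync(now))  # 2024-01-16 00:00:00+00:00
```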

FAQs and troubleshooting

Can I set a Scorecard rule to monitor Datadog monitors/SLOs based on tags?

Yes, you can specify key-value pairs that allow Cortex to discover your SLOs and monitors, and use these tags in Scorecard rules.

How does Datadog work with other dependency sources?

When you use multiple dependency sources, such as Datadog and a catalog entity's YAML, Cortex merges all the sources together and de-duplicates the dependencies.

For example, if an entity YAML indicates X → Y and Datadog indicates X → Y and X → Z, the entity will display two edges presented as X → Y and X → Z.
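The merge in this example amounts to a union of edges with duplicates dropped. A minimal sketch, assuming each dependency is represented as a (source, target) pair; `merge_dependencies` is a hypothetical name, not a Cortex function:

```python
def merge_dependencies(*sources):
    """Union edges from several sources, keeping first-seen order."""
    seen = set()
    merged = []
    for edges in sources:
        for edge in edges:
            if edge not in seen:
                seen.add(edge)
                merged.append(edge)
    return merged


yaml_edges = [("X", "Y")]
datadog_edges = [("X", "Y"), ("X", "Z")]
print(merge_dependencies(yaml_edges, datadog_edges))  # [('X', 'Y'), ('X', 'Z')]
```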

Still need help?

The following are all the ways to get assistance from our customer engineering team. Please use the option that is best for your users:

  • Email: help@cortex.io, or open a support ticket in the in-app Resource Center
  • Chat: Available in the Resource Center
  • Slack: Users with a connected Slack channel will have a workflow added to their account. From here, you can either @CortexTechnicalSupport or add a :ticket: reaction to a question in Slack, and the team will respond directly.

Don’t have a Slack channel? Talk with your customer success manager.