Kubernetes
Kubernetes is a container orchestration system that automates software deployment, scaling, and management. By integrating Kubernetes with Cortex, you can gain deep visibility into your infrastructure and how services are actually deployed.
Setup and configuration
Cortex k8s agent
To connect Cortex to your Kubernetes instance, install the Cortex k8s agent in your Kubernetes cluster. The agent is lightweight and has negligible impact on your cluster.
The k8s agent collects information from your cluster, like the current list of deployments, and uses a Cortex API key to send the information back to Cortex where it is exposed in catalogs, Scorecards, and other tools.
Before you begin, email the Cortex customer engineering team for the Helm chart used for deployment and a username/password.
- Create a Docker image pull secret:
kubectl create secret docker-registry cortex-docker-registry-secret \
--docker-server=ghcr.io \
--docker-username={provided by Cortex} \
--docker-password={token provided to you by the Cortex team} \
--docker-email={email address}
- Generate and copy a Cortex API key from the API keys page in Authentication and access settings. The API key should have the User (edit catalog entities) role at a minimum.
- Run the following command with the generated API key to create a secret in your cluster:
kubectl create secret generic cortex-key --from-literal api-key={your API key}
- Install the Helm chart provided by Cortex with the following command:
helm install {any name to assign to the installed helm chart} ./helm-chart
Note that these instructions are also available from Kubernetes settings in Cortex.
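If you want to script these steps, the sketch below assembles the same three commands as argument lists in Python. It only builds and prints the commands; the function name and the sample values are placeholders for illustration, not part of any Cortex tooling.

```python
import shlex

def build_install_commands(registry_user, registry_token, email, api_key, release_name):
    """Return the three setup commands from the steps above as argv lists."""
    pull_secret = [
        "kubectl", "create", "secret", "docker-registry",
        "cortex-docker-registry-secret",
        "--docker-server=ghcr.io",
        f"--docker-username={registry_user}",
        f"--docker-password={registry_token}",
        f"--docker-email={email}",
    ]
    api_key_secret = [
        "kubectl", "create", "secret", "generic", "cortex-key",
        "--from-literal", f"api-key={api_key}",
    ]
    helm_install = ["helm", "install", release_name, "./helm-chart"]
    return [pull_secret, api_key_secret, helm_install]

# Placeholder values only; substitute the credentials provided by Cortex.
commands = build_install_commands(
    "example-user", "example-token", "me@example.com", "example-api-key", "cortex-agent"
)
for argv in commands:
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(argv))
```

To actually run the commands you could pass each list to `subprocess.run(argv, check=True)`, taking care not to log real credentials.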
Security
The Cortex k8s agent uses a push model that ensures you do not need to expose your cluster to the public internet.
Additionally, the Helm chart comes with a predefined ClusterRole that grants the agent read-only RBAC permissions:
- Permissions:
["get", "watch", "list"]
- Resources:
["deployments", "services", "pods", "replicationcontrollers", "statefulsets", "rollouts", "cronjobs"]
- API groups:
["apps", "argoproj.io", "batch"]
Communication out of the cluster to Cortex happens over HTTPS. There is no inbound traffic to the agent.
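To make the read-only nature of this rule concrete, here is a small Python sketch that mirrors the values listed above and checks them. The dictionary layout and the helper names are illustrative assumptions; they are not the contents of the Helm chart.

```python
# Values copied from the list above; the dict shape is only an illustration.
AGENT_RULE = {
    "apiGroups": ["apps", "argoproj.io", "batch"],
    "resources": [
        "deployments", "services", "pods", "replicationcontrollers",
        "statefulsets", "rollouts", "cronjobs",
    ],
    "verbs": ["get", "watch", "list"],
}

READ_ONLY_VERBS = {"get", "list", "watch"}

def is_read_only(rule):
    """True if the rule grants nothing beyond get/list/watch."""
    return set(rule["verbs"]) <= READ_ONLY_VERBS

def allows(rule, verb, resource):
    """Simplified check on verb and resource only (API groups are ignored)."""
    return verb in rule["verbs"] and resource in rule["resources"]

print(is_read_only(AGENT_RULE))
print(allows(AGENT_RULE, "list", "deployments"))
print(allows(AGENT_RULE, "delete", "pods"))
```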
Configuration
By default, Cortex uses entity tags to map entities to Kubernetes resources. You can customize auto-mapping and annotation mapping in Kubernetes settings.
If you do not see the settings page you're looking for, you likely don't have the proper permissions and need to contact your admin.
K8s annotation mapping customization
Cortex maps Kubernetes deployments with a cortex.io/tag annotation to Cortex entities with the same tag.
You can customize how Cortex maps Kubernetes annotations in deployment metadata from the K8s annotation mapping customization section.
Let's say, for example, your deployment.yaml includes my.service as the cortex.io/tag:
metadata:
  name: my-name
  namespace: my-namespace
  annotations:
    cortex.io/tag: my.service
If this deployment should be mapped to a Cortex entity with the tag my-service, you can enter the following jq expression to convert all periods in the deployment annotation tag to dashes:
.metadata.annotations."cortex.io/tag" | gsub("\\."; "-")
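For readers less familiar with jq, the following Python sketch performs the same transformation on a dictionary shaped like the deployment metadata above. The function name is hypothetical and the code is only an equivalent illustration, not something Cortex runs.

```python
import re

def map_annotation_tag(deployment):
    """Read the cortex.io/tag annotation and replace every period with a dash."""
    tag = deployment["metadata"]["annotations"]["cortex.io/tag"]
    # Equivalent of jq's gsub("\\."; "-"): the period is escaped so it is
    # matched literally rather than as "any character".
    return re.sub(r"\.", "-", tag)

deployment = {
    "metadata": {
        "name": "my-name",
        "namespace": "my-namespace",
        "annotations": {"cortex.io/tag": "my.service"},
    }
}

print(map_annotation_tag(deployment))  # my-service
```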
K8s auto-mapping customization
You can override entity tag discovery and have Cortex discover Kubernetes resources using their metadata labels instead.
Under the K8s auto-mapping customization section in Kubernetes settings, you can specify a list of metadata label keys.
Once the list is saved, Cortex will discover all Kubernetes resources with metadata labels that include a key in the list where the value equals a Cortex entity tag.
For example, let's say you have two Cortex entities (example and entity) and the following Kubernetes JSON blob:
{
  "name": "Sample Kubernetes resource",
  "metadata": {
    "labels": {
      "app": "example",
      "another": "entity"
    }
  }
}
By default, example and entity will have no Kubernetes resource mappings. If the list of metadata labels is set to ["app"], then entity example will be associated with "Sample Kubernetes resource." If the list is set to ["app", "another"], then both example and entity will be associated with the resource.
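The matching rule described above can be sketched in a few lines of Python. This is a simplified model of the behavior as described here, with a hypothetical function name; it is not Cortex's implementation.

```python
def match_entities(resource, label_keys, entity_tags):
    """Return the entity tags that the resource maps to for the given label keys."""
    labels = resource.get("metadata", {}).get("labels", {})
    matched = set()
    for key in label_keys:
        value = labels.get(key)
        # A label counts only if its key is in the configured list
        # and its value equals a known entity tag.
        if value in entity_tags:
            matched.add(value)
    return matched

resource = {
    "name": "Sample Kubernetes resource",
    "metadata": {"labels": {"app": "example", "another": "entity"}},
}
entity_tags = {"example", "entity"}

print(match_entities(resource, [], entity_tags))
print(match_entities(resource, ["app"], entity_tags))
print(sorted(match_entities(resource, ["app", "another"], entity_tags)))
```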
Registration
Discovery
By default, Cortex will use the entity tag (e.g. my-entity) as the "best guess" for the Kubernetes resource name. For example, if your entity tag is my-entity, then the corresponding resource in Kubernetes should also be named my-entity.
If your Kubernetes resources don't cleanly match the Cortex entity tag, you can override this in the Cortex entity descriptor.
Entity descriptor
If you need to override automatic discovery, you can define one of the following blocks in your Cortex entity descriptor.
Cortex accepts several k8s resources, which can be on different clusters or of different types: Deployment, ArgoCD Rollout, StatefulSet, and CronJob.
All of these resource types have the same field definitions:
Field | Description | Required |
---|---|---|
identifier | namespace/name as found in Kubernetes | ✓ |
cluster | The name of the cluster, which is set when deploying the agent | |
Deployments
x-cortex-k8s:
  deployment:
    - identifier: namespace/name
      cluster: dev
    - identifier: experiment/scratch
      cluster: dev
    - identifier: default/cortex
      cluster: prod
ArgoCD Rollout
x-cortex-k8s:
  argorollout:
    - identifier: namespace/name
      cluster: dev
StatefulSet
x-cortex-k8s:
  statefulset:
    - identifier: namespace/name
      cluster: dev
CronJob
x-cortex-k8s:
  cronjob:
    - identifier: namespace/name
      cluster: dev
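Before committing a descriptor, it can help to sanity-check the x-cortex-k8s block. The sketch below assumes the block has already been loaded into a Python dictionary (for example with a YAML parser) and checks that each identifier has the namespace/name form. The K8sRef type and function name are hypothetical helpers, not part of Cortex.

```python
from typing import NamedTuple, Optional

class K8sRef(NamedTuple):
    kind: str
    namespace: str
    name: str
    cluster: Optional[str]

# The four keys shown in the blocks above.
SUPPORTED_KINDS = ("deployment", "argorollout", "statefulset", "cronjob")

def parse_k8s_block(block):
    """Turn an x-cortex-k8s mapping into a flat list of K8sRef entries."""
    refs = []
    for kind in SUPPORTED_KINDS:
        for entry in block.get(kind, []):
            identifier = entry.get("identifier") or ""
            namespace, sep, name = identifier.partition("/")
            # identifier is required and must be exactly namespace/name.
            if not sep or not namespace or not name or "/" in name:
                raise ValueError(
                    f"{kind}: identifier must be namespace/name, got {identifier!r}"
                )
            # cluster is optional, so a missing value becomes None.
            refs.append(K8sRef(kind, namespace, name, entry.get("cluster")))
    return refs

block = {
    "deployment": [
        {"identifier": "experiment/scratch", "cluster": "dev"},
        {"identifier": "default/cortex", "cluster": "prod"},
    ],
    "cronjob": [{"identifier": "default/nightly"}],
}

for ref in parse_k8s_block(block):
    print(ref)
```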
Annotation
You can link your Kubernetes deployment to a Cortex entity by adding an annotation to your k8s deployment metadata.
Use cortex.io/tag as the key and use the value of x-cortex-tag in the Cortex entity's cortex.yaml as the value.
For example, if the cortex.yaml file is:
openapi: 3.0.1
info:
  title: My Service
  x-cortex-tag: my-service
  x-cortex-type: service
  description: This is my cool service.
Then the deployment.yaml file should be configured as:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-name
  namespace: my-namespace
  annotations:
    cortex.io/tag: my-service
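A quick way to confirm that the two files agree is to compare the annotation with the entity tag. The sketch below assumes both files have already been parsed into dictionaries; the function name is hypothetical, and the check only mirrors the rule described above.

```python
def annotation_matches_entity(entity_descriptor, deployment):
    """True if the deployment's cortex.io/tag annotation equals x-cortex-tag."""
    entity_tag = entity_descriptor.get("info", {}).get("x-cortex-tag")
    annotations = deployment.get("metadata", {}).get("annotations") or {}
    return entity_tag is not None and annotations.get("cortex.io/tag") == entity_tag

entity_descriptor = {
    "openapi": "3.0.1",
    "info": {
        "title": "My Service",
        "x-cortex-tag": "my-service",
        "x-cortex-type": "service",
    },
}
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {
        "name": "my-name",
        "namespace": "my-namespace",
        "annotations": {"cortex.io/tag": "my-service"},
    },
}

print(annotation_matches_entity(entity_descriptor, deployment))  # True
```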
Expected Results
Entity pages
Kubernetes deployment data will be available in the Operations block on the details pages for entities imported from Kubernetes or linked to a k8s resource.
With the integration enabled, you'll also be able to find a Kubernetes block under the Operations tab. This includes deployments, clusters, active replicas, and pending deployments.
From the Kubernetes page in Integrations, you can find more detailed information about the linked resource:
- Replicas: Number of available, ready, and desired replicas.
- Containers: Resource containers, including requested memory, memory limit, and CPU data. Also includes the full container definition.
Scorecards and CQL
With the Kubernetes integration, you can create Scorecard rules and write CQL queries based on Kubernetes resources.
See more examples in the CQL Explorer in Cortex.
Cluster information
Data about k8s clusters associated with a given entity.
Definition: k8s.clusters()
Examples
You can use the k8s.clusters() expression in the Query Builder to find entities whose clusters all have names starting with "dev-":
k8s.clusters.all((cluster) => cluster.name.matches("""dev-.*"""))
Or entities with at least one cluster named "prod":
k8s.clusters.any((cluster) => cluster.name.matches("prod"))
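The difference between the two queries is the quantifier: all requires every cluster to match, while any requires at least one. The Python sketch below illustrates that difference. It assumes that matches tests the whole name against the pattern, which is an assumption about CQL and not something stated here.

```python
import re

def all_clusters_match(cluster_names, pattern):
    """Every cluster name must match the whole pattern."""
    # Note: Python's all() returns True for an empty list.
    return all(re.fullmatch(pattern, name) for name in cluster_names)

def any_cluster_matches(cluster_names, pattern):
    """At least one cluster name must match the whole pattern."""
    return any(re.fullmatch(pattern, name) for name in cluster_names)

print(all_clusters_match(["dev-east", "dev-west"], r"dev-.*"))
print(all_clusters_match(["dev-east", "prod"], r"dev-.*"))
print(any_cluster_matches(["dev-east", "prod"], r"prod"))
```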
Deployment labels
Checks deployment metadata.
Definition: k8s.metadata()
Examples
You can use this expression in a production readiness Scorecard to check ownership:
k8s.metadata().labels.any((label) => label.get("ownership") == "ownership_team")
This rule checks an entity's k8s metadata labels for an "ownership" label and passes if its value is "ownership_team".
You can also use this expression in the Query Builder to find all k8s deployments with the label "environment":
k8s.metadata().labels.all((label) => label.containsKey("environment")) == true
Or you could refine the query further to find k8s deployments with an "environment" label and that are in production:
k8s.metadata().labels.all((label) => label.get("environment")?.matches("prod")) == true
K8s resource is set for entity
Checks whether a k8s resource of any type is associated with an entity.
Definition: k8s != null
Example
For a Scorecard focused on automation or development maturity, you can set a rule to make sure a k8s resource is mapped:
k8s != null
Kubernetes spec YAML
The Cortex k8s agent periodically sends the raw spec definitions for all entities. The spec JSON is equivalent to the root spec field of the entity descriptor (deployments, StatefulSet, etc.) and fully conforms to that format.
You can find the official documentation for these resource objects in the Kubernetes Workload Docs.
You can use this list of JSON specs combined with jq or Open Policy Agent (OPA) language to write complex assertions such as "all resources must have specific annotations set" or "all containers should have a CPU resource limit defined."
The list of JSON specs can also be filtered to only those in a specific cluster by specifying the cluster name: k8s.spec("prod").
Definition: k8s.spec()
Examples
You can use this expression to write a wide range of rules. For a best practices Scorecard, you can make sure that resource definitions have set CPU requests:
jq(k8s.spec(), ".[].template.spec.containers[].resources.requests.cpu") != null
Or that all resource definitions expose only TCP ports:
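As a rough picture of what these two rules look for, here is a plain-Python sketch over a list of specs. It is not CQL or jq and not Cortex's rule syntax. It follows the same template.spec.containers path as the jq expression above, and it treats a port with no protocol as TCP, which is the Kubernetes default. The function names and sample data are hypothetical.

```python
def containers(specs):
    """Yield every container found under template.spec.containers."""
    for spec in specs:
        pod_spec = spec.get("template", {}).get("spec", {})
        yield from pod_spec.get("containers", [])

def all_have_cpu_requests(specs):
    """True if every container sets resources.requests.cpu."""
    return all(
        c.get("resources", {}).get("requests", {}).get("cpu") is not None
        for c in containers(specs)
    )

def only_tcp_ports(specs):
    """True if every declared container port uses TCP (the default protocol)."""
    return all(
        port.get("protocol", "TCP") == "TCP"
        for c in containers(specs)
        for port in c.get("ports", [])
    )

specs = [
    {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "api",
                        "resources": {"requests": {"cpu": "250m"}},
                        "ports": [{"containerPort": 8080}],
                    }
                ]
            }
        }
    }
]

print(all_have_cpu_requests(specs))
print(only_tcp_ports(specs))
```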
Replica information
Number of replicas available, current, desired, ready, unavailable, or updated.
Definition: k8s.replicas()
Example
You can use this expression in a development maturity Scorecard to make sure an entity has at least two available instances:
k8s.replicas().numAvailable >= 2
Background sync
The Cortex k8s agent is essentially a simple cron job that runs every 5 minutes by default.
Limitations
FAQs and troubleshooting
When I try to import entities, I don't see all the supported workload types (deployments, ArgoCD rollout, StatefulSet, CronJob)
Make sure that the types you expect to see exist in the cluster you are importing from.
Missing namespaces from Kubernetes discovery
If you're using Cortex's k8s agent to import entities into Cortex but don't see all expected namespaces during the import process, make sure app.namespace is commented out in values.yaml:
app:
  # baseURL:
  baseURL:
  keySecret:
  # namespace: exampleNamespace
If app.namespace is defined, the Cortex k8s agent will only be able to discover services from that namespace. This behavior can be confirmed with a backend log similar to:
INFO 1 --- [ scheduling-1] k8sagent : Looking for stateful sets in namespace <app.namespace>
Once app.namespace is commented out, restart your pods. You will then be able to see all expected namespaces when importing new services.
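The effect of this setting can be modeled as a simple filter. The sketch below is only an illustration of the behavior described above, with hypothetical names and sample data; it is not the agent's code.

```python
def visible_resources(resources, app_namespace=None):
    """Return the resources the agent would discover for a given app.namespace."""
    if app_namespace is None:
        # Setting commented out: every namespace is visible.
        return list(resources)
    # Setting defined: discovery is limited to that single namespace.
    return [r for r in resources if r["namespace"] == app_namespace]

resources = [
    {"namespace": "payments", "name": "api"},
    {"namespace": "search", "name": "indexer"},
]

print(len(visible_resources(resources)))
print(visible_resources(resources, app_namespace="payments"))
```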
Helm chart and deprecated Kubernetes Docker registry
If your Cortex k8s agent is blocked after an upgrade because the docker.pkg.github.com Docker registry has been deprecated, you can make these edits directly, using the same credentials:
- Pull the image from ghcr.io instead of docker.pkg.github.com:
image: ghcr.io/cortexapps/k8s-agent...
- Update the registry secret, setting the server to https://ghcr.io.
If you are unable to make these changes, please reach out to help@cortex.io and request a new Helm chart with this change already reflected.
Failing ArgoCD rollouts error in the k8s agent
While the self-hosted Kubernetes agent is otherwise running successfully, you may see the following ArgoCD rollouts error in its logs even though you do not use ArgoCD:
Error polling argocd rollouts from Kubernetes API
io.kubernetes.client.openapi.ApiException:
[...]
at com.brainera.k8sSDKClient.getArgoRollouts(k8sClient.kt:101) ~[app:/na]
Cortex logs this exception for verbosity. The error is harmless if you are not using ArgoCD.
Can I deploy on prem if I don’t use Kubernetes?
Yes - the Cortex Helm chart deploys two Cortex-specific pods from images for the frontend and backend, as well as a data store. You can use these images to run Docker containers on other platforms, such as ECS.
Still need help?
The following are all the ways to get assistance from our customer engineering team. Please use the option that is best for your users:
- Email: help@cortex.io, or open a support ticket in the in-app Resource Center
- Chat: Available in the Resource Center
- Slack: Users with a connected Slack channel will have a workflow added to their account. From here, you can either @CortexTechnicalSupport or add a :ticket: reaction to a question in Slack, and the team will respond directly.
Don’t have a Slack channel? Talk with your customer success manager.