Health Checks

Health checks emit two events:

  • check.passed
  • check.failed
notification.yaml
apiVersion: mission-control.flanksource.com/v1
kind: Notification
metadata:
  name: api-http-fail-alert
  namespace: default
spec:
  events:
    - check.failed
  filter: check.type == 'http'
  title: API HTTP Check {{.check.name}} failing
  body: |
    ## Check Failed
    Error: {{.status.error}}
    Failed at {{.status.created_at}}
  to:
    email: alerts@acme.com
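
To see how the `title` and `body` fields above resolve, here is a minimal sketch that renders them with Go's standard `text/template` package. It assumes the notification templates follow plain Go template syntax. The `render` and `sampleVars` helpers and all sample values are hypothetical and are not part of Mission Control.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Title and body copied from the notification.yaml example.
const (
	titleTmpl = "API HTTP Check {{.check.name}} failing"
	bodyTmpl  = "## Check Failed\nError: {{.status.error}}\nFailed at {{.status.created_at}}\n"
)

// render executes a Go template string against a map of variables.
func render(tmpl string, vars map[string]any) (string, error) {
	t, err := template.New("notification").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, vars); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// sampleVars returns hypothetical event data shaped like the
// check and status template variables of a check.failed event.
func sampleVars() map[string]any {
	return map[string]any{
		"check": map[string]any{"name": "api-health", "type": "http"},
		"status": map[string]any{
			"error":      "connection refused",
			"created_at": "2024-01-01T00:00:00Z",
		},
	}
}

func main() {
	for _, tmpl := range []string{titleTmpl, bodyTmpl} {
		out, err := render(tmpl, sampleVars())
		if err != nil {
			panic(err)
		}
		fmt.Println(out)
	}
}
```

The sketch only covers variable substitution. Helper functions used by the default templates, such as `slackSectionTextMD`, are supplied by Mission Control and are not available in plain `text/template`.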

Default Templates

check.passed

Title

{{ if ne .channel "slack"}}Check {{.check.name}} has passed{{end}}

Template

{{ if eq .channel "slack"}}
{
  "blocks": [
    {{slackSectionTextMD (printf `:large_green_circle: *%s* is _healthy_` .canary.name)}},
    {"type": "divider"},
    {{ if .status.message}}{{slackSectionTextMD .status.message}},{{end}}
    {
      "type": "section",
      "fields": [
        {{slackSectionTextFieldMD (printf `*Canary*: %s` .canary.name) }},
        {{slackSectionTextFieldMD (printf `*Namespace*: %s` .canary.namespace) }}
        {{if ne .agent.name "local"}}
        ,{{slackSectionTextFieldMD (printf `*Agent*: %s` .agent.name) }}
        {{end}}
      ]
    },
    {{ if .check.labels}}{{slackSectionLabels .check}},{{end}}
    {{ slackURLAction "View Health Check" .permalink "🔕 Silence" .silenceURL}}
  ]
}
{{ else }}
Canary: {{.canary.name}}
{{if .agent}}Agent: {{.agent.name}}{{end}}
{{if .status.message}}Message: {{.status.message}} {{end}}
{{labelsFormat .check.labels}}

[Reference]({{.permalink}})
{{end}}

check.failed

Title

{{ if ne .channel "slack"}}Check {{.check.name}} has failed{{end}}

Template

{{ if eq .channel "slack"}}
{
  "blocks": [
    {{slackSectionTextMD (printf `:red_circle: *%s* is _unhealthy_` .check.name)}},
    {"type": "divider"},
    {{ if .status.error}}{{slackSectionTextMD .status.error}},{{end}}
    {
      "type": "section",
      "fields": [
        {{slackSectionTextFieldMD (printf `*Canary*: %s` .canary.name) }},
        {{slackSectionTextFieldMD (printf `*Namespace*: %s` .canary.namespace) }}
        {{if ne .agent.name "local"}}
        ,{{slackSectionTextFieldMD (printf `*Agent*: %s` .agent.name) }}
        {{end}}
      ]
    },
    {{ if .check.labels}}{{slackSectionLabels .check}},{{end}}
    {{ slackURLAction "View Health Check" .permalink "🔕 Silence" .silenceURL}}
  ]
}
{{ else }}
Canary: {{.canary.name}}
{{if .agent}}Agent: {{.agent.name}}{{end}}
Error: {{.status.error}}
{{labelsFormat .check.labels}}

[Reference]({{.permalink}})
{{end}}
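
Both default titles are wrapped in a channel test, so they render as an empty string for Slack (where the body carries the headline block) and as plain text for every other channel. The sketch below reproduces that behavior with Go's `text/template`. It assumes the channel is exposed to the template as the variable `.channel`. The `renderTitle` helper and the sample check name are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Default check.failed title, with the channel read as a template variable.
const titleTmpl = `{{ if ne .channel "slack"}}Check {{.check.name}} has failed{{end}}`

// renderTitle renders the title for the given channel and check name.
func renderTitle(channel, checkName string) (string, error) {
	t, err := template.New("title").Parse(titleTmpl)
	if err != nil {
		return "", err
	}
	vars := map[string]any{
		"channel": channel,
		"check":   map[string]any{"name": checkName},
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, vars); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	for _, channel := range []string{"slack", "email"} {
		title, err := renderTitle(channel, "api-health")
		if err != nil {
			panic(err)
		}
		// Quote the result so the empty Slack title is visible.
		fmt.Printf("%s: %q\n", channel, title)
	}
}
```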

Template Variables

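| Field | Description | Scheme |
| --- | --- | --- |
| agent | Details of the agent that created the config. | Agent |
| canary | The canary that the check belongs to | Canary |
| check | The health check that triggered the event | Check |
| permalink | Link to the Catalog in Mission Control | string |
| status | The status of the check | CheckStatus |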

Agent

| Field | Description | Scheme |
| --- | --- | --- |
| description | Short description of the agent | string |
| id | The id of the agent | uuid |
| name | The name of the agent | string |

Canary

| Field | Description | Scheme |
| --- | --- | --- |
| created_at | When the canary was created | string |
| deleted_at | When the canary was deleted | string |
| id | The id of the canary | uuid |
| labels | The labels of the canary | map[string]string |
| name | The name of the canary | string |
| namespace | The namespace of the canary | string |
| source | The source of the canary | string |
| updated_at | When the canary was last updated | string |

Check

| Field | Description | Scheme |
| --- | --- | --- |
| created_at | When the check was created | time.Time |
| deleted_at | When the check was deleted | time.Time |
| description | The description of the check | string |
| id | The id of the check | uuid |
| labels | The labels of the check | map[string]string |
| last_runtime | When the check last ran | time.Time |
| last_transition_time | The time of the check's last transition | time.Time |
| latency | Latency summary for the past hour | Latency |
| name | The name of the check | string |
| next_runtime | When the check is next due to run | time.Time |
| severity | The severity of the check | string |
| status | Check status details | string |
| transformed | Whether the check has been transformed | boolean |
| type | The type of the check | string |
| updated_at | When the check was last updated | time.Time |
| uptime | Uptime summary for the past hour | Uptime |

CheckStatus

| Field | Description | Scheme |
| --- | --- | --- |
| check_id | The id of the check associated with this status | uuid |
| created_at | When the check status was created | time.Time |
| duration | The duration of the check | integer |
| error | The error of the check in case of failure | string |
| invalid | Whether the check errored out | boolean |
| message | The success message of the check | string |
| status | The status of the check | boolean |
| time | The time of the check | string |

Uptime

| Field | Description | Scheme |
| --- | --- | --- |
| failed | The number of checks that failed | integer |
| last_fail | The last time a check failed | time.Time |
| last_pass | The last time a check passed | time.Time |
| p100 | The percentage of checks that passed | float64 |
| passed | The number of checks that passed | integer |
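
The `passed` and `failed` counters relate to the pass percentage in the obvious way. The sketch below shows that arithmetic in Go. It is an assumption that `p100` is derived like this, since the source does not say how Mission Control computes it. The `Uptime` struct and `PassPercentage` method are hypothetical names, and returning 0 when no checks ran is a choice made for this example.

```go
package main

import "fmt"

// Uptime mirrors the passed and failed counters from the Uptime table.
// This is an illustrative type, not Mission Control's own.
type Uptime struct {
	Passed int
	Failed int
}

// PassPercentage returns the share of passed checks as a percentage.
// It returns 0 when no checks have run, to avoid dividing by zero.
func (u Uptime) PassPercentage() float64 {
	total := u.Passed + u.Failed
	if total == 0 {
		return 0
	}
	return 100 * float64(u.Passed) / float64(total)
}

func main() {
	u := Uptime{Passed: 3, Failed: 1}
	fmt.Printf("%.1f%% of checks passed\n", u.PassPercentage())
}
```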

Latency

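| Field | Description | Scheme |
| --- | --- | --- |
| p95 | The 95th percentile latency of the check | float64 |
| p97 | The 97th percentile latency of the check | float64 |
| p99 | The 99th percentile latency of the check | float64 |
| rolling1h | The rolling one-hour latency of the check | float64 |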
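
For readers unfamiliar with percentile fields such as `p95`, the sketch below computes a percentile with the nearest-rank method in Go. The source does not state how Mission Control calculates these values, so this is one common definition rather than its implementation. The `Percentile` function and the sample latencies are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Percentile returns the nearest-rank p-th percentile (0 < p <= 100)
// of samples. It returns 0 for an empty slice.
func Percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	// Copy before sorting so the caller's slice is not reordered.
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)

	// Nearest rank: the smallest rank covering at least p percent of samples.
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Hypothetical check latencies; the unit is not specified by the source.
	latencies := []float64{120, 80, 95, 300, 110, 105, 90, 100, 85, 115}
	fmt.Println("p95:", Percentile(latencies, 95))
	fmt.Println("p50:", Percentile(latencies, 50))
}
```

A single slow run (300 here) dominates the high percentiles while leaving the median untouched, which is why the high percentiles are useful for spotting tail latency.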