Service Level Objectives

Per-surface availability targets, error budgets, and incident response.

This page documents the availability and latency targets that the KuberCoin operations working group commits to for each public surface. These are objectives, not contracts: the project is community-operated, and the SLOs guide error-budget decisions rather than create legal liability.

Per-surface targets

Surface	Availability	Latency p95	Window
`www` (marketing)	99.9%	< 500 ms	30 days rolling
`docs`	99.9%	< 500 ms	30 days rolling
`explorer`	99.5%	< 1000 ms	30 days rolling
`wallet`	99.5%	< 1000 ms	30 days rolling
`rpc`	99.5%	< 1000 ms	30 days rolling
`open` (REST API)	99.5%	< 1000 ms	30 days rolling

Error budgets

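An error budget is the complement of the availability target over the measurement window. A 99.5% target over 30 days (720 hours) yields a budget of 3h 36m; a 99.9% target yields 43m 12s. Burn rate is the observed error rate divided by the error rate the target allows, so a sustained burn rate of 1.0 consumes exactly the full budget over the 30-day window. When more than 50% of a surface's 30-day budget is consumed in the first half of the window, the on-call rotation freezes non-urgent deploys for that surface until the burn rate drops below 1.0.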
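The budget arithmetic above can be checked with a short Python sketch. The function names `error_budget` and `freeze_triggered` are hypothetical and not part of any project tooling, and reading "first half of the window" as "elapsed time is at most half the window" is an assumption.

```python
from datetime import timedelta

WINDOW = timedelta(days=30)  # rolling window from the per-surface targets table


def error_budget(target: float, window: timedelta = WINDOW) -> timedelta:
    """Allowed unavailability for an availability target, rounded to whole seconds."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be a fraction strictly between 0 and 1")
    return timedelta(seconds=round(window.total_seconds() * (1.0 - target)))


def freeze_triggered(consumed: timedelta, budget: timedelta,
                     elapsed: timedelta, window: timedelta = WINDOW) -> bool:
    """True when over 50% of the budget is gone within the first half of the window.

    Assumption: "first half" means elapsed <= window / 2.
    """
    return consumed > budget / 2 and elapsed <= window / 2


if __name__ == "__main__":
    print(error_budget(0.995))  # 3:36:00
    print(error_budget(0.999))  # 0:43:12
```

For example, a 99.5% surface that has lost 2 hours of availability 10 days into the window has consumed more than half of its 3h 36m budget, so `freeze_triggered` returns `True` under these assumptions.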

Measurement

Availability is computed from the `http_requests_total` Prometheus counter as `1 - sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))`. The expression is evaluated every 30 s, and alerts fire on a multi-window burn-rate ladder with 2-, 5-, 30-, and 60-minute windows. Latency is read from the `http_request_duration_seconds` histogram via `histogram_quantile(0.95, ...)`.
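To make the relationship between the availability expression and burn rate concrete, here is a minimal Python sketch that mirrors the same ratio on plain request counts. It does not query Prometheus; the function names are hypothetical, and treating a window with no traffic as fully available is an assumption rather than documented behaviour.

```python
def availability(total_requests: float, error_requests: float) -> float:
    """1 - (5xx requests / all requests), the same ratio as the PromQL expression."""
    if total_requests <= 0:
        # Assumption: a window with no traffic counts as fully available.
        return 1.0
    return 1.0 - error_requests / total_requests


def burn_rate(observed_availability: float, target: float) -> float:
    """Observed error rate divided by the error rate the target allows."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be a fraction strictly between 0 and 1")
    return (1.0 - observed_availability) / (1.0 - target)


if __name__ == "__main__":
    # 72 server errors out of 1000 requests, measured against a 99.5% target.
    observed = availability(1000, 72)
    print(round(burn_rate(observed, 0.995), 1))  # 14.4
```

A burn rate of 1.0 means errors arrive exactly as fast as the target permits; values above 1.0 mean the budget will run out before the window ends.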

Incident response

SEV-1. Surface unavailable for > 5 minutes, or correctness regression. Page on-call. Public status update within 15 minutes.
SEV-2. Burn rate > 14.4 over 1h (consumes 2% of the 30-day budget per hour and would exhaust it in under 50 hours, roughly 2 days). Open incident channel, no page.
SEV-3. Sustained burn rate > 1.0 over 6h. File issue, fix in next sprint.
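The severity thresholds can be expressed as a small decision function. This is a sketch only: `classify` and `time_to_exhaustion` are hypothetical names, the inputs are assumed to be computed elsewhere, and checking the severities in order from SEV-1 down is an assumption about precedence.

```python
from datetime import timedelta
from typing import Optional

WINDOW = timedelta(days=30)  # rolling window from the per-surface targets table


def time_to_exhaustion(burn_rate: float, window: timedelta = WINDOW) -> timedelta:
    """Time until a full budget is spent if the given burn rate is sustained."""
    if burn_rate <= 0:
        raise ValueError("burn rate must be positive")
    return window / burn_rate


def classify(burn_1h: float, burn_6h: float,
             unavailable_minutes: float = 0.0,
             correctness_regression: bool = False) -> Optional[str]:
    """Map measurements to a severity, checking the most severe rule first."""
    if unavailable_minutes > 5 or correctness_regression:
        return "SEV-1"  # page on-call, status update within 15 minutes
    if burn_1h > 14.4:
        return "SEV-2"  # open incident channel, no page
    if burn_6h > 1.0:
        return "SEV-3"  # file issue, fix in next sprint
    return None  # within budget, no incident


if __name__ == "__main__":
    print(classify(burn_1h=20.0, burn_6h=4.0))
    print(time_to_exhaustion(14.4))
```

At the SEV-2 threshold of 14.4, `time_to_exhaustion` gives 30 days / 14.4, which is about 50 hours.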

Post-mortems for SEV-1 and SEV-2 incidents are published in /incidents/ within seven days, blame-free, with action items tracked publicly.