Service Level Objectives
Per-surface availability targets, error budgets, and incident response.
This page documents the availability and latency targets that the KuberCoin operations working group commits to for each public surface. These are objectives, not contracts: the project is community-operated, and the SLOs guide error-budget decisions rather than create legal liability.
Per-surface targets
| Surface | Availability | Latency p95 | Window |
|---|---|---|---|
| www (marketing) | 99.9% | < 500 ms | 30 days rolling |
| docs | 99.9% | < 500 ms | 30 days rolling |
| explorer | 99.5% | < 1000 ms | 30 days rolling |
| wallet | 99.5% | < 1000 ms | 30 days rolling |
| rpc | 99.5% | < 1000 ms | 30 days rolling |
| open (REST API) | 99.5% | < 1000 ms | 30 days rolling |
Error budgets
An error budget is the complement of the availability target applied to the measurement window, that is, (1 - target) × window. A 99.5% target over 30 days yields a 3h 36m budget; a 99.9% target yields 43m 12s. When more than 50% of a surface's monthly budget is consumed in the first half of the window, the on-call rotation freezes non-urgent deploys for that surface until the burn rate drops below 1.0. A burn rate of 1.0 means the budget is being spent at exactly the pace that would exhaust it at the end of the window.
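The budget arithmetic above can be reproduced with a short sketch. The function names `error_budget` and `should_freeze` are hypothetical and not part of any KuberCoin tooling, and the freeze check assumes that budget consumption and window progress are both supplied as fractions between 0 and 1.

```python
from datetime import timedelta


def error_budget(target_pct: float, window_days: int = 30) -> timedelta:
    """Allowed unavailability for an availability target over the window."""
    window_seconds = timedelta(days=window_days).total_seconds()
    # Budget = (100% - target) of the window, rounded to whole seconds.
    return timedelta(seconds=round(window_seconds * (100 - target_pct) / 100))


def should_freeze(consumed_fraction: float, elapsed_fraction: float) -> bool:
    """Freeze rule: more than half the budget spent in the first half of the window.

    Both arguments are assumed to be fractions in [0, 1].
    """
    return consumed_fraction > 0.5 and elapsed_fraction <= 0.5


print(error_budget(99.5))  # 3:36:00
print(error_budget(99.9))  # 0:43:12
```

For example, `should_freeze(0.6, 0.4)` is true (60% of the budget gone 40% of the way through the window), while `should_freeze(0.6, 0.7)` is false because the first half of the window has already passed.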
Measurement
Availability is computed from the http_requests_total Prometheus counter as 1 - sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])), evaluated every 30s and alerted on a 2/5/30/60-minute multi-window burn-rate ladder.
Latency is read from the http_request_duration_seconds histogram via histogram_quantile(0.95, ...).
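As a rough illustration of how the availability ratio relates to a burn rate, the sketch below mirrors the PromQL expression on plain request counts. It assumes the conventional definition of burn rate (observed error ratio divided by the error ratio the target allows), which this page does not spell out. The function names are hypothetical, and treating a window with no traffic as fully available is also an assumption.

```python
def availability(total_requests: float, error_requests: float) -> float:
    """1 - (5xx requests / all requests), as in the PromQL expression."""
    if total_requests <= 0:
        # Assumption: no traffic in the window counts as fully available.
        return 1.0
    return 1.0 - error_requests / total_requests


def burn_rate(observed_availability: float, target_pct: float) -> float:
    """Observed error ratio divided by the error ratio the target allows."""
    allowed_error_ratio = (100 - target_pct) / 100
    if allowed_error_ratio <= 0:
        raise ValueError("a 100% target leaves no error budget to burn")
    return (1.0 - observed_availability) / allowed_error_ratio


# 50 failed requests out of 10,000 against a 99.5% target
a = availability(10_000, 50)
print(round(a, 4), round(burn_rate(a, 99.5), 2))
```

Under these assumptions, a surface with a 99.5% target that is serving 1% errors burns its budget at roughly twice the sustainable pace.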
Incident response
- SEV-1. Surface unavailable for > 5 minutes, or correctness regression. Page on-call. Public status update within 15 minutes.
- SEV-2. Burn rate > 14.4 over 1h (will exhaust the 30-day budget in about 2 days, roughly 50 hours). Open incident channel, no page.
- SEV-3. Sustained burn rate > 1.0 over 6h. File issue, fix in next sprint.
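The severity thresholds in the list above could be encoded along these lines. This is a minimal sketch, not the project's alerting configuration: `classify` and `days_to_exhaustion` are hypothetical names, and the inputs (outage duration, a correctness flag, and burn rates measured over 1h and 6h) are assumed to come from whatever monitoring is in place.

```python
from typing import Optional

WINDOW_DAYS = 30  # rolling window from the per-surface targets table


def classify(
    outage_minutes: float = 0.0,
    correctness_regression: bool = False,
    burn_1h: float = 0.0,
    burn_6h: float = 0.0,
) -> Optional[str]:
    """Return the highest matching severity, or None if nothing matches."""
    if outage_minutes > 5 or correctness_regression:
        return "SEV-1"
    if burn_1h > 14.4:
        return "SEV-2"
    if burn_6h > 1.0:
        return "SEV-3"
    return None


def days_to_exhaustion(rate: float) -> float:
    """Days until a full budget is spent at a constant burn rate."""
    return WINDOW_DAYS / rate


print(classify(burn_1h=15.0))              # SEV-2
print(round(days_to_exhaustion(14.4), 2))  # 2.08
```

Checking the conditions from most to least severe means an incident that satisfies several rules is reported at the highest applicable level.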
Post-mortems for SEV-1 and SEV-2 incidents are published in /incidents/ within seven days, blame-free, with action items tracked publicly.