Service Level Objectives (SLOs)

A practical guide to monitoring what matters to your users — without drowning in alerts.

What is an SLO?

A Service Level Objective is a target you set for how reliable a service should be, measured over a window of time.

Example: "My API should be available 99.9% of the time, measured over the last 30 days."

That single sentence contains three parts that drive everything else:

  • What you measure ("availability")
  • How well it must perform ("99.9%")
  • Over how long ("rolling 30 days")

SLOs convert vague reliability goals like "keep it up" into concrete numbers. They tell you when to relax ("we're at 99.95%, above target, all good") and when to focus on stability instead of features ("we've burned the entire monthly budget: feature freeze").

If you've never set an SLO before, skip ahead to Getting started — the rest of this page is reference material you can return to.


Three letters that get confused

Term | What it is | Example
SLI (Service Level Indicator) | The raw metric you measure | "% of HTTP checks returning 2xx within 30 seconds"
SLO (Service Level Objective) | The internal target you set on an SLI | "availability ≥ 99.9% over rolling 30 days"
SLA (Service Level Agreement) | A contractual promise to a customer, with a financial penalty if missed | "99.5% monthly uptime, otherwise 10% account credit"

The relationship: SLI is what you observe → SLO is what you commit to internally → SLA is what you promise customers.

In practice the SLO target is stricter than the SLA target. If your customer SLA is 99.5% monthly, your internal SLO might be 99.9% rolling 30-day. The gap is your buffer — you have time to fix issues before missing the contractual SLA.
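
To put a number on that buffer, the sketch below compares the downtime each target allows over the same period. The function name is illustrative and not part of Enori, and treating the SLA month as 30 days is a simplifying assumption.

```python
def allowed_downtime_minutes(target_percent: float, window_days: int) -> float:
    """Minutes of downtime a target permits over a window of the given length."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100.0 - target_percent) / 100.0


# Assumption: the SLA month is approximated as 30 days.
sla_minutes = allowed_downtime_minutes(99.5, 30)  # contractual limit
slo_minutes = allowed_downtime_minutes(99.9, 30)  # internal target
buffer_minutes = sla_minutes - slo_minutes

print(f"SLA allows {sla_minutes:.1f} min, SLO allows {slo_minutes:.1f} min")
print(f"Buffer before the SLA is at risk: {buffer_minutes:.1f} min")
```

With these targets the SLA tolerates about 216 minutes of downtime and the SLO about 43, so exhausting the internal budget still leaves close to three hours before the contract is breached.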


How Enori SLOs work

Each SLO you create on a monitor tracks one signal continuously. Enori does three things in the background:

  1. Measures the chosen signal (uptime, latency, or content match) on every check
  2. Computes how much "error budget" you have left in the current window
  3. Alerts you if you're burning the budget faster than you can afford

You can create multiple SLOs per monitor. Common setup for a production API:

Purpose | SLI | Target | Window
Engineering health | Availability | 99.9% | rolling 30 days
User experience | Latency (<500ms) | 95% | rolling 28 days
Customer SLA | Availability | 99.5% | calendar monthly

The first two drive day-to-day decisions. The third generates customer-facing reports.


The three SLI types

Enori supports three indicators today. Pick the one that matches what your users actually feel.

Availability

% of checks that succeeded.

A check succeeds when it returns a 2xx status code, completes within the timeout, and passes any keyword/SSL/content rules you've configured.

Use it for: general health monitoring. "Is my service up?" This is the default and the right starting point for most teams.
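
The success rule above can be sketched as a predicate plus a ratio. This is a minimal illustration, not Enori's implementation: the `CheckResult` fields, the 30,000 ms timeout, and the inclusive timeout boundary are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    status_code: int    # HTTP status returned by the check
    duration_ms: float  # total time the check took
    rules_passed: bool  # True if keyword/SSL/content rules all passed


def is_success(check: CheckResult, timeout_ms: float) -> bool:
    """All three conditions must hold for the check to count as good."""
    return (
        200 <= check.status_code <= 299
        and check.duration_ms <= timeout_ms  # assumption: boundary is inclusive
        and check.rules_passed
    )


def availability_percent(checks: list[CheckResult], timeout_ms: float) -> float:
    """Share of checks that succeeded, as a percentage."""
    if not checks:
        raise ValueError("no checks recorded yet")
    good = sum(1 for c in checks if is_success(c, timeout_ms))
    return 100.0 * good / len(checks)


checks = [
    CheckResult(200, 180.0, True),
    CheckResult(204, 95.0, True),
    CheckResult(503, 40.0, True),     # fails: non-2xx status
    CheckResult(200, 31000.0, True),  # fails: exceeded the timeout
]
print(availability_percent(checks, timeout_ms=30000))
```

Two of the four sample checks pass, so the SLI for this tiny sample is 50%.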

Latency

% of checks completed under a threshold (e.g. 500ms).

You set the threshold in milliseconds. A check is "good" if its time-to-first-byte is at or below that threshold; otherwise it's "bad" — even if the request succeeded.

Use it for: UX-sensitive services where a slow response is as bad as a failure. Checkout pages, search APIs, dashboards.
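
As a rough sketch of the good/bad rule for latency, the function below counts time-to-first-byte samples at or below the threshold. The function name and the sample values are made up for illustration.

```python
def latency_sli_percent(ttfb_ms: list[float], threshold_ms: float) -> float:
    """Percentage of checks whose time-to-first-byte is at or below the threshold."""
    if not ttfb_ms:
        raise ValueError("no checks recorded yet")
    good = sum(1 for t in ttfb_ms if t <= threshold_ms)
    return 100.0 * good / len(ttfb_ms)


# Hypothetical samples: 500 ms counts as good, 501 ms does not.
samples = [120.0, 480.0, 500.0, 501.0, 900.0]
sli = latency_sli_percent(samples, threshold_ms=500)
print(f"{sli:.1f}% under threshold, target met: {sli >= 95.0}")
```

Note that the status code never enters the calculation: a slow 200 response counts against a latency SLO just as a failure would.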

Content match

% of checks where the configured keyword/regex matched in the response body.

You configure the keyword on the monitor itself. The SLO tracks how often that keyword shows up.

Use it for: verifying business logic stays correct. "Does the homepage still show 'Welcome'?" / "Is the API returning the expected JSON shape?"
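
For the "expected JSON shape" case, a regex is usually more robust than a plain keyword. The sketch below assumes the pattern may match anywhere in the body (search semantics); how Enori applies the pattern is not specified here, and the pattern and sample bodies are hypothetical.

```python
import re


def content_match_percent(bodies: list[str], pattern: str) -> float:
    """Percentage of response bodies in which the pattern matches somewhere."""
    if not bodies:
        raise ValueError("no checks recorded yet")
    compiled = re.compile(pattern)
    matched = sum(1 for body in bodies if compiled.search(body))
    return 100.0 * matched / len(bodies)


bodies = [
    '{"status": "ok", "items": []}',
    '{"status":"ok","items":[1, 2]}',
    '{"error": "upstream timeout"}',
    "<html>Welcome</html>",
]
# \s* tolerates both compact and pretty-printed JSON.
pattern = r'"status":\s*"ok"'
print(content_match_percent(bodies, pattern))
```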


Window types: rolling vs calendar

The window is the period over which the SLO is computed. Two flavors:

Rolling window

"The last N days, always."

A rolling window slides every minute. "Rolling 30 days" always means "now − 30 days → now." Old incidents drop off the back as time moves forward.

Why use it: continuously-current view of reliability. Engineers prefer this because it shows the actual recent state of the service. Default for most teams.

Available lengths: 7, 28, 30, 90, or 365 days.
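
A minimal sketch of how a rolling window's boundaries move, assuming timestamps are timezone-aware. The helper names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

ALLOWED_ROLLING_DAYS = (7, 28, 30, 90, 365)


def rolling_window(now: datetime, days: int) -> tuple[datetime, datetime]:
    """Return (start, end) for a window covering now - days up to now."""
    if days not in ALLOWED_ROLLING_DAYS:
        raise ValueError(f"unsupported rolling length: {days}")
    return now - timedelta(days=days), now


def in_window(event_time: datetime, now: datetime, days: int) -> bool:
    """True if the event still counts toward the window evaluated at `now`."""
    start, end = rolling_window(now, days)
    return start <= event_time <= end


now = datetime(2026, 5, 9, 12, 0, tzinfo=timezone.utc)
incident = datetime(2026, 4, 9, 11, 59, tzinfo=timezone.utc)

# Five minutes ago the incident was still inside the 30-day window.
print(in_window(incident, now - timedelta(minutes=5), 30))
# Now it has dropped off the back.
print(in_window(incident, now, 30))
```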

Calendar window

"The current month / quarter / year."

A calendar window is anchored to clock boundaries. "Calendar monthly" runs from the 1st of the month at 00:00 UTC to the 1st of the next month. The window resets cleanly on each boundary.

Why use it: SLA contracts. Customer agreements typically read "99.5% per calendar month", not "99.5% rolling 30 days". Calendar windows make SLA reports straightforward — there's a clean before/after on each boundary.

Available cycles: weekly (Monday-aligned), monthly, quarterly, yearly.
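
The boundaries for each cycle can be sketched as follows. The monthly 00:00 UTC boundary and Monday alignment come from the text above; applying UTC midnight to the weekly, quarterly, and yearly cycles is an assumption, as is the function name.

```python
from datetime import datetime, timedelta, timezone


def calendar_window(now: datetime, cycle: str) -> tuple[datetime, datetime]:
    """Return (start, end) of the calendar window containing `now`, in UTC.

    `now` should be timezone-aware; `end` is the start of the next window.
    """
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)

    if cycle == "weekly":
        start = midnight - timedelta(days=midnight.weekday())  # Monday is 0
        return start, start + timedelta(days=7)
    if cycle == "monthly":
        start, months = midnight.replace(day=1), 1
    elif cycle == "quarterly":
        first_month = 3 * ((midnight.month - 1) // 3) + 1
        start, months = midnight.replace(month=first_month, day=1), 3
    elif cycle == "yearly":
        start, months = midnight.replace(month=1, day=1), 12
    else:
        raise ValueError(f"unsupported cycle: {cycle}")

    # Advance by whole months, rolling over into the next year when needed.
    total = start.month - 1 + months
    end = start.replace(year=start.year + total // 12, month=total % 12 + 1)
    return start, end


now = datetime(2026, 5, 9, 12, 0, tzinfo=timezone.utc)
for cycle in ("weekly", "monthly", "quarterly", "yearly"):
    start, end = calendar_window(now, cycle)
    print(f"{cycle:9s} {start.date()} -> {end.date()}")
```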

Tip: create both. A rolling SLO for engineering insight + a calendar SLO matching your customer agreement is a common combo.


Reading the SLO card

When an SLO is active, you'll see this on the monitor's detail page:

─────────────────────────────────
 99.93% achieved · 99.9% target
 ████░░░░░░░  42% burned
 25m budget left
 0.6× burn rate · safe
─────────────────────────────────

What each number means:

Field | Meaning
Achieved | Actual measured % over the window. Higher is better.
Target | The level you committed to.
Burned % | How much of the error budget is spent.
Budget left | How much "bad time" you can still afford before missing the SLO.
Burn rate | Multiplier vs sustainable. 1.0× = exactly on pace. 0.6× = under pace (good). 5.8× = burning way too fast.

Status colors

  • 🟢 Green — under 60% of the budget burned. Healthy.
  • 🟡 Amber — 60% to 100%. Watch closely.
  • 🔴 Red — 100% or more. Budget exhausted.
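
To show how the card's numbers relate, here is a small sketch that derives "budget left" and the status color from the burned percentage. It assumes the sample card describes a 99.9% target over 30 days (a budget of roughly 43.2 minutes); the function names are illustrative.

```python
def budget_left_minutes(budget_minutes: float, burned_percent: float) -> float:
    """Remaining error budget, never below zero."""
    return max(0.0, budget_minutes * (100.0 - burned_percent) / 100.0)


def status_color(burned_percent: float) -> str:
    """Map burned % to the card color using the thresholds listed above."""
    if burned_percent < 60.0:
        return "green"
    if burned_percent < 100.0:
        return "amber"
    return "red"


# Assumption: 99.9% over 30 days, i.e. a budget of about 43.2 minutes.
left = budget_left_minutes(43.2, burned_percent=42.0)
print(f"{left:.0f}m budget left, status {status_color(42.0)}")
```

With 42% burned, about 25 minutes remain and the card stays green, which matches the sample card.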

Error budgets and burn rate

The error budget is the amount of "bad time" your SLO permits over the window.

If you target 99.9% availability over 30 days:

  • Window = 30 days × 86,400 seconds = 2,592,000 seconds
  • Allowed bad time = 2,592,000 × (1 − 0.999) = 2,592 seconds (≈ 43 minutes)

That's your budget. Every minute of downtime spends from it.

Burn rate is how fast you're spending the budget compared to the sustainable rate. The sustainable rate is the pace that would exhaust the budget exactly at the window's end — perfectly even consumption.
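
The budget arithmetic and the burn-rate definition can be sketched together. The formula follows the one quoted in the FAQ, (burned / elapsed) ÷ (budget / window); returning 0 when nothing has elapsed is an assumption made to keep the example safe, and the function names are hypothetical.

```python
def error_budget_seconds(target_percent: float, window_seconds: float) -> float:
    """Total "bad time" the target permits over the window."""
    return window_seconds * (100.0 - target_percent) / 100.0


def burn_rate(bad_seconds: float, elapsed_seconds: float,
              target_percent: float, window_seconds: float) -> float:
    """Observed spend rate divided by the sustainable spend rate."""
    if elapsed_seconds <= 0:
        return 0.0  # assumption: no data yet is reported as 0x
    budget = error_budget_seconds(target_percent, window_seconds)
    observed = bad_seconds / elapsed_seconds
    sustainable = budget / window_seconds
    return observed / sustainable


WINDOW = 30 * 86_400  # 30 days in seconds
print(f"budget: {error_budget_seconds(99.9, WINDOW):.0f} s")

# 518.4 s of bad time in the first 10 days of a 30-day, 99.9% SLO.
rate = burn_rate(518.4, 10 * 86_400, 99.9, WINDOW)
print(f"burn rate: {rate:.1f}x")
```

In the example, 10 days at an even pace would have used a third of the 2,592-second budget (864 s); only 518.4 s were spent, giving a burn rate of 0.6×.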

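Burn rate | What it means
0.0× – 0.5× | Way under pace. You have headroom.
0.5× – 1.0× | Slightly under pace. Healthy.
1.0× | Exactly on the sustainable line.
1.0× – 6× | Burning too fast, but below the alert thresholds. The budget will run out before the window ends.
6× – 14.4× | Major incident in progress. Slow-burn alert fires if sustained for 6 hours.
> 14.4× | Catastrophe. Fast-burn alert fires if sustained for 1 hour. Drop everything.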

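The thresholds (14.4× and 6×) come from Google SRE practice. For a 30-day window (720 hours), burning at 14.4× for 1 hour consumes 2% of the budget (14.4 × 1 ÷ 720), and burning at 6× for 6 hours consumes 5% (6 × 6 ÷ 720). Left unchecked, a 14.4× burn would exhaust the whole budget in about 50 hours, and a 6× burn would do it in 5 days.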


Burn-rate alerts

When you create an SLO, three alert types are available. They route through the same alert channels you've configured on the monitor — email, Slack, Discord, Teams, webhook, push notifications. No extra setup.

Alert | Fires when | Cadence
Fast burn | Burn rate exceeds 14.4× sustainable for 1 hour | Checked every minute
Slow burn | Burn rate exceeds 6× sustainable for 6 hours | Checked every 5 minutes
Low budget | Less than 25% of the budget remains | One-shot when threshold crossed
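
The three conditions in the table can be sketched as a single evaluation step. This is an illustration under assumptions, not Enori's alerting code: "exceeds for 1 hour" is interpreted as the average burn rate over the trailing hour, the one-shot behavior of the low-budget alert is not modeled, and all names are hypothetical.

```python
FAST_BURN_THRESHOLD = 14.4  # burn rate over the trailing 1 hour
SLOW_BURN_THRESHOLD = 6.0   # burn rate over the trailing 6 hours
LOW_BUDGET_PERCENT = 25.0   # remaining budget, in percent


def alerts_to_fire(burn_rate_1h: float, burn_rate_6h: float,
                   budget_remaining_percent: float) -> list[str]:
    """Return the names of the alert conditions that currently hold."""
    fired = []
    if burn_rate_1h > FAST_BURN_THRESHOLD:
        fired.append("fast_burn")
    if burn_rate_6h > SLOW_BURN_THRESHOLD:
        fired.append("slow_burn")
    if budget_remaining_percent < LOW_BUDGET_PERCENT:
        fired.append("low_budget")
    return fired


def budget_consumed_percent(burn_rate: float, hours: float,
                            window_days: int = 30) -> float:
    """Share of the whole budget spent by burning at `burn_rate` for `hours`."""
    return 100.0 * burn_rate * hours / (window_days * 24)


print(alerts_to_fire(burn_rate_1h=16.0, burn_rate_6h=7.5,
                     budget_remaining_percent=40.0))
print(round(budget_consumed_percent(14.4, 1), 2),
      round(budget_consumed_percent(6.0, 6), 2))
```

On a 30-day window, the fast-burn condition corresponds to about 2% of the budget spent in one hour, and the slow-burn condition to about 5% spent in six hours.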

You can enable them as a set with the Aggressive / Balanced / Relaxed presets, or switch each alert on or off with its own toggle.

When an alert fires, it opens an alert episode — a stateful incident that you can acknowledge, snooze, or resolve directly from the email, push notification, or Slack message. Episodes auto-close when conditions return to normal for 30 continuous minutes.


Maintenance and pause exclusion

By default, two kinds of "downtime" are excluded from the SLO:

Exclusion | What it excludes
Maintenance windows | Time during a scheduled maintenance window declared on the monitor
Paused time | Periods when the monitor was paused (manually or by plan downgrade)

Both are on by default and recommended for almost every SLO. Planned downtime shouldn't burn customer-trust budget, and deliberately pausing a monitor shouldn't be counted as an outage.

You can toggle either off when creating the SLO if you want strict counting (rare).
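
One way to picture an exclusion is to drop every check that falls inside an excluded interval before computing the SLI. The sketch below does that for a generic list of intervals; it is an assumption about the mechanism, uses Unix timestamps for simplicity, and does not model how paused time is counted when the toggle is off.

```python
Interval = tuple[float, float]  # (start, end) Unix timestamps, end exclusive


def is_excluded(ts: float, intervals: list[Interval]) -> bool:
    """True if the timestamp falls inside any excluded interval."""
    return any(start <= ts < end for start, end in intervals)


def availability_percent(checks: list[tuple[float, bool]],
                         excluded: list[Interval]) -> float:
    """Availability over the checks that fall outside the excluded intervals."""
    counted = [ok for ts, ok in checks if not is_excluded(ts, excluded)]
    if not counted:
        raise ValueError("every check was excluded")
    return 100.0 * sum(counted) / len(counted)


# Ten checks, one per minute; minutes 2, 3, and 5 failed.
checks = [(minute * 60.0, minute not in (2, 3, 5)) for minute in range(10)]
maintenance = [(120.0, 240.0)]  # hypothetical window covering minutes 2 and 3

print(availability_percent(checks, excluded=maintenance))  # exclusion on
print(availability_percent(checks, excluded=[]))           # strict counting
```

With the maintenance window excluded, only the unplanned failure at minute 5 counts (7 of 8 checks good); with strict counting, all three failures count (7 of 10).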


SLA reports for calendar SLOs

If you have a customer SLA that reads "99.5% per calendar month", create a calendar-monthly SLO with the matching target. Then on the SLO management page (/dashboard/monitors/{id}/slos) the Generate SLA report button activates.

The report is a printable HTML page (use your browser's Print → Save as PDF) containing:

  • Summary — target, achieved, status (MET / MISSED), total downtime
  • Incidents — list of incidents during the period with cause and resolution
  • Maintenance windows — declared windows that were excluded per agreement
  • Methodology — how availability was measured
  • Cryptographic checksum for verification

This report is what you send to your customer at month-end to demonstrate compliance.


Getting started

The fastest way to get value from SLOs is to set up three on your most important monitor.

1. Open the monitor

Navigate to Monitors → choose your most-trafficked monitor → detail page.

2. Click "+ Add SLO" on the SLO rail card

Look for the SLO card in the right rail. Click the "+" or use the SLO management page (/dashboard/monitors/{id}/slos).

3. Create three SLOs

Following this template:

SLO #1 — Availability (engineering target)

Field | Value
Name | Availability — internal
SLI type | Availability
Target | 99.9
Window type | Rolling
Window length | 30 days
Burn-rate alerts | Balanced (Fast + Slow)
Exclude maintenance | On (default)
Exclude paused | On (default)

SLO #2 — Latency (UX target)

Field | Value
Name | Latency — under 500ms
SLI type | Latency
Latency threshold | 500 ms
Target | 95
Window type | Rolling
Window length | 28 days
Burn-rate alerts | Balanced

SLO #3 — Calendar SLA (if you have a customer agreement)

Field | Value
Name | Customer SLA
Description (Rationale) | e.g. "MSA §4.2 — monthly uptime commitment"
SLI type | Availability
Target | 99.5 (matching your contract)
Window type | Calendar
Calendar cycle | Monthly
Burn-rate alerts | Balanced

4. Wait

The SLO will start measuring from the moment you create it. The rail card and management page will populate with achieved %, burned %, and burn rate as data accumulates. Initial readings might say "0% burned" simply because no checks have run yet — give it a few minutes.

5. Generate your first SLA report (if you set up SLO #3)

At the end of the month, navigate to the SLO management page and click Generate SLA report. Print to PDF, send to your customer.


FAQ

Do I need an SLO for every monitor?

No. Start with the 1–3 monitors that matter most to your business — typically the customer-facing API, the marketing site, and a critical internal service. Adding SLOs to less-important monitors creates noise without benefit.

What happens when an SLO turns red?

The SLO status changes to "Burning" in the rail card and the management page. If burn-rate alerts are enabled, a notification fires through your configured channels. The standard SRE response is a feature freeze on the affected service — focus engineering effort on stability until the budget recovers.

How do I read the rolling vs calendar windows when both exist?

You'll see them side-by-side on the management page. The rolling SLO updates continuously and reflects current state. The calendar SLO is your contract-aligned view; check it at month-end before sending the SLA report.

Why is my burn rate showing 0× when there's downtime?

Burn rate is computed as (burned / elapsed) ÷ (budget / window). If elapsed is very small (just-created SLO) or burned is zero, you'll see 0×. Wait for the window to accumulate data.

What if my monitor was paused for a week?

If "Exclude paused" is enabled (the default), the paused time is removed from the SLO calculation. The SLO behaves as if those 7 days didn't exist. If "Exclude paused" is off, the paused time counts as downtime and burns budget.

Can I edit an SLO after creating it?

Yes. Click the SLO card on the rail or the "Edit" button on the management page. All fields are editable. Changing the target retroactively re-evaluates the budget against the new value, with no data loss.

How do I pause an SLO without deleting it?

Toggle the Enabled switch off in the edit modal. Tracking continues silently but the SLO is hidden from the rail card and skips alert evaluation. Useful during planned chaos tests or while diagnosing a noisy SLO.

Can I have multiple SLOs of the same type?

Yes. A monitor can have, for example, three Availability SLOs at 99%, 99.9%, and 99.99% targets. Each is evaluated independently. Useful for tiered alerting — the 99% SLO is your "absolute floor" alert; the 99.99% is your stretch goal.

What if I miss the SLA?

Practically: contact the customer per your contract terms. Operationally: the SLA report will show "MISSED" on the affected period. Use it as the basis for the credit calculation specified in your agreement. Fix the underlying issue, document the root cause, and consider tightening your internal SLO so you have more buffer next time.


Reference: limits and bounds

Setting | Allowed values
Target % | 90.000 to 99.999 (three decimal places)
Latency threshold | 1 to 60,000 ms
Rolling window length | 7, 28, 30, 90, or 365 days
Calendar cycle | weekly, monthly, quarterly, yearly
Description / rationale | Up to 500 characters
Name | Up to 80 characters
SLOs per monitor | No hard limit (3–5 is typical)
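
If you create SLOs from a script, it can help to check values against these limits before submitting them. The validator below is a hypothetical client-side sketch: the function, its parameter names, and its error messages are not part of Enori's API.

```python
from typing import Optional, Union

ROLLING_DAYS = {7, 28, 30, 90, 365}
CALENDAR_CYCLES = {"weekly", "monthly", "quarterly", "yearly"}


def validate_slo(name: str, target_percent: float, window_type: str,
                 window_value: Union[int, str],
                 latency_threshold_ms: Optional[float] = None,
                 description: str = "") -> list[str]:
    """Return a list of problems; an empty list means the settings are in bounds."""
    errors = []
    if len(name) > 80:
        errors.append("name exceeds 80 characters")
    if len(description) > 500:
        errors.append("description exceeds 500 characters")
    if not 90.0 <= target_percent <= 99.999:
        errors.append("target must be between 90.000 and 99.999")
    elif round(target_percent, 3) != target_percent:
        errors.append("target allows at most three decimal places")
    if latency_threshold_ms is not None and not 1 <= latency_threshold_ms <= 60_000:
        errors.append("latency threshold must be between 1 and 60,000 ms")
    if window_type == "rolling":
        if window_value not in ROLLING_DAYS:
            errors.append("rolling length must be 7, 28, 30, 90, or 365 days")
    elif window_type == "calendar":
        if window_value not in CALENDAR_CYCLES:
            errors.append("calendar cycle must be weekly, monthly, quarterly, or yearly")
    else:
        errors.append("window type must be 'rolling' or 'calendar'")
    return errors


print(validate_slo("Availability", 99.9, "rolling", 30))
print(validate_slo("Too strict", 99.9999, "rolling", 14))
```

The first call returns an empty list; the second reports both the out-of-range target and the unsupported 14-day window.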

  • API reference — for programmatic SLO management via /api/monitors/{id}/slos
  • Alert episodes — how Enori turns burn-rate alerts into actionable incidents
  • SLA reports — generating customer-facing compliance documents
  • Maintenance windows — declaring planned downtime that's excluded from SLOs

Last updated: 2026-05-09. Feedback or corrections: support@enori.io