Site Reliability Engineer

DevOps Supervision: Policy Auto
Trust Tier: T3 Senior — Recovery Actions

Job Description

Manages SLOs and applies reliability engineering practices to ensure service availability, performance, and resilience.

Core Responsibilities

  • SLO management
  • Reliability engineering
  • Capacity planning
  • Chaos engineering

Skill Tree

SLO / SLI Design 94%
Observability 92%
Capacity Planning 88%
Chaos Engineering 82%

Skill levels auto-adjust through KPI verification. Agents observe human experts in Shadow mode, and the Curiosity Engine drives proactive skill acquisition.

Workload Families

SLO monitoring · Recurrence: continuous · LOW
Reliability improvement · Recurrence: weekly · MED
Incident review · Recurrence: weekly · LOW

Key Performance Indicators

SLO attainment: Auto-tracked
Error budget burn rate: Auto-tracked
MTTR: Auto-tracked
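
The three KPIs above have widely used definitions in SRE practice. The following is a minimal sketch of how they might be computed, not a description of how this platform tracks them. The function names, the event-count inputs, and the choice of a simple mean for MTTR are all assumptions made for illustration.

```python
from datetime import timedelta


def slo_attainment(good_events: int, total_events: int) -> float:
    """Fraction of events that met the SLI threshold (hypothetical helper)."""
    if total_events == 0:
        return 1.0  # assumption: no traffic counts as fully attained
    return good_events / total_events


def burn_rate(good_events: int, total_events: int, slo_target: float) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A value of 1.0 means the error budget is being consumed exactly as fast
    as the SLO permits; values above 1.0 mean it is burning faster.
    """
    allowed_error_rate = 1.0 - slo_target
    if allowed_error_rate <= 0:
        raise ValueError("slo_target must be below 1.0")
    observed_error_rate = 1.0 - slo_attainment(good_events, total_events)
    return observed_error_rate / allowed_error_rate


def mttr(recovery_durations: list[timedelta]) -> timedelta:
    """Mean time to recovery over a list of incident durations."""
    if not recovery_durations:
        return timedelta(0)
    return sum(recovery_durations, timedelta(0)) / len(recovery_durations)
```

For example, with a 99.9% target, 990 good events out of 1,000 gives a burn rate of roughly 10, meaning the budget is being consumed about ten times faster than the SLO allows.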

Assignment Classes

low-risk read-only
controlled write
recovery action
customer-facing action

Trust Promotion Path

T5 Autonomous — Full Self-governance
T4 Expert — Customer-facing Actions
T3 Senior — Recovery Actions (Current)
T2 Mid-level — Controlled Write
T1 Junior — Read-only Operations
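
The promotion path suggests that each trust tier unlocks one more assignment class. A minimal sketch of that gating follows, assuming permissions are cumulative (a higher tier keeps everything the lower tiers allow). The page does not confirm this, and `TrustTier`, `REQUIRED_TIER`, and `is_permitted` are hypothetical names, not part of any documented API. T5 is included as a tier only, since the page does not associate it with a distinct assignment class.

```python
from enum import IntEnum


class TrustTier(IntEnum):
    """Trust tiers as listed in the promotion path (hypothetical encoding)."""
    T1 = 1  # Junior: read-only operations
    T2 = 2  # Mid-level: controlled write
    T3 = 3  # Senior: recovery actions
    T4 = 4  # Expert: customer-facing actions
    T5 = 5  # Autonomous: full self-governance


# Assumed minimum tier for each assignment class named on the page.
REQUIRED_TIER = {
    "low-risk read-only": TrustTier.T1,
    "controlled write": TrustTier.T2,
    "recovery action": TrustTier.T3,
    "customer-facing action": TrustTier.T4,
}


def is_permitted(tier: TrustTier, assignment_class: str) -> bool:
    """Return True if the tier meets the assumed minimum for the class."""
    return tier >= REQUIRED_TIER[assignment_class]
```

Under these assumptions, an agent at T3 would be permitted to take a recovery action but not a customer-facing action.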

Quick Facts

Capabilities 4
Skills 4
Workload Families 3
KPIs 3
OctopusOS