Skip to content

Resilience Standard — Stability Patterns

v1.0 — 2026-06-06 — added from expert review (item 8: Nygard, Release It!).

The core rule

Never make an outbound integration call (HTTP, database, queue, cache, external API) without an explicit timeout. Library defaults are not decisions — HttpClient's 100-second default is an outage amplifier, not a policy.

Standard pipeline (.NET)

Use Microsoft.Extensions.Http.Resilience (Polly v8 under the hood) — AddStandardResilienceHandler() on every HttpClient registration gives the sanctioned stack in one line:

  • Timeout — per-attempt and total; tune per dependency, don't accept defaults blindly.
  • Retry with exponential backoff + jitter — only for idempotent operations; cap attempts (default 3); never retry non-idempotent mutations without an idempotency key.
  • Circuit breaker — failing dependencies fail fast instead of stacking up threads waiting; recovery is automatic via half-open probes.

Wrap this registration in an AddExpertGroupResilientHttpClient() extension in ExpertGroup.Core so services inherit consistent policies.

Beyond HTTP

  • Database: command timeouts set deliberately (CommandTimeout); long-running reports get their own context configuration, not a global raise.
  • RabbitMQ (Core.Ipc): consumers retry with backoff a bounded number of times, then dead-letter — a poison message must never block a queue.
  • Bulkheads: where one consumer can exhaust a shared resource (connection pools, named HttpClients), partition capacity so one client's failure can't sink the rest.
  • Fail fast: validate inputs and check circuit/dependency state before doing expensive work or taking locks.
  • Graceful degradation: define per integration point what the service does when the dependency is down (cached data, reduced functionality, clear error) — "500 everything" is not a strategy. Feature flags (Git standard) double as kill switches for degraded features.

Sources: Release It! 2nd ed. · Microsoft.Extensions.Http.Resilience · Polly v8