// Architecture Deep Dive

Resilience, Fault-Tolerance,
High Availability & Scalability

Comprehensive patterns, design principles, and implementation strategies for building production-grade distributed systems — covering everything from circuit breakers to petabyte-scale data pipelines.

☁ Cloud-Native Patterns ⚡ Big Data Architecture ⬡ Distributed Systems ∞ Infinite Scalability
🛡️

Resilience Patterns

Systems that recover gracefully from failures — absorbing shocks and returning to normal operation without data loss or prolonged outages.

Core Concepts
RTO
Recovery Time Objective — max acceptable downtime
RPO
Recovery Point Objective — max acceptable data loss
MTTR
Mean Time to Recovery — avg time to restore service
MTBF
Mean Time Between Failures — avg uptime between incidents
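These four metrics are related: over the long run, availability is the fraction of time the system is up, which the definitions above combine into A = MTBF / (MTBF + MTTR). A minimal sketch in Python (the 720-hour failure interval is an illustrative figure, not from the text):

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: A = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# A service failing every 30 days (720 h) and recovering in 30 min
# sits at roughly "three nines":
a = availability(720, 0.5)   # ~0.9993
```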
Circuit Breaker Pattern

The circuit breaker wraps calls to external dependencies. After a threshold of failures, it "opens" and fast-fails requests instead of waiting for timeouts — protecting the caller from cascade failures.

// Circuit Breaker State Machine
CLOSED
Normal ops
OPEN
Fast fail
HALF-OPEN
Probe
CLOSED
Recovered
Threshold exceeded → OPEN    After timeout → HALF-OPEN (try one request)    Success → back to CLOSED
Pattern

Circuit Breaker

Prevent cascading failures by short-circuiting calls to failing services after a configurable error threshold. Three states: Closed (normal), Open (fail fast), Half-Open (probe).

CircuitBreaker cb = CircuitBreaker.builder()
    .failureRateThreshold(50)          // % failures
    .waitDurationInOpenState(
        Duration.ofSeconds(30))
    .slidingWindowSize(10)
    .permittedCallsInHalfOpenState(3)
    .build();
  • Count-based or time-based sliding window
  • Slow call detection alongside errors
  • Event listeners for state transitions
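The three-state machine above fits in a few lines of Python. This is an illustrative toy, not any particular library's API; names like `failure_threshold` and `reset_timeout` are assumptions:

```python
import time

class CircuitBreaker:
    """Minimal three-state breaker: CLOSED -> OPEN -> HALF_OPEN."""
    def __init__(self, failure_threshold=5, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout   # seconds to stay OPEN
        self.failures = 0
        self.state = "CLOSED"
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "OPEN":
            if time.monotonic() - self.opened_at >= self.reset_timeout:
                self.state = "HALF_OPEN"     # allow one probe request
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # any failure while probing, or too many in a row, opens the circuit
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        self.state = "CLOSED"                # success closes the circuit
        return result
```

Production implementations add the sliding window, slow-call detection, and transition listeners from the list above; this sketch only shows the state transitions.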
Pattern

    Retry with Exponential Backoff

    Retry transient failures with increasing wait times between attempts, plus jitter to prevent thundering herd. Combine with circuit breaker for compound protection.

const retry = async (fn, opts) => {
  for (let i = 0; i < opts.maxAttempts; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === opts.maxAttempts - 1) throw e;
      const delay = opts.base * 2 ** i
        + Math.random() * 1000; // jitter
      await sleep(delay);
    }
  }
};
    Jitter prevents thundering herd: multiple clients don't retry simultaneously after an outage, spreading load over time.
    Pattern

    Bulkhead Isolation

    Isolate components into separate thread pools, connection pools, or processes. A failure or saturation in one bulkhead cannot exhaust resources in another.

    Thread Pool Bulkheads
    Payment
    Pool: 20
    Search
    Pool: 50
    Report
    Pool: 5
    • Separate connection pools per downstream service
    • Semaphore-based bulkhead for lightweight isolation
    • Resource quota enforcement per tenant
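The semaphore-based variant from the list above can be sketched in Python; `Bulkhead` and the pool sizes are illustrative, not a specific library:

```python
import threading

class Bulkhead:
    """Semaphore bulkhead: at most `capacity` concurrent calls may
    enter; excess callers are rejected instead of queuing up."""
    def __init__(self, capacity: int):
        self._sem = threading.Semaphore(capacity)

    def run(self, fn, *args, **kwargs):
        if not self._sem.acquire(blocking=False):
            raise RuntimeError("bulkhead full: rejecting call")
        try:
            return fn(*args, **kwargs)
        finally:
            self._sem.release()

# One bulkhead per downstream, sized like the diagram above, so a
# saturated report pool cannot starve payment traffic.
payment_bulkhead = Bulkhead(20)
report_bulkhead = Bulkhead(5)
```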
    Pattern

    Timeout & Deadline Propagation

    Every outbound call must have a bounded deadline. In distributed systems, propagate context deadlines (gRPC context, HTTP headers) so all downstream services honor the same cut-off.

// gRPC context propagation
ctx, cancel := context.WithTimeout(
    parentCtx, 2*time.Second)
defer cancel()

resp, err := client.GetUser(ctx, req)
if errors.Is(err, context.DeadlineExceeded) {
    recordTimeout("GetUser")
}
    Budget approach: Total deadline = T. Allocate T×0.4 to first hop, T×0.3 to next, leaving slack for serialization.
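The budget rule is simple arithmetic; a sketch (`hop_budgets` and the 0.4/0.3 split are just the figures from the text):

```python
def hop_budgets(total_ms, fractions=(0.4, 0.3)):
    """Split a total deadline across hops; the unallocated share is
    slack for serialization and queuing."""
    return [total_ms * f for f in fractions]

# hop_budgets(2000) gives ~[800.0, 600.0], leaving ~600 ms slack
```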
    Pattern

    Fallback & Graceful Degradation

    When a service fails, return a cached value, a default, or a reduced-feature response — rather than returning an error to the end user.

    • Stale cache fallback: Serve last known good response
    • Static default: Return empty list, zero count
    • Feature flag: Disable non-critical features
    • Local computation: Run simplified offline logic
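The stale-cache and static-default fallbacks above combine naturally in one wrapper; `call_with_fallback` is an illustrative name, not a library function:

```python
_last_good = {}   # stale cache of last successful responses, by name

def call_with_fallback(name, fn, default=None):
    """Try the primary call; on failure serve the last known good
    response, falling back to a static default as a last resort."""
    try:
        result = fn()
        _last_good[name] = result    # refresh the stale cache
        return result
    except Exception:
        return _last_good.get(name, default)
```

A successful call populates the cache, so a later outage degrades to the cached value (or the default) instead of surfacing an error to the user.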
    Pattern

    Idempotency & At-Least-Once Safety

    Design operations to be safe to execute multiple times. Use idempotency keys so that retried requests don't cause duplicate side effects.

// Idempotency key in HTTP header
POST /payments
Idempotency-Key: "uuid-abc-123"

// Server-side deduplication
if seen(key) {
    return cachedResponse[key]
}
result := processPayment(req)
store(key, result, ttl=24h)
return result
    Event Sourcing & CQRS for Resilience
    📋

    Event Sourcing

    Store state as an ordered log of immutable events, not mutable rows

    Every state change is recorded as an event. State is derived by replaying the event log. This enables:

    • Complete audit trail for debugging incidents
    • Point-in-time state reconstruction
    • Easy recovery by replaying from last snapshot
    • Temporal decoupling between write and read models
// Event store append
events.append({
  "id": "evt-001",
  "type": "OrderPlaced",
  "aggregate_id": "order-42",
  "timestamp": "2024-01-15T10:30Z",
  "data": { "items": [...], "total": 99.99 }
})

// Rebuild state from events
order = events
  .filter(e => e.aggregate_id == id)
  .reduce(applyEvent, initialState)

    Fault Tolerance

    Designing systems that continue operating correctly even when individual components fail — tolerating failures without impacting correctness or availability.

    Failure Taxonomy
| Failure Type | Characteristics | Detection Strategy | Tolerance Approach |
|---|---|---|---|
| Crash Fault | Node stops responding entirely | Heartbeat timeout, health check failure | Replication, failover, leader election |
| Omission Fault | Messages lost in transit (send or receive) | Acknowledgment timeout, sequence number gaps | Reliable messaging, ACK + retransmit |
| Byzantine Fault | Node sends incorrect/malicious data | Cross-validation across replicas | BFT consensus (needs 3f+1 nodes) |
| Network Partition | Split-brain — nodes can't communicate | Quorum loss detection | CAP trade-offs, fencing tokens |
| Timing Fault | Clock skew, delayed message delivery | Logical clocks, hybrid logical clocks | Vector clocks, Lamport timestamps |
    Replication Strategies
    Replication

    Single-Leader (Primary-Replica)

    All writes go to one leader; replicas serve reads. Simple consistency model. Leader failure requires election (Raft, Paxos).

    👑 Leader
    →WAL sync→
    Replica 1
    →WAL sync→
    Replica 2
    • Sync replication: strong consistency, higher latency
    • Async replication: lower latency, risk of data loss on failover
    • Semi-sync: at least one replica confirms before ack
    Replication

    Multi-Leader (Multi-Master)

    Multiple nodes accept writes. Write conflicts must be resolved via Last-Write-Wins (LWW), CRDTs, or application-level merge.

    Conflict resolution strategies: LWW (Cassandra), CRDT merge functions (Riak), custom merge (Git), operational transformation (Google Docs).
    • Geo-distributed writes across regions
    • Offline-first clients (CouchDB)
    • Requires conflict detection (vector clocks)
    Replication

    Leaderless (Quorum)

No designated leader. Clients write to W nodes and read from R nodes out of N replicas. Consistency is tunable via the quorum condition W + R > N.

    Formula: N = replicas, W = write quorum, R = read quorum
    Strong consistency: W+R > N. Availability: smaller W+R.
    • Read repair: fix stale nodes on read
    • Anti-entropy: background Merkle tree sync
    • Hinted handoff: store writes for unavailable nodes
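The quorum condition can be checked mechanically: the write and read sets must overlap in at least one node. A sketch (the function name is illustrative):

```python
def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """Quorum rule: a read set of R nodes is guaranteed to overlap a
    write set of W nodes whenever W + R > N."""
    return w + r > n

# Typical Dynamo-style setup: N=3, W=2, R=2 -> overlap guaranteed.
# W=1, R=1 favors latency and availability but may return stale data.
```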
    Consensus Algorithms
    Consensus

    Raft Algorithm

    Understandable consensus with explicit leader election. Used in etcd, CockroachDB, TiKV, and most modern distributed databases.

// Raft leader election phases
1. Follower times out → becomes Candidate
2. Candidate increments term → sends RequestVote RPC
3. Majority vote received → becomes Leader
4. Leader sends heartbeats (AppendEntries)
5. Log entries committed when majority ack
    • Tolerates (N-1)/2 node failures
    • Log matching property ensures consistency
    • Leader completeness: elected leader has all committed entries
    Consensus

    Paxos (Multi-Paxos)

    The foundational consensus protocol. Complex but proven. Multi-Paxos elects a stable proposer to skip Phase 1 for subsequent instances.

// Multi-Paxos phases
Phase 1a: Prepare(n)       // proposer
Phase 1b: Promise(n, v)    // acceptor
Phase 2a: Accept(n, v)     // proposer
Phase 2b: Accepted(n, v)   // acceptor + learner notified
// Leader leasing avoids repeated Phase 1
    Saga Pattern for Distributed Transactions
    🔄

    Saga — Distributed Transaction Coordinator

    Long-running transactions as a sequence of local transactions with compensating actions

    Choreography Saga: Each service publishes events and listens to others. Decentralized, no coordinator, but harder to track overall state.

    Orchestration Saga: A central saga orchestrator sends commands and handles failures with explicit compensation logic.

    Compensating transactions must be idempotent and semantically correct — e.g., "reverse-charge" instead of "delete payment".
// Orchestration saga steps
saga OrderSaga {
  steps: [
    { action: reserveInventory, compensate: releaseInventory },
    { action: chargePayment,    compensate: refundPayment },
    { action: scheduleShipment, compensate: cancelShipment }
  ]
}
// On step 3 fail → run
// refundPayment + releaseInventory
    🔁

    High Availability

    Designing systems to achieve maximum uptime through redundancy, automated failover, health checking, and elimination of single points of failure.

    Availability Math
    99.9%
    Three Nines — 8.7 hrs/year downtime
    99.99%
    Four Nines — 52 min/year downtime
    99.999%
    Five Nines — 5.26 min/year downtime
    Series availability: A(total) = A₁ × A₂ × A₃ ... (each dependency multiplies risk)
    Parallel availability: A(total) = 1 − (1−A₁)(1−A₂) ... (redundancy adds availability)
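The two formulas above compute directly; a small Python sketch:

```python
from functools import reduce

def series(*avail):
    """Chain of dependencies: every component must be up.
    A(total) = A1 * A2 * ..."""
    return reduce(lambda acc, a: acc * a, avail, 1.0)

def parallel(*avail):
    """Redundant replicas: the system is down only if all are down.
    A(total) = 1 - (1-A1)(1-A2)..."""
    return 1.0 - reduce(lambda acc, a: acc * (1.0 - a), avail, 1.0)

# Two 99% nodes in parallel reach ~99.99%; two 99.9% services in
# series drop to ~99.8% -- dependencies multiply risk.
```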
    Deployment Topologies
    Topology

    Active-Passive Failover

    One active node handles all traffic; a passive standby is ready to take over. Automatic failover via health check + DNS cutover or floating IP.

    Load Balancer / VIP
    🟢 Active Node
    Serving Traffic
    🟡 Passive Node
    Standby / Syncing
    • Hot standby: continuously syncing, fast failover (<30s)
    • Warm standby: periodic sync, slower failover (1-5 min)
    • Cold standby: restore from backup, minutes to hours
    Topology

    Active-Active Clustering

    All nodes serve traffic simultaneously. Load balanced across the cluster. No wasted standby capacity. Requires shared state or distributed state management.

    Node A
    33%
    Node B
    33%
    Node C
    33%
    ↕ Distributed state sync (Gossip / Raft)
    • Full resource utilization — no idle standby
    • Session affinity or distributed session store required
    • Requires CAP trade-off decisions for shared state
    Topology

    Multi-Region Active-Active

    Deploy across multiple geographic regions. Route users to nearest region. Regions replicate data asynchronously, with conflict resolution for writes.

    • Global Traffic Manager / Anycast routing
    • Region-local writes with async cross-region replication
    • Conflict-free replicated data types (CRDTs)
    • Survives entire region failure transparently
    AWS pattern: Route 53 latency routing + DynamoDB Global Tables + Aurora Global Database.
    Health Checking & Load Balancing
    Health Check

    Layered Health Probe Design

    Health checks should verify the actual ability to serve traffic, not just "the process is running."

// Three-tier health check
GET /health/live     → Is the process alive?
  "status": "up"
GET /health/ready    → Can it serve traffic?
  checks: db.ping(), cache.ping(), downstream.check()
GET /health/startup  → Finished initializing?
  checks: migrations_complete, config_loaded, warmup_done
    K8s probes: livenessProbe restarts pod; readinessProbe removes from service endpoints; startupProbe delays liveness checks.
    Load Balancing

    Load Balancing Algorithms

| Algorithm | Best For |
|---|---|
| Round Robin | Homogeneous, stateless |
| Least Connections | Long-lived connections |
| Least Response Time | Mixed workload latencies |
| IP Hash / Consistent | Session stickiness |
| Resource Based | CPU/memory-aware routing |
| Power of Two Choices | Large distributed systems |
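As one worked example from the table, Power of Two Choices needs only a per-backend load count: sample two backends at random and pick the less loaded one, which is provably close to optimal at O(1) cost. A sketch (names are illustrative):

```python
import random

def pick_backend(connections, rng=random):
    """Power-of-two-choices: sample two random backends and route to
    the one with fewer active connections."""
    a, b = rng.sample(list(connections), 2)
    return a if connections[a] <= connections[b] else b

conns = {"node-a": 12, "node-b": 3, "node-c": 9}
target = pick_backend(conns)   # never the strictly busier of the sampled pair
```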
    Zero-Downtime Deployment Patterns
    Deploy

    Blue-Green Deployment

    Maintain two identical production environments (blue = current, green = new). Switch traffic atomically at the load balancer. Instant rollback by switching back.

    🔵 Blue v1.0
    ← Live traffic
    🟢 Green v2.0
    ← Idle / testing
    Deploy

    Canary Release

    Gradually shift traffic to the new version. Start with 1%, monitor error rates and latency, then incrementally increase: 1% → 5% → 20% → 100%.

    Metrics to gate: error rate, p99 latency, business KPIs (conversion, revenue). Auto-rollback if threshold exceeded.
    Deploy

    Rolling Update

    Replace instances one by one (or in batches). Always have N-k healthy instances serving traffic. Works with immutable infrastructure.

# Kubernetes rolling update
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0
    📈

    High Scalability

    Designing systems that gracefully handle increasing load — from thousands to billions of requests — through horizontal scaling, partitioning, and async architectures.

    Scaling Dimensions
    Axis X

    Horizontal Scaling

    Add more identical instances behind a load balancer. Stateless services scale linearly. The simplest and most common scaling technique for web tiers.

Requirement: Session state must be externalized (Redis, JWT), and all instances must be functionally equivalent.
    Axis Y

    Functional Decomposition

    Split a monolith into microservices. Each service scales independently based on its own bottleneck — not every service needs 100 instances.

    Conway's Law: System structure reflects organizational communication structure. Align service boundaries with team boundaries.
    Axis Z

    Data Partitioning

    Shard data across nodes so each node owns a subset. Lookups route to the appropriate shard. Scale-out data layer for read and write throughput.

    Shard key choice is critical: avoid hot spots, ensure even distribution, minimize cross-shard queries.
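The simplest routing scheme is hash-mod sharding over the shard key; a sketch (note that changing `num_shards` remaps nearly every key, the problem consistent hashing in the Database Sharding section avoids):

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Route a key to a shard by hashing. A stable hash (not Python's
    per-process salted hash()) keeps routing consistent across nodes."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# The same key always lands on the same shard, so lookups and writes
# agree on placement without a directory service.
```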
    Caching Architecture

    Multi-Level Caching Hierarchy

Latency: L1 cache ~1ns → L2 ~5ns → L3 ~20ns → DRAM ~100ns → SSD read ~100μs → network round trip ~0.5ms (same DC) to ~10ms (WAN) → HDD seek ~10ms
    Client-Side
    Browser Cache
    HTTP Cache-Control
    Service Worker
    Offline cache
    ↓ Cache miss flows downstream ↓
    Edge / CDN
    CDN Edge Node
    Varnish / CloudFront
    API Gateway Cache
    Redis @ edge
    Application-Side
    In-Process L1
    Caffeine / LRU Map
    Distributed L2
    Redis Cluster
    DB Query Cache
    Materialized views
    ↓ Final cache miss ↓
    Storage
    Primary DB
    Object Store
    Search Index
    Database Sharding
    Sharding

    Consistent Hashing

    Map both data keys and nodes onto a virtual ring. Each key belongs to the nearest node clockwise. Adding/removing a node only migrates keys from adjacent neighbors — not a full reshuffle.

# Consistent hash ring (sorted hashes + bisect for ceiling lookup)
import bisect, hashlib

def h(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

ring = {}                          # hash position -> node
for node in nodes:
    for v in range(VIRTUAL_NODES):
        ring[h(f"{node}:{v}")] = node
sorted_hashes = sorted(ring)

def get_node(key):
    i = bisect.bisect_left(sorted_hashes, h(key))
    if i == len(sorted_hashes):    # past the last position: wrap around
        i = 0
    return ring[sorted_hashes[i]]

    Virtual nodes (vnodes) balance load when node capacities differ.

    Sharding

    Range-Based Sharding

    Assign contiguous key ranges to shards. Simple to implement and supports range queries within a shard. Risk: hot spots if keys are temporally skewed (e.g., all writes go to the "latest" shard).

    Mitigation: Add a random prefix or suffix to time-based keys to distribute writes. Alternatively, use reverse-timestamp ordering.
    Sharding

    Directory-Based Sharding

    A lookup service (directory) maps each key to its shard. Maximum flexibility — can rebalance or migrate shards without rehashing. The directory itself is a bottleneck; cache it aggressively.

    • Supports arbitrary partition reassignment
    • Directory must be highly available (replicated)
    • Used by MongoDB (config servers), Vitess
    Async & Queue-Based Decoupling
    📬

    Message Queue Patterns

    Decouple producers from consumers — absorb traffic spikes, enable independent scaling
    Competing Consumers

    Multiple workers pull from the same queue. Each message processed once. Scale workers by queue depth. Used in Celery, SQS, RabbitMQ work queues.

    Pub/Sub Fan-Out

    Each subscriber receives every message. Topics → subscriptions. Used for event broadcasting, cache invalidation, audit logging.

    Event Streaming

    Ordered, durable log of events (Kafka, Kinesis). Consumers maintain offsets. Replay, backfill, and multiple independent consumers from same log.

// Kafka consumer group scaling
partitions = 12   // upper bound on parallel consumers
consumers  = 4    // each handles 3 partitions
// Consumers scale only up to the partition count;
// beyond that, extra consumers sit idle —
// increase partitions ahead of demand.
    Auto-Scaling Strategies
    Scaling

    Reactive Scaling

    Scale based on observed metrics: CPU utilization, memory, request queue depth, custom metrics.

# K8s HPA (autoscaling/v2) — scale on CPU + queue depth
spec:
  scaleTargetRef:
    name: api-deployment
  minReplicas: 2
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  - type: External            # queue depth
    external:
      metric:
        name: sqs_queue_length
      target:
        type: Value
        value: "100"
    Scaling

    Predictive Scaling

    Pre-scale based on historical patterns (time-of-day, day-of-week), ML forecasts, or scheduled events (sale launches).

    • AWS Predictive Scaling uses ML on CloudWatch history
    • Pre-warm instance pools before expected traffic spikes
    • Combine with reactive for best of both worlds
    • Cold start mitigation: keep min instances warm
    🏗️

    Big Data Architecture

    Architectures for processing, storing, and serving data at petabyte scale — from ingestion pipelines to analytical query engines.

    Lambda vs Kappa Architecture
    Architecture

    Lambda Architecture

    Combines a batch layer (accurate, high-latency), speed layer (approximate, low-latency), and a serving layer that merges both views.

    Raw Data Store (immutable)
    Batch Layer
    Spark / Hadoop
    Hours/days latency
    Speed Layer
    Flink / Storm
    Seconds latency
    Serving Layer (merge views)
    Druid / Cassandra
    Con: Two code paths (batch + streaming) are complex to maintain and keep in sync.
    Architecture

    Kappa Architecture

    Simplifies Lambda by using only a streaming layer (Kafka + Flink). Reprocessing is done by replaying the Kafka log with a new consumer group.

    Immutable Event Log (Kafka)
    ↓ stream processing (Flink / Spark Structured Streaming)
    Real-time View
    Historical View
    (reprocessed)
    Pro: One code path, simpler ops. Con: Requires Kafka retention to be long enough for full reprocessing.
    Modern Data Lakehouse Architecture
    🗄️

    Data Lakehouse

    Combines the cheap storage of a data lake with the ACID guarantees and performance of a data warehouse
    Ingestion
    Kafka Streams
    CDC (Debezium)
    Batch ETL
    API Connectors
    Storage (S3 / GCS / ADLS)
    Bronze
    Raw / landing
    Silver
    Cleaned / dedupe
    Gold
    Aggregated / enriched
    Table Format (ACID layer)
    Delta Lake
    Apache Iceberg
    Apache Hudi
    Query Engines
    Spark SQL
    Trino / Presto
    Databricks SQL
    BigQuery Omni

    Delta Lake / Iceberg ACID Guarantees:

    • Serializable ACID transactions on object storage
    • Schema evolution without rewriting all data
    • Time travel: query data as of any past timestamp
    • UPSERT / MERGE / DELETE via snapshot isolation
-- Iceberg time travel
SELECT * FROM orders FOR VERSION AS OF 12345;

SELECT * FROM orders FOR TIMESTAMP AS OF
  TIMESTAMP '2024-01-15 10:00:00';

-- Incremental read (CDC)
SELECT * FROM iceberg.orders
  CHANGES BETWEEN SNAPSHOT 12340 AND 12345;
    Stream Processing Patterns
    Streaming

    Windowing Strategies

    Group stream events into finite windows for aggregation. Choice of window type directly affects accuracy and latency.

| Window Type | Description | Use Case |
|---|---|---|
| Tumbling | Fixed, non-overlapping intervals | Hourly metrics |
| Sliding | Fixed size, rolling every T | Moving averages |
| Session | Gap-based, user activity scoped | User sessions |
| Global | All events since beginning | Running totals |
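Tumbling and sliding window assignment is pure arithmetic on the event timestamp. A sketch using half-open integer windows (function names are illustrative):

```python
def tumbling_window(ts, size):
    """Assign a timestamp to its fixed, non-overlapping window [start, end)."""
    start = (ts // size) * size
    return (start, start + size)

def sliding_windows(ts, size, slide):
    """All windows of length `size`, advancing every `slide`, that
    contain the timestamp — overlapping whenever slide < size."""
    first = ((ts - size) // slide + 1) * slide   # smallest start > ts - size
    return [(s, s + size) for s in range(max(first, 0), ts + 1, slide)]

# An event at t=7 falls in one tumbling 10s window but in two
# overlapping 10s windows that slide every 5s.
```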
    Streaming

    Watermarks & Late Data

    In event-time processing, watermarks tell the engine "we won't see events older than T anymore." They allow emitting window results while handling some late-arriving data.

// Flink watermark strategy
WatermarkStrategy
  .forBoundedOutOfOrderness(
      Duration.ofSeconds(10))        // tolerate 10s out-of-orderness
  .withTimestampAssigner(
      (event, ts) -> event.eventTime);

// On the windowed stream: handle very late events
.allowedLateness(Duration.ofMinutes(1))
.sideOutputLateData(lateTag)
    Streaming

    Exactly-Once Semantics

Guaranteeing that each event is processed exactly once: never lost (which at-most-once delivery permits) and never duplicated (which at-least-once delivery permits).

    • Flink: Distributed snapshots (Chandy-Lamport) + transactional sinks
    • Kafka: Idempotent producers + transactions (atomic write across partitions + offsets)
    • Spark: Structured Streaming write-ahead log + idempotent sinks

    Distributed Systems Design

    Fundamental theory and practical patterns for building correct, performant distributed systems — from CAP theorem trade-offs to service mesh observability.

    CAP Theorem & PACELC
    CAP

    Consistency

    Every read receives the most recent write or an error. All nodes see the same data at the same time. Requires coordination on writes.

    HBase
    Zookeeper
    etcd
    CockroachDB
    CAP

    Availability

    Every request gets a response (not necessarily the latest data). System remains operational even during node failures. No single point of failure.

    Cassandra
    CouchDB
    DynamoDB*
    Riak
    CAP

    Partition Tolerance

    System continues operating despite network partitions. In distributed systems, network partitions are inevitable — so you must choose between C and A when a partition occurs.

    PACELC: extends CAP — even without partitions, there is a trade-off between latency and consistency.
    Consistency Models (Spectrum)
| Model | Guarantee | Latency | Examples |
|---|---|---|---|
| Linearizable | Real-time ordering — strongest. Reads always see the latest write globally. | High (synchronous coordination) | Zookeeper, etcd, Spanner |
| Sequential | Operations appear in some sequential order consistent with program order per process. | Moderate | Single-leader MySQL |
| Causal | Causally related operations are seen in order; concurrent ops can differ across nodes. | Low-Moderate | MongoDB causal sessions, COPS |
| Read-Your-Writes | A process always sees its own writes; other processes may still be stale. | Low | MySQL GTID sticky routing |
| Eventual | If no new updates, all replicas eventually converge. No timing guarantee. | Very Low | Cassandra, DynamoDB (default) |
    Distributed Coordination Primitives
    Coordination

    Distributed Locking

    Mutual exclusion across nodes. Use fencing tokens to prevent the "zombie write" problem where a lock holder that paused for GC believes it still holds the lock.

-- Redis Redlock algorithm
1. Get current timestamp T₀
2. Try SET lock:key random_val NX PX {ttl}
   on N/2+1 nodes
3. If acquired on a majority AND (T_now − T₀) < TTL
   → lock held
4. On use: include fencing token
   (monotonic counter from lock service)
5. Storage checks token is latest
    Coordination

    Leader Election

    One node is designated leader to coordinate work. Raft, ZAB (ZooKeeper), and bully algorithm are common approaches.

    • Epoch numbers prevent split-brain (old leader must respect new epoch)
    • Lease-based leadership: leader holds time-bounded lease from majority
    • Watch mechanism (ZK) for followers to detect leader failure instantly
    Coordination

    Distributed Barrier & Semaphore

    Coordinate a group of processes to a synchronization point. All workers wait at a barrier until all have reached it (e.g., all map tasks complete before reduce).

// ZooKeeper double barrier
enter(barrierNode, processId):
    create(barrierNode/processId)
    while childCount(barrierNode) < N:
        wait(watch=barrierNode)

leave(barrierNode, processId):
    delete(barrierNode/processId)
    while childCount(barrierNode) > 0:
        wait(watch=barrierNode)
    Observability in Distributed Systems
    🔭

    Three Pillars of Observability

    Metrics, Logs, and Distributed Traces — each answers different questions about system behaviour
    📊 Metrics

    What: Numeric time-series (counters, gauges, histograms). Tools: Prometheus + Grafana. Patterns: RED (Rate, Errors, Duration) for services; USE (Utilization, Saturation, Errors) for resources.

# Histogram for p99
http_request_duration_seconds_bucket{le="0.1"}  = 9500
http_request_duration_seconds_bucket{le="1.0"}  = 9950
http_request_duration_seconds_bucket{le="+Inf"} = 10000
    📜 Logs

    What: Structured event records with context. Tools: ELK stack, Loki + Grafana. Patterns: Structured JSON logging, correlation IDs, log levels with sampling.

{
  "level": "error",
  "trace_id": "abc-123",
  "service": "payment",
  "user_id": 42,
  "msg": "charge failed"
}
    🔍 Traces

    What: End-to-end request flows across services via spans with parent-child relationships. Tools: Jaeger, Zipkin, Tempo. Standard: OpenTelemetry.

span = tracer.start(
    "db.query", parent=httpSpan)
span.setTag("db.table", "orders")
// propagated via headers:
// traceparent: 00-abc123-xyz789-01
    Service Mesh & Network Resilience
    Infra

    Service Mesh (Istio / Linkerd)

    Sidecar proxies intercept all inter-service traffic — providing resilience, observability, and security without code changes.

    • mTLS: Automatic mutual TLS between services
    • Traffic management: Retries, timeouts, circuit breaking at proxy layer
    • Canary: Traffic splitting by weight or header
    • Telemetry: Automatic metrics and trace injection
    Infra

    Rate Limiting & Backpressure

    Protect services from overload. Rate limiting at the edge; backpressure signals propagated through the system.

// Token bucket algorithm
class TokenBucket {
  capacity = 1000      // burst size
  rate = 100/s         // refill rate
  tokens = capacity

  allow(cost = 1):
    refill()
    if tokens >= cost:
      tokens -= cost
      return true
    return false       // 429 Too Many Requests
}
    CAP / Consistency Cheatsheet for Common Systems
| System | Type | Default Consistency | Replication | HA Strategy |
|---|---|---|---|---|
| PostgreSQL | RDBMS | Serializable (single node) | Streaming replication (sync/async) | Patroni, pgBouncer, Citus |
| Cassandra | Wide-column | Eventual (tunable) | Leaderless, N=3 typical | Multi-DC, RF≥3 per DC |
| Kafka | Log / Queue | Sequential per partition | ISR (In-Sync Replicas) | min.insync.replicas=2, acks=all |
| Redis | In-memory | Strong (single node) | Async to replicas | Sentinel / Cluster mode |
| Elasticsearch | Search | Near real-time (1s refresh) | Primary + replica shards | Cross-cluster replication (CCR) |
| Apache Flink | Streaming | Exactly-once (checkpoints) | State backend replication | Standby TaskManagers, JobManager HA via ZK |