Cybersecurity News & Analysis • github defconxt • © 2026 • blacktemple.net
Security Architecture Design Patterns — Deep Dive

Principal-level reference for defense-in-depth, zero trust, microservices security, database hardening, browser security policies, protocol-specific controls, rate limiting, circuit breakers, and multi-tenant isolation.

Sources: OWASP Cheat Sheet Series (Microservices, Database, Docker, Kubernetes, SSRF, DoS, CSP, GraphQL, WebSocket, XML, CSS, Race Conditions), Google Cloud Security Foundations Blueprint.


Table of Contents

  1. Defense-in-Depth Patterns
  2. Zero Trust Network Architecture
  3. Microservices Security Architecture
  4. Service Mesh and mTLS
  5. API Gateway Security
  6. Database Security Architecture
  7. Container Security (Docker)
  8. Kubernetes Security Architecture
  9. CSP, CORS, and Same-Origin Policy
  10. WebSocket Security
  11. GraphQL Security Controls
  12. XML and Serialization Security
  13. SSRF Prevention Architecture
  14. Rate Limiting Architecture
  15. Circuit Breaker and Resilience Patterns
  16. Race Condition Defense
  17. Secure Multi-Tenant Design
  18. CSS Security
  19. Architectural Decision Framework

1. Defense-in-Depth Patterns

Defense-in-depth layers independent security controls so that failure of any single layer does not compromise the system. Each layer operates under the assumption that all other layers have already been breached.

The Layered Model

┌─────────────────────────────────────────────┐
│  LAYER 7: DATA        encryption at rest,   │
│                       field-level encryption,│
│                       tokenization, masking  │
├─────────────────────────────────────────────┤
│  LAYER 6: APPLICATION input validation, CSP,│
│                       auth/authz, WAF       │
├─────────────────────────────────────────────┤
│  LAYER 5: RUNTIME     containers, seccomp,  │
│                       AppArmor, sandboxing   │
├─────────────────────────────────────────────┤
│  LAYER 4: HOST        OS hardening, patching,│
│                       EDR, auditd           │
├─────────────────────────────────────────────┤
│  LAYER 3: NETWORK     segmentation, NACLs,  │
│                       IDS/IPS, mTLS         │
├─────────────────────────────────────────────┤
│  LAYER 2: IDENTITY    MFA, SSO, RBAC,       │
│                       least privilege        │
├─────────────────────────────────────────────┤
│  LAYER 1: PHYSICAL    datacenter security,   │
│                       HSMs, secure boot      │
└─────────────────────────────────────────────┘

Three Control Types (Google Cloud Blueprint Model)

  1. Policy Controls — Programmatic constraints that enforce acceptable resource configurations. Prevent risky setups through infrastructure-as-code validation and organization policy constraints before deployment.
  2. Architecture Controls — Resource configuration based on security best practices: network topology, resource hierarchy, blast radius containment.
  3. Detective Controls — Anomaly detection, log aggregation, threat detection services, SIEM integration, custom enforcement.

Principles

  • Assume breach: design every layer as if the attacker already has a foothold in the adjacent layer.
  • Independent failure domains: a control at layer N must not depend on layer N-1 being intact.
  • Validation ordering: perform cheap validations (format, size, type) before expensive ones (database lookups, crypto operations).
  • No security theater: every control must measurably reduce risk. Call out controls that create illusion without substance.
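
The validation-ordering principle can be illustrated with a minimal sketch. The size limit, the username pattern, and the `user_exists` callback are all assumptions made for illustration; the point is only that the database lookup runs after every cheap check has passed.

```python
import re

MAX_BODY_BYTES = 4096                           # assumed limit, for illustration only
USERNAME_RE = re.compile(r"[a-z0-9_]{3,32}")    # assumed format rule


def validate_signup(body: bytes, username, user_exists) -> str:
    """Run checks cheapest-first; return "ok" or the reason for rejection."""
    if len(body) > MAX_BODY_BYTES:              # constant time: size
        return "too_large"
    if not isinstance(username, str):           # constant time: type
        return "bad_type"
    if not USERNAME_RE.fullmatch(username):     # cheap: format
        return "bad_format"
    if user_exists(username):                   # expensive: e.g. a database round trip
        return "taken"
    return "ok"
```

A malformed or oversized request is rejected before `user_exists` is ever called, so an attacker cannot use invalid input to drive load onto the expensive backend.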

2. Zero Trust Network Architecture

Zero trust eliminates implicit trust based on network location. Every request is authenticated, authorized, and encrypted regardless of origin.

Core Tenets

| Principle | Implementation |
|---|---|
| Never trust, always verify | Every service-to-service call carries verifiable identity |
| Least privilege access | RBAC/ABAC with just-in-time elevation, time-bounded tokens |
| Assume breach | Microsegmentation limits blast radius; east-west traffic encrypted |
| Verify explicitly | Context-aware access: identity + device + location + behavior |
| Continuous validation | Session re-validation at intervals; token refresh with short TTL |

Network Architecture Pattern

                        ┌─────────────┐
  User ──── Identity ───┤  Policy     │
  Device    Provider    │  Decision   │
  Context               │  Point      │
                        └──────┬──────┘
                               │ allow/deny
                        ┌──────▼──────┐
                        │  Policy     │
                        │  Enforcement│
                        │  Point      │
                        └──────┬──────┘
                               │ mTLS
                    ┌──────────┼──────────┐
                    ▼          ▼          ▼
               ┌────────┐ ┌────────┐ ┌────────┐
               │Service │ │Service │ │Service │
               │   A    │ │   B    │ │   C    │
               └────────┘ └────────┘ └────────┘

Google Cloud Blueprint Implementation

  • No public internet access by default: no outbound or inbound traffic to/from public internet permitted unless explicitly allowed.
  • Shared VPC: centralized network resource management across regions/zones with environment separation by network topology.
  • Private paths enforced: all on-premises and cloud resource communication over private interconnects.
  • GitOps model: all infrastructure changes through version-controlled, reviewed Terraform with policy-as-code validation in CI/CD pipeline before deployment.

Microsegmentation

  • Segment by workload, not by network subnet. Each workload gets its own identity.
  • Network policies (K8s) or security groups (cloud) restrict east-west traffic to explicit allow rules.
  • Default-deny posture: all traffic blocked unless a policy explicitly permits it.
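
As a rough sketch of the default-deny posture, the check below permits a flow only when it matches an explicit allow rule keyed on workload identity rather than subnet. The workload names, port, and rule set are hypothetical.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    source: str       # workload identity of the caller
    destination: str  # workload identity of the callee
    port: int


# Hypothetical allow rules; any flow not listed here is denied.
ALLOW_RULES = frozenset({
    Rule("frontend", "orders", 8443),
    Rule("orders", "payments", 8443),
})


def is_allowed(source: str, destination: str, port: int) -> bool:
    """Default-deny: permit only flows that match an explicit allow rule."""
    return Rule(source, destination, port) in ALLOW_RULES
```

Note that `frontend` cannot reach `payments` directly even though both are reachable from `orders`; trust is not transitive.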

3. Microservices Security Architecture

Authorization Layers

Authorization enforcement must occur at three independent layers:

  1. Gateway/Proxy — Coarse-grained, cross-cutting decisions (authentication, basic role checks, rate limiting).
  2. Microservice Layer — Shared libraries or sidecar proxies for fine-grained policy enforcement. Centralized policy with embedded Policy Decision Point (PDP) is recommended.
  3. Business Logic — Service-specific authorization that understands domain context.

Authorization Patterns

| Pattern | Description | Trade-offs |
|---|---|---|
| Decentralized | Policy embedded in service code | Independent but inconsistent; requires code changes for policy updates |
| Centralized Single PDP | Remote policy service evaluates all requests | Consistent but introduces latency and availability risk |
| Centralized Embedded PDP | Policy defined centrally, deployed as library/sidecar | Best of both: consistent policy, low latency, no external dependency at runtime |

Netflix pattern: Policy Portal (authoring) -> Repository (storage) -> Aggregator (compilation) -> Distributor (deployment to sidecars).

Identity Propagation

Recommended: Trusted Issuer-Signed Structures

Edge services authenticate external tokens (OAuth2, OIDC), then mint internally-signed identity structures (e.g., Netflix "Passport"). This approach:

  • Decouples external tokens from internal representations
  • Uses single, extensible data structures
  • Never exposes internal structures externally
  • Is external access token agnostic

Anti-pattern: Passing raw external tokens between internal services. This creates tight coupling and risks privilege escalation through token manipulation.
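
The recommended pattern can be sketched as follows. This is not Netflix's Passport format: the field names, the HMAC-SHA256 signature, and the hard-coded key are assumptions chosen to keep the example within the standard library. A production system would more likely use asymmetric signatures and a key loaded from a secrets manager.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import List, Optional

INTERNAL_KEY = b"demo-only-key"  # assumption: real keys come from a secrets manager


def _sign(payload: bytes) -> bytes:
    return hmac.new(INTERNAL_KEY, payload, hashlib.sha256).digest()


def mint_passport(subject: str, roles: List[str], ttl_seconds: int = 60,
                  now: Optional[float] = None) -> str:
    """Called at the edge after the external token has been validated."""
    issued = int(time.time() if now is None else now)
    claims = {"sub": subject, "roles": roles, "exp": issued + ttl_seconds}
    payload = json.dumps(claims, sort_keys=True).encode()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(_sign(payload)).decode())


def verify_passport(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Called by internal services; returns the claims or None if invalid."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(sig_b64)
    except ValueError:          # wrong shape or invalid base64
        return None
    if not hmac.compare_digest(signature, _sign(payload)):
        return None
    claims = json.loads(payload)
    current = time.time() if now is None else now
    return claims if claims["exp"] > current else None
```

Internal services only ever see the short-lived internal structure, so a change of external identity provider or token format stays contained at the edge.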

Security Architecture Documentation

For each microservice, document:

  • Unique service name/ID, business process, API definitions with security schemes
  • Service-to-storage access types (read/write)
  • Synchronous service-to-service calls (protocol, data exchanged)
  • Asynchronous communications (publisher/subscriber via message queues)
  • Data asset classification (PII, confidential, public)
  • Trust boundary justifications

Logging Architecture

Service stdout/stderr ──► Local File ──► Logging Agent ──► Message Broker ──► Central Logging
                              │                │                  │
                              │                │                  ├─ Mutual TLS
                              │                │                  └─ Least-privilege
                              │                │                     access policies
                              │                ├─ Data sanitization
                              │                │  (strip PII, passwords, API keys)
                              │                └─ Asynchronous
                              │                   (prevents DoS of log system)
                              └─ Prevents data loss
                                 on failure

Requirements:

  • Correlation IDs for cross-service call tracing
  • Structured format (JSON) with contextual metadata (hostname, container, class)
  • Sanitization: never send PII, passwords, or API keys to central logging
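
A minimal sketch of these three requirements using Python's standard `logging` module is shown below. The set of sensitive key names and the `fields` / `correlation_id` attribute names are assumptions for illustration; key-name redaction alone will not catch sensitive values embedded in free-text messages.

```python
import json
import logging
import socket
import uuid

# Illustrative list; a real deployment would maintain this centrally.
SENSITIVE_KEYS = {"password", "api_key", "ssn", "email", "authorization"}


def sanitize(fields: dict) -> dict:
    """Redact values whose key names indicate sensitive data."""
    return {k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else v
            for k, v in fields.items()}


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per record, with correlation ID and host metadata."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "hostname": socket.gethostname(),
            "correlation_id": getattr(record, "correlation_id", None),
            "fields": sanitize(getattr(record, "fields", {})),
        }
        return json.dumps(entry)


def new_correlation_id() -> str:
    """Generate an ID at the edge and propagate it on every downstream call."""
    return uuid.uuid4().hex
```

Callers attach context through the standard `extra` argument, for example `logger.info("login", extra={"correlation_id": cid, "fields": {"user": "u1"}})`.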

4. Service Mesh and mTLS

Mutual TLS (mTLS)

Each microservice uses public/private key pairs for bidirectional authentication, providing:

  • Confidentiality: encrypted channel between services
  • Integrity: tamper detection
  • Authentication: cryptographic identity verification

Operational challenges:

  • Key provisioning and trust bootstrap (initial certificate distribution)
  • Certificate revocation (CRL/OCSP infrastructure)
  • Key rotation (automated renewal before expiry)
  • Certificate authority management (dedicated internal CA)

Service Mesh Benefits

| Capability | Security Value |
|---|---|
| Automatic mTLS | Encryption and authentication without application code changes |
| Telemetry/tracing | Generates security-relevant metrics and distributed traces |
| Ingress/egress control | Traffic monitoring and policy enforcement at mesh boundary |
| Fine-grained RBAC | Service-level access control via mesh policies |
| Traffic shaping | Rate limiting, circuit breaking, retries with backoff |

Service Mesh Trade-offs

  • Increases architectural complexity
  • Requires expertise in both K8s and mesh technology (Istio, Linkerd, Consul Connect)
  • Performance impact from sidecar proxy overhead (typically 1-3ms latency per hop)
  • Debugging becomes harder with proxy-mediated traffic

Token-Based Service Authentication (Alternative to mTLS)

| Mode | Use Case | Trade-off |
|---|---|---|
| Online validation | Centralized token service validates each request | Detects revoked tokens immediately; higher latency |
| Offline validation | Services validate using downloaded public keys (JWKS) | Lower latency; cannot detect revoked tokens in real-time |

5. API Gateway Security

Gateway as Security Perimeter

The API gateway centralizes:

  • Authentication (OAuth2/OIDC token validation)
  • Coarse-grained authorization (role/scope checks)
  • Rate limiting and throttling
  • Request/response transformation and validation
  • TLS termination
  • Logging and correlation ID injection

Gateway Limitations

  • Single point of decision: violates defense-in-depth if relied upon exclusively.
  • Scalability constraint: complex ecosystems with numerous roles become difficult to manage at the edge alone.
  • Operational bottleneck: development teams cannot independently modify authorization rules.

Mitigation Pattern

  • Implement mutual authentication to prevent gateway bypass and direct internal service access.
  • Layer authorization at gateway AND service AND business logic levels.
  • Use the gateway for cross-cutting concerns only; push domain-specific authorization to services.

6. Database Security Architecture

Network Isolation

┌──────────────────────────────────┐
│  DMZ / Application Tier          │
│  ┌──────────┐  ┌──────────┐     │
│  │  App A   │  │  App B   │     │
│  └────┬─────┘  └────┬─────┘     │
│       │              │           │
├───────┼──────────────┼───────────┤  ◄── Firewall
│  Database Tier                   │
│  ┌──────────┐  ┌──────────┐     │
│  │  DB A    │  │  DB B    │     │
│  └──────────┘  └──────────┘     │
└──────────────────────────────────┘
  • Disable TCP access where possible; require local socket or named pipe.
  • If TCP needed, bind to localhost or restrict via firewall to specific application hosts only.
  • Database servers in separate network segments from application tier.
  • Web-based management tools (phpMyAdmin, pgAdmin) require authentication, HTTPS, and network-level access controls.

Authentication and Access Control

  • Mandatory authentication for all connections, including local access.
  • Strong, unique passwords per database account.
  • Single-application or service-specific accounts (never shared credentials).
  • Never use default accounts (root, sa, SYS, SYSTEM) for application access.
  • No administrative rights for application accounts.
  • Host-based connection restrictions (connect only from designated app servers).
  • Environment-specific databases and accounts (dev, staging, prod never share credentials).

Least Privilege in Practice

| Permission Level | Pattern |
|---|---|
| Minimal | SELECT, UPDATE, DELETE only (no DDL) |
| Table-level | Grant access to specific tables only |
| Column-level | Restrict sensitive columns (SSN, credit card) |
| Row-level | Row-level security policies filter by tenant/role |
| View-based | Access through restricted views rather than base tables |
| No DB links | Avoid database links unless absolutely necessary |

Credential Management

  • Credentials stored outside web root in configuration files with restricted file permissions.
  • Excluded from source code repositories.
  • Encrypted using platform features (ASP.NET protected configuration, Vault, AWS Secrets Manager).
  • Regular credential rotation; immediate rotation on staff changes.

Transport Security

  • Enforce encrypted connections exclusively (reject plaintext).
  • Deploy trusted certificates on database servers.
  • Require TLSv1.2+ with modern ciphers (AES-GCM, ChaCha20).
  • Client-side certificate validation.

Hardening Checklist

  • Apply security patches promptly.
  • Run database service under low-privileged OS account.
  • Remove default accounts and sample databases.
  • Transaction logs on separate storage from data files.
  • Regular encrypted backups with restricted access permissions.
  • SQL Server: disable xp_cmdshell, xp_dirtree, CLR execution, SQL Browser, Mixed Mode Auth.
  • MySQL/MariaDB: run mysql_secure_installation; disable FILE privilege.

7. Container Security (Docker)

Defense-in-Depth Stack for Containers

Layer 1: Image Security
  ├── Pin specific versions (no floating tags)
  ├── Minimal base images (distroless, scratch)
  ├── CI/CD image scanning (Trivy, Snyk, Docker Scout)
  ├── SBOM generation
  ├── Image signing (Notary/Cosign)
  └── Private registries with access controls

Layer 2: Runtime Isolation
  ├── Non-root user (USER directive, runAsUser)
  ├── no-new-privileges flag
  ├── Drop all capabilities, add only needed
  ├── Never use --privileged
  └── Read-only root filesystem + tmpfs for temp

Layer 3: Kernel Security
  ├── Seccomp profiles (start from Docker default, customize)
  ├── AppArmor/SELinux mandatory access control
  └── Behavioral monitoring (Falco, Tetragon, Cilium eBPF)

Layer 4: Network Security
  ├── Custom Docker networks (explicit connectivity)
  ├── K8s NetworkPolicies for east-west traffic
  └── No exposed daemon sockets

Layer 5: Resource Limits
  ├── Memory limits (-m 512m)
  ├── CPU limits (--cpus="0.5")
  ├── File descriptor limits (--ulimit nofile=1024)
  ├── Process limits (--ulimit nproc=256)
  └── Restart policy (--restart=on-failure:3)

Layer 6: Secrets Management
  ├── Docker Secrets (Swarm) or external vault
  ├── Never bake secrets into images
  └── K8s: enable etcd encryption or use external KMS

Critical Anti-Patterns

  • Never expose /var/run/docker.sock to containers (container escape vector).
  • Never use TCP daemon socket without TLS mutual authentication.
  • Never use --privileged (grants all kernel capabilities).
  • Never use floating image tags in production (supply chain risk).
  • Never store secrets in environment variables in K8s (visible via API, logged in crash dumps).

Rootless Mode

Docker daemon and containers run as unprivileged user. If container escape occurs, attacker lands as unprivileged host user. Different from userns-remap (which remaps UIDs while daemon runs as root).

Alternative: Podman

Daemonless architecture using fork-exec model eliminates central daemon as single point of compromise. Native rootless support and SELinux integration provide OCI-compliant security defaults.


8. Kubernetes Security Architecture

Multi-Layer Security Model

┌─────────────────────────────────────────────┐
│  CLUSTER LEVEL                              │
│  ├── API Server hardening (OIDC, no static  │
│  │   tokens, Node+RBAC authorization)       │
│  ├── etcd encryption + mTLS + isolation     │
│  ├── Admission controllers (PSA, OPA,       │
│  │   Kyverno, ImagePolicyWebhook)           │
│  └── Audit logging (Metadata/Request level) │
├─────────────────────────────────────────────┤
│  NAMESPACE LEVEL                            │
│  ├── RBAC (deny-by-default, minimal verbs)  │
│  ├── Resource quotas (CPU, memory, pods)    │
│  ├── NetworkPolicies (default-deny ingress  │
│  │   and egress per namespace)              │
│  └── Pod Security Standards (restricted)    │
├─────────────────────────────────────────────┤
│  POD LEVEL                                  │
│  ├── SecurityContext (runAsNonRoot,         │
│  │   readOnlyRootFilesystem,               │
│  │   allowPrivilegeEscalation: false)       │
│  ├── Capability dropping (drop ALL)         │
│  ├── Service account with minimal RBAC      │
│  └── Image from signed, scanned registry    │
├─────────────────────────────────────────────┤
│  RUNTIME LEVEL                              │
│  ├── Falco/Tetragon behavioral monitoring   │
│  ├── Container sandboxing (gVisor, Kata)    │
│  └── Continuous vulnerability scanning      │
└─────────────────────────────────────────────┘

Pod Security Standards

| Level | Posture | Use Case |
|---|---|---|
| Privileged | Unrestricted | System workloads (CNI, storage drivers) only |
| Baseline | Prevents known privilege escalations | General workloads |
| Restricted | Maximum hardening | Sensitive workloads, multi-tenant |

Applied via namespace labels, for example pod-security.kubernetes.io/enforce: restricted.

Three modes: enforce (rejects violating pods), audit (records violations in the audit log), warn (returns a user-facing warning).

etcd Security

etcd stores all cluster state and secrets. Write access to etcd = root on entire cluster.

  • mTLS between API servers and etcd (dedicated CA).
  • Firewall isolation: only API servers can reach etcd.
  • etcd ACLs to limit keyspace access per component.
  • Consider separate etcd instances for different components.

API Server Authentication

Recommended: OIDC for short-lived tokens and centralized group management, or managed provider IAM (GKE, EKS, AKS).

Avoid: Static token files (no rotation), X509 client certs (no revocation), service account tokens for user auth (cluster-scoped; legacy Secret-based tokens do not expire).

Container Sandboxing

For untrusted workloads, add isolation beyond Linux namespaces:

| Technology | Mechanism | Overhead |
|---|---|---|
| gVisor | User-space kernel in Go, ~70% syscall coverage, uses ~20 host syscalls | Low-moderate |
| Kata Containers | Stripped-down VM per pod | Moderate |
| Firecracker | Micro-VM with seccomp + cgroup + namespace | Low |

Kubelet Security

Kubelets expose HTTPS endpoints with powerful node/container control:

  • Enable authentication and authorization (disable anonymous access).
  • Restrict API access to trusted networks.
  • Monitor port 10250 (Kubelet API) for unauthorized access attempts.

9. CSP, CORS, and Same-Origin Policy

Same-Origin Policy (SOP)

The browser's foundational security boundary. Two URLs have the same origin if protocol, host, and port all match. SOP prevents scripts from one origin reading responses from another origin.

| URL A | URL B | Same Origin? |
|---|---|---|
| https://a.com/page | https://a.com/other | Yes |
| https://a.com | http://a.com | No (protocol) |
| https://a.com | https://a.com:8443 | No (port) |
| https://a.com | https://b.a.com | No (host) |

Content Security Policy (CSP)

CSP is a defense-in-depth layer against XSS. It does not replace secure coding; it mitigates exploitation when output encoding fails.

Directive Categories

| Category | Directives | Purpose |
|---|---|---|
| Fetch | script-src, style-src, img-src, connect-src, font-src, object-src, default-src | Control resource loading origins |
| Document | base-uri, sandbox, plugin-types (deprecated) | Restrict document properties |
| Navigation | form-action, frame-ancestors | Restrict navigation and framing |
| Reporting | report-to, report-uri | Violation reporting |

Strict CSP (Recommended)

Nonce-based (preferred for server-rendered):

Content-Security-Policy:
  script-src 'nonce-{RANDOM}' 'strict-dynamic';
  object-src 'none';
  base-uri 'none';

Hash-based (for static pages):

Content-Security-Policy:
  script-src 'sha256-{HASH}' 'strict-dynamic';
  object-src 'none';
  base-uri 'none';

Key rules:

  • Generate unique nonce per HTTP response (cryptographically random).
  • Never create middleware that auto-injects nonces into all script tags (attacker-injected scripts would get nonces too).
  • strict-dynamic allows dynamically-created scripts from trusted scripts, reducing annotation burden.
  • object-src 'none' blocks plugin-based XSS vectors (Flash, Java).
  • base-uri 'none' prevents base tag injection for relative URL hijacking.

Deployment Strategy

  1. Deploy in Content-Security-Policy-Report-Only mode first.
  2. Monitor violation reports via report-to endpoint.
  3. Refactor inline scripts to external files or add nonces.
  4. Convert inline event handlers to addEventListener.
  5. Switch to enforcing mode.

Additional CSP Protections

  • frame-ancestors 'none' — prevents clickjacking (supersedes X-Frame-Options).
  • upgrade-insecure-requests — forces HTTPS for mixed content.
  • form-action 'self' — prevents form hijacking to external endpoints.

CORS Security

CORS relaxes SOP in a controlled manner. Misconfigurations create same-origin-equivalent access for attackers.

Critical CORS Rules

  1. Never reflect the Origin header as Access-Control-Allow-Origin — this is equivalent to a wildcard with credentials.
  2. Never use wildcard * with credentials — browsers reject Access-Control-Allow-Origin: * when Access-Control-Allow-Credentials: true.
  3. Validate Origin against a strict allowlist — exact string match, not substring or regex that can be bypassed.
  4. Minimize exposed headers — only expose headers the client genuinely needs.
  5. Set Vary: Origin — prevents cache poisoning when responses differ by origin.

Common CORS Misconfigurations

| Misconfiguration | Risk |
|---|---|
| Reflecting Origin header verbatim | Any site can read authenticated responses |
| Origin: null in allowlist | Sandboxed iframes and data: URIs get access |
| Suffix matching without a leading dot (e.g., endsWith('example.com')) | attacker-example.com bypasses |
| Regex without anchoring | example.com.evil.com bypasses |
| Wildcard with credentials | Browser blocks but indicates design flaw |
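
The suffix-matching failure is easy to reproduce. The short sketch below contrasts a naive suffix check with an exact-match allowlist; the origin values and function names are hypothetical.

```python
TRUSTED_ORIGINS = frozenset({"https://app.example.com"})


def naive_origin_check(origin: str) -> bool:
    """Flawed: any registrable domain that merely ends in the string passes."""
    return origin.endswith("example.com")


def strict_origin_check(origin: str) -> bool:
    """Exact string match against the allowlist."""
    return origin in TRUSTED_ORIGINS
```

With these definitions, `naive_origin_check("https://attacker-example.com")` accepts the attacker's origin, while `strict_origin_check` rejects it along with `"null"`.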

Secure CORS Pattern

ALLOWED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}

def cors_middleware(request, response):
    # The response differs by Origin, so always tell caches to key on it,
    # including when no CORS headers are emitted.
    response.headers["Vary"] = "Origin"
    origin = request.headers.get("Origin")
    # Exact-match allowlist lookup: no substring or regex matching.
    if origin in ALLOWED_ORIGINS:
        response.headers["Access-Control-Allow-Origin"] = origin
        response.headers["Access-Control-Allow-Credentials"] = "true"
        response.headers["Access-Control-Allow-Methods"] = "GET, POST"
        response.headers["Access-Control-Allow-Headers"] = "Content-Type, Authorization"
        response.headers["Access-Control-Max-Age"] = "7200"
    # If origin is not in the allowlist: no CORS headers, so the browser blocks the read.
    return response

10. WebSocket Security

Transport Security

  • Always use wss:// in production. Never use unencrypted ws://.
  • Support only RFC 6455. Disable legacy protocol versions (Hixie-76, hybi-00) with known vulnerabilities.
  • Disable permessage-deflate by default to prevent CRIME/BREACH-style compression side-channel attacks.

Cross-Site WebSocket Hijacking (CSWSH) Prevention

Browsers automatically include session cookies in WebSocket handshakes, enabling attackers on malicious sites to hijack authenticated connections.

Defenses:

  1. Validate Origin header on every handshake against explicit allowlist (never blacklist, never wildcard).
  2. Apply SameSite=Lax or SameSite=Strict cookies.
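  3. Use token-based authentication (query string or post-connection message) instead of relying solely on cookies. Prefer the post-connection message where possible, since query strings are commonly recorded in server and proxy logs.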
  4. Rotate tokens in long-lived connections to prevent hijacked session persistence.

Message-Level Security

  • Treat all WebSocket messages as untrusted input.
  • JSON schema validation with allowlists for message types/fields.
  • Binary file type verification via magic numbers (not headers).
  • Message size limits (typically 64KB maximum).
  • Nonce/timestamp inclusion to prevent replay attacks.
  • Use JSON.parse(), never eval().
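
The size limit, safe parsing, and allowlist checks above can be combined into a single gate that every inbound frame passes through. This is a minimal server-side sketch; the message types and field names are hypothetical, and a real service would typically add per-field type and length checks on top.

```python
import json
from typing import Optional

MAX_MESSAGE_BYTES = 64 * 1024

# Hypothetical allowlist: message type -> fields permitted for that type.
ALLOWED_FIELDS = {
    "chat": {"type", "room", "text"},
    "ping": {"type"},
}


def parse_message(raw: bytes) -> Optional[dict]:
    """Return the parsed message, or None if it must be rejected."""
    if len(raw) > MAX_MESSAGE_BYTES:            # cheapest check first
        return None
    try:
        msg = json.loads(raw)                   # a parser, never eval()
    except ValueError:                          # invalid JSON or invalid UTF-8
        return None
    if not isinstance(msg, dict):
        return None
    msg_type = msg.get("type")
    if not isinstance(msg_type, str) or msg_type not in ALLOWED_FIELDS:
        return None
    if not set(msg) <= ALLOWED_FIELDS[msg_type]:  # unexpected fields present
        return None
    return msg
```

Rejecting unknown fields outright, rather than silently dropping them, keeps attempts at mass assignment visible in logs.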

Per-Action Authorization

Connection establishment does not grant blanket access. Validate user roles and permissions before processing each message/action independently.

DoS Mitigation

| Control | Recommended Baseline |
|---|---|
| Per-user connection limit | 5-10 concurrent connections |
| Message rate limit | 100 messages/minute |
| Max payload size | 64KB (configurable per use case) |
| Idle timeout | Close inactive connections |
| Backpressure | Flow control preventing unbounded buffering |
| Heartbeat | Ping/pong frames detecting and cleaning dead connections |

Logging

Capture: connection/termination events with user identity and origin, auth outcomes, authz failures, protocol violations.

Exclude: tokens, session IDs, message payloads containing sensitive data.


11. GraphQL Security Controls

GraphQL's flexibility creates unique attack surface compared to REST.

Query Abuse Prevention

| Control | Purpose | Tools |
|---|---|---|
| Depth limiting | Prevent deeply nested queries causing recursive resolution | graphql-depth-limit (JS), MaxQueryDepthInstrumentation (Java) |
| Complexity analysis | Assign cost to field resolution, reject expensive queries | graphql-cost-analysis (JS), Apollo complexity plugins |
| Timeout per resolver | Prevent individual resolvers from hanging | 10-second default |
| Pagination enforcement | Prevent unbounded list queries | Require first/last arguments |
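
To make depth limiting concrete, here is a deliberately simplified sketch that approximates nesting depth by counting braces outside string literals. It is not how the libraries named above work: they inspect the parsed query AST, which also accounts for fragments. This version treats input-object literals as nesting and does not expand fragments, so it should be read as an illustration of the idea only.

```python
def selection_depth(query: str) -> int:
    """Approximate the maximum brace nesting depth, ignoring string contents."""
    depth = max_depth = 0
    in_string = False
    escaped = False
    for ch in query:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth


def is_within_depth(query: str, limit: int) -> bool:
    """Reject queries nested more deeply than the configured limit."""
    return selection_depth(query) <= limit
```

The check runs before any resolver executes, so an over-deep query costs one linear pass over the request text rather than a cascade of data fetches.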

Schema Exposure Controls

  • Disable introspection in production to prevent schema reconnaissance.
  • Disable "Did you mean?" suggestions (leaks field names even with introspection off).
  • Field visibility middleware for role-based schema exposure (different roles see different schema subsets).

Authorization Architecture

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   Gateway    │────▶│   Resolver   │────▶│  Data Layer  │
│  Auth Check  │     │  RBAC Check  │     │  Row-Level   │
│  (identity)  │     │  (field-lvl) │     │  Security    │
└──────────────┘     └──────────────┘     └──────────────┘
  • Validate authorization on both graph edges AND nodes.
  • Implement checks within Query/Mutation resolvers using RBAC middleware.
  • Prevent IDOR by verifying caller permissions before data access (especially for direct ID-based lookups).
  • Use GraphQL Interfaces and Unions to return different object shapes based on requester privileges.

Batching Attack Mitigation

GraphQL allows multiple queries in a single HTTP request, bypassing per-request rate limits:

  1. Object-level rate limiting: track per-caller object requests across batches.
  2. Sensitive field protection: prevent batching for usernames, emails, OTPs, session tokens.
  3. Operation throttling: limit concurrent queries per request (e.g., max 5 operations per batch).

Persisted Queries

Pre-approve query strings at deployment time. Clients send query hash instead of arbitrary query text. Eliminates arbitrary query execution, batching abuse, and query injection risks.
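
A minimal sketch of the lookup side, assuming SHA-256 hex digests as query identifiers and an in-memory registry built at deployment time. The approved query text and the function names are hypothetical.

```python
import hashlib
from typing import Optional


def query_id(query: str) -> str:
    """Identifier the client sends in place of the query text."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


# Hypothetical set of queries approved at deployment time.
APPROVED_QUERIES = [
    "query Me { me { id name } }",
    "query Orders($first: Int!) { orders(first: $first) { id total } }",
]
REGISTRY = {query_id(q): q for q in APPROVED_QUERIES}


def resolve_persisted(requested_id: str) -> Optional[str]:
    """Return the approved query text, or None if the ID is unknown."""
    return REGISTRY.get(requested_id)
```

A request carrying an unknown identifier is rejected before parsing, so introspection queries and attacker-crafted documents never reach the executor.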

Input Validation

  • Enforce allowlisting via GraphQL scalars, enums, and custom validators.
  • Define input schemas for all mutations.
  • Use parameterized queries/ORMs in resolvers (never string concatenation).
  • Disable dynamic resolver targeting to prevent SSRF/command injection.

12. XML and Serialization Security

XXE Prevention

XXE (XML External Entity) attacks exploit parser features to read files, perform SSRF, or cause DoS.

Universal defense: disable external entity processing entirely in parser configuration.

# Python (defusedxml)
import defusedxml.ElementTree as ET
tree = ET.parse(source)  # XXE-safe by default

// Java (DocumentBuilderFactory)
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);

XML Bomb Prevention

| Attack | Mechanism | Defense |
|---|---|---|
| Billion Laughs | Exponential entity nesting (recursive references) | Entity expansion limits, depth restrictions |
| Quadratic Blowup | Large entity referenced repeatedly (O(n^2) expansion) | Entity size limits |
| Recursive References | Circular entity definitions | Recursion depth limits |

Schema Hardening

  • Use XML Schema (XSD), not DTD, for validation.
  • Set maxOccurs boundaries (never unbounded without testing).
  • Use precise types: positiveInteger not integer, decimal not float/double (prevents Infinity/NaN).
  • Apply maxLength, minLength, pattern restrictions on strings.
  • Enumerate allowed values where possible.

Schema Poisoning Defense

  • Embed schemas with integrity verification (don't fetch remotely at runtime).
  • Restrict file permissions on local schema/DTD files.
  • If remote schemas needed, use HTTPS only, maintain local copies, verify integrity.

General Serialization Security

  • Reject DTDs entirely (SOAP specification forbids them).
  • Validate document well-formedness before processing.
  • Set resource limits: document size, element count, nesting depth.
  • Avoid disclosing internal paths in error messages.

13. SSRF Prevention Architecture

Defense-in-Depth: Application + Network Layers

┌─────────────────────────────────────┐
│  APPLICATION LAYER                  │
│  ├── Input validation (reject URLs) │
│  ├── IP/domain allowlisting         │
│  ├── DNS rebinding prevention       │
│  └── URL scheme restriction         │
├─────────────────────────────────────┤
│  NETWORK LAYER                      │
│  ├── Firewall egress filtering      │
│  ├── Network segmentation           │
│  └── Cloud metadata protection      │
└─────────────────────────────────────┘

Application Layer Controls

Rule 1: Never accept complete URLs from users. URLs are difficult to validate and parsers can be abused. Accept only validated IP addresses or domain names.

IP validation:

  • Validate format using language-specific libraries (not regex).
  • Cross-reference against allowlist of trusted IPs (both IPv4 and IPv6).
  • Use validated library output as comparison baseline to prevent encoding bypasses.

Domain validation:

  • Validate format without performing DNS resolution.
  • Maintain allowlist of trusted domains.
  • Monitor DNS records to detect resolution to non-public IP ranges.

Deny-list minimums (when allowlisting not possible):

  • AWS IMDS: 169.254.169.254, fd00:ec2::254
  • Localhost: 127.0.0.0/8, ::1/128
  • RFC1918: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
  • Link-local: 169.254.0.0/16
  • Multicast: 224.0.0.0/4
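
A minimal sketch of the deny-list check, using Python's `ipaddress` module as the "language-specific library". The names `DENIED_NETWORKS` and `is_denied` are assumptions for illustration; the ranges are the ones listed above. Unparseable input (hex, octal, or shortened forms used for encoding bypasses) is treated as denied.

```python
import ipaddress

# Ranges taken from the deny-list minimums above.
DENIED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "169.254.0.0/16",      # link-local, includes AWS IMDS 169.254.169.254
    "fd00:ec2::254/128",   # AWS IMDS over IPv6
    "127.0.0.0/8", "::1/128",                         # localhost
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC1918
    "224.0.0.0/4",                                    # multicast
)]


def is_denied(address: str) -> bool:
    """Return True if the address is malformed or falls in a denied range."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return True  # fail closed on anything the library cannot parse
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so it cannot dodge IPv4 rules.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return any(ip in net for net in DENIED_NETWORKS)
```

Comparing the parsed address object rather than the raw string is what the "validated library output as comparison baseline" rule refers to.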

Network Layer Controls

  • Restrict outbound application access via host-based or network firewalls to only legitimate routes.
  • Network compartmentalization to block illegitimate calls at infrastructure level.
  • Disable HTTP redirect following to prevent validation bypass.

Cloud Metadata Protection

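Migrate from IMDSv1 to IMDSv2 (AWS) as defense-in-depth. IMDSv2 requires a session token that is obtained via a PUT request and then sent in a request header on every metadata call. Most SSRF primitives can only issue GET requests without custom headers, so they cannot easily obtain or present the token.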

DNS Rebinding Prevention

  • Resolve domains against internal DNS resolvers only.
  • Retrieve all A and AAAA records and check every returned IP; reject the domain if any of them falls in a private or otherwise non-public range.
  • Monitor allowlisted domains for resolution changes to non-public addresses.
  • Pin DNS resolution results (use the resolved IP, not the hostname, for the actual request).
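
The resolve, validate, and pin steps can be sketched as follows. This is an illustration under assumptions: `resolve_and_pin` and `system_resolver` are hypothetical names, the resolver is injectable so the logic can be exercised without network access, and `is_global` from the `ipaddress` module stands in for a full range policy.

```python
import ipaddress
import socket
from typing import Callable, List


def system_resolver(host: str) -> List[str]:
    """Return every A/AAAA address the system resolver reports for host."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def resolve_and_pin(host: str,
                    resolver: Callable[[str], List[str]] = system_resolver) -> str:
    """Resolve once, validate every record, and return the IP to connect to."""
    addresses = resolver(host)
    if not addresses:
        raise ValueError(f"{host}: no addresses returned")
    for addr in addresses:
        # One non-public record is enough to reject the whole answer.
        if not ipaddress.ip_address(addr).is_global:
            raise ValueError(f"{host} resolves to non-public address {addr}")
    # The caller connects to this IP (sending the hostname only in the Host
    # header or SNI), so a second lookup cannot return a different answer.
    return addresses[0]
```

Because the request is made to the returned IP rather than the hostname, an attacker who flips the DNS record between validation and use gains nothing.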

14. Rate Limiting Architecture

Multi-Layer Rate Limiting

┌──────────────────────────────────────────┐
│  EDGE (CDN/WAF/Load Balancer)            │
│  ├── Per-IP rate limits                  │
│  ├── Geographic filtering                │
│  ├── Volumetric DDoS mitigation          │
│  └── Connection rate limits              │
├──────────────────────────────────────────┤
│  API GATEWAY                             │
│  ├── Per-user/API-key rate limits        │
│  ├── Per-endpoint rate limits            │
│  ├── Request size limits                 │
│  └── Concurrent connection limits        │
├──────────────────────────────────────────┤
│  APPLICATION                             │
│  ├── Per-operation rate limits           │
│  ├── Resource-specific throttling        │
│  ├── Business logic rate limits          │
│  └── Object-level rate limits (GraphQL)  │
├──────────────────────────────────────────┤
│  DATABASE                                │
│  ├── Connection pool limits              │
│  ├── Query timeout limits                │
│  └── Transaction timeout limits          │
└──────────────────────────────────────────┘

Rate Limiting Algorithms

| Algorithm | Behavior | Use Case |
|---|---|---|
| Token Bucket | Steady refill rate, burst allowed up to bucket size | API rate limiting (most common) |
| Leaky Bucket | Fixed drain rate, excess dropped | Smoothing bursty traffic |
| Fixed Window | Counter resets at interval boundary | Simple per-minute/hour limits |
| Sliding Window Log | Precise per-request timestamp tracking | High-accuracy rate limiting |
| Sliding Window Counter | Weighted average of current and previous window | Balance of accuracy and performance |
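
As an illustration of the most common entry in the table, here is a minimal single-process token bucket. The class name, parameters, and injectable clock are assumptions for the sketch; a shared deployment would keep the state in a central store instead of instance attributes.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens/second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so behavior can be tested
        self.tokens = float(capacity)  # start full: an initial burst is allowed
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; return whether the request passes."""
        now = self.clock()
        # Refill for the elapsed time, never beyond the bucket size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A gateway would typically hold one bucket per API key or user and return 429 when `allow()` is false.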

Slow HTTP Attack Defense

  • Define minimum ingress data rate limit; drop connections below that rate (counters Slowloris, Slow POST).
  • Absolute connection timeouts (not just idle timeouts).
  • Maximum request header size and body size limits.
  • Total concurrent connection limits per client IP.

DoS Resilience Patterns

  • Validation ordering: perform cheap checks (format, size) before expensive ones (database, crypto).
  • Authentication gating: require authentication before allowing access to resource-intensive operations.
  • Graceful degradation: maintain reduced functionality rather than complete failure.
  • Static resource separation: host images, scripts, CSS on separate domains/CDN.
  • Caching: serve cached responses for repeat requests.
  • Asynchronous processing: use queues for CPU-intensive operations; return 202 Accepted.
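
Validation ordering and asynchronous hand-off can be sketched together. Everything named here (`handle_job_request`, `SECRET`, `MAX_BODY_BYTES`, the `job` field) is hypothetical; the point is only the order of the checks, from cheapest to most expensive.

```python
import hashlib
import hmac
import json

MAX_BODY_BYTES = 4096    # assumed limit for the example
SECRET = b"demo-secret"  # hypothetical shared key; load from a secret store


def handle_job_request(body: bytes, signature_hex: str):
    """Return (status, reason), running cheap checks before expensive ones."""
    # 1. Size: constant-time comparison on a length, no parsing needed.
    if len(body) > MAX_BODY_BYTES:
        return 413, "payload too large"
    # 2. Format: bounded parse of an already size-limited body.
    try:
        payload = json.loads(body)
    except ValueError:
        return 400, "malformed JSON"
    if not isinstance(payload, dict) or "job" not in payload:
        return 400, "missing job field"
    # 3. Crypto: only now pay for the HMAC computation.
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode(), signature_hex.encode()):
        return 401, "bad signature"
    # 4. The CPU-intensive work would be queued here, not run inline.
    return 202, "accepted"
```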

15. Circuit Breaker and Resilience Patterns

Circuit Breaker Pattern

Prevents cascading failures when downstream services degrade.

            ┌──────────────┐
  Request ──┤   CLOSED     │──── Forward to service
            │  (normal)    │
            └──────┬───────┘
                   │ failure threshold exceeded
            ┌──────▼───────┐
            │    OPEN      │──── Return fallback/error immediately
            │  (tripped)   │     (no request forwarded)
            └──────┬───────┘
                   │ timeout expires
            ┌──────▼───────┐
            │  HALF-OPEN   │──── Forward limited probe requests
            │  (testing)   │     Success → CLOSED
            └──────────────┘     Failure → OPEN

Configuration Parameters

| Parameter | Description | Typical Value |
|---|---|---|
| Failure threshold | Errors before opening | 5-10 failures in 60s |
| Timeout | Time in OPEN before probing | 30-60 seconds |
| Success threshold | Successes in HALF-OPEN to close | 3-5 consecutive |
| Monitoring window | Rolling window for failure counting | 60 seconds |
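
The state machine and its four parameters can be sketched in a few lines. This is a simplified, single-threaded illustration: the class and attribute names are assumptions, the circuit trips when the failure count reaches the threshold, and HALF-OPEN forwards every call rather than a limited number of probes.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised instead of calling the service while the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, window=60.0, open_timeout=30.0,
                 success_threshold=3, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.window = window
        self.open_timeout = open_timeout
        self.success_threshold = success_threshold
        self.clock = clock
        self.state = "CLOSED"
        self.failures = []   # timestamps of recent failures
        self.successes = 0   # consecutive successes while HALF_OPEN
        self.opened_at = 0.0

    def call(self, func, *args, **kwargs):
        now = self.clock()
        if self.state == "OPEN":
            if now - self.opened_at < self.open_timeout:
                raise CircuitOpenError("circuit open, request not forwarded")
            self.state, self.successes = "HALF_OPEN", 0  # timeout expired: probe
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure(now)
            raise
        self._on_success()
        return result

    def _on_failure(self, now):
        if self.state == "HALF_OPEN":
            self._trip(now)  # a failed probe reopens immediately
            return
        # Keep only failures inside the rolling monitoring window.
        self.failures = [t for t in self.failures if now - t < self.window] + [now]
        if len(self.failures) >= self.failure_threshold:
            self._trip(now)

    def _on_success(self):
        if self.state == "HALF_OPEN":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state, self.failures = "CLOSED", []

    def _trip(self, now):
        self.state, self.opened_at, self.failures = "OPEN", now, []
```

Callers catch `CircuitOpenError` and return the fallback response instead of waiting on a dependency that is known to be failing.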

Bulkhead Pattern

Isolate failures to prevent resource exhaustion across the entire system:

  • Separate thread pools per downstream dependency.
  • Separate connection pools per service.
  • Resource quotas per tenant/customer.
  • Namespace isolation in K8s (CPU/memory quotas per namespace).

Retry with Backoff

Attempt 1: immediate
Attempt 2: wait 1s + jitter
Attempt 3: wait 2s + jitter
Attempt 4: wait 4s + jitter
(cap at max backoff, e.g., 30s)

  • Always add random jitter to prevent thundering herd.
  • Set maximum retry count (3-5 typically).
  • Only retry transient failures (5xx, timeouts, and 429 after honoring Retry-After); do not retry other 4xx responses, and retry non-idempotent operations only when they carry an idempotency key.
  • Combine with circuit breaker: when circuit opens, stop retrying.
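
The schedule above can be sketched as a small helper. The names `TransientError`, `backoff_delays`, and `retry` are assumptions for the example, as is the 0.5s jitter range; the random source and sleep function are injectable so the schedule can be checked deterministically.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for a retryable failure such as a 5xx response or a timeout."""


def backoff_delays(max_attempts=4, base=1.0, cap=30.0, jitter=0.5,
                   rng=random.random):
    """Yield the wait before each retry: 1s, 2s, 4s ... plus jitter, capped."""
    for retry_index in range(max_attempts - 1):
        yield min(cap, base * 2 ** retry_index + rng() * jitter)


def retry(func, max_attempts=4, base=1.0, cap=30.0, jitter=0.5,
          rng=random.random, sleep=time.sleep):
    """Call func, retrying only on TransientError, up to max_attempts calls."""
    delays = backoff_delays(max_attempts, base, cap, jitter, rng)
    for attempt in range(1, max_attempts + 1):
        try:
            return func()  # attempt 1 runs immediately
        except TransientError:
            if attempt == max_attempts:
                raise      # retries exhausted: surface the failure
            sleep(next(delays))
```

Any exception other than `TransientError` propagates on the first attempt, which is how non-retryable client errors should be modeled.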

Timeout Hierarchy

Client timeout > Gateway timeout > Service timeout > DB timeout
     30s              15s              10s             5s

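Each layer's timeout must be shorter than its caller's. Otherwise the caller gives up while the callee keeps working on a request nobody is waiting for, leaving zombie connections that hold threads and pool slots.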
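
A configuration check for this rule is easy to automate. The sketch below assumes the timeouts are available as an ordered list of (layer, seconds) pairs; `validate_timeout_chain` is a hypothetical helper, not part of any framework.

```python
def validate_timeout_chain(chain):
    """Return a list of violations for (layer, seconds) pairs, outermost first."""
    problems = []
    for (outer, outer_s), (inner, inner_s) in zip(chain, chain[1:]):
        # Each callee must time out strictly before its caller does.
        if inner_s >= outer_s:
            problems.append(
                f"{inner} timeout ({inner_s}s) must be shorter than "
                f"{outer} timeout ({outer_s}s)"
            )
    return problems
```

Running such a check in CI against deployed configuration catches the common drift where one service's timeout is raised without adjusting its callers.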


16. Race Condition Defense

TOCTOU (Time-of-Check-to-Time-of-Use)

The classic pattern: a resource is checked for a condition, then used based on that check, but the resource changes between check and use.

Thread A: check(balance >= 100)     → true
Thread B: check(balance >= 100)     → true
Thread A: debit(100)                → balance = 0
Thread B: debit(100)                → balance = -100  ← RACE

Defense Patterns

| Pattern | Mechanism | Use Case |
|---|---|---|
| Pessimistic locking | SELECT ... FOR UPDATE acquires a row lock when the row is read | Financial transactions, inventory |
| Optimistic locking | Version column; UPDATE ... WHERE version = N matches no rows if the row was modified concurrently | Low-contention scenarios |
| Atomic operations | UPDATE ... SET balance = balance - 100 WHERE balance >= 100 (check + modify in single statement) | Simple counter/balance operations |
| Database constraints | CHECK (balance >= 0) enforced at DB level | Invariant enforcement |
| Idempotency keys | Client-generated unique key per operation; server rejects duplicates | Payment processing, API mutations |
| Serializable isolation | SET TRANSACTION ISOLATION LEVEL SERIALIZABLE | Highest consistency requirement |
| Mutex/advisory locks | pg_advisory_lock(key) or application-level mutex | Cross-table consistency |
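
The atomic-operation and database-constraint rows can be demonstrated with the standard-library `sqlite3` module. The `accounts` table, its columns, and the helper names are assumptions for the sketch; the same statement shape applies to other SQL databases.

```python
import sqlite3


def open_ledger() -> sqlite3.Connection:
    """Create an in-memory ledger with one account holding a balance of 100."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"  # DB-level invariant
    )
    conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
    conn.commit()
    return conn


def debit(conn: sqlite3.Connection, account_id: int, amount: int) -> bool:
    """Debit atomically; return False when funds are insufficient."""
    # The balance check and the modification happen in one statement, so no
    # other writer can slip in between "check" and "use".
    cur = conn.execute(
        "UPDATE accounts SET balance = balance - ? WHERE id = ? AND balance >= ?",
        (amount, account_id, amount),
    )
    conn.commit()
    return cur.rowcount == 1
```

With this shape the second of two competing 100-unit debits simply updates zero rows, and the CHECK constraint remains as a backstop for any code path that forgets the WHERE clause.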

Idempotency Pattern

Client: POST /payments {idempotency_key: "abc-123", amount: 100}

Server:
  1. Check idempotency_key in store
  2. If exists: return cached response (no re-execution)
  3. If not: execute, store result keyed by idempotency_key, return

  • Keys should expire after a reasonable window (24-48 hours).
  • Store the request hash alongside the response so that reuse of a key with different parameters is detected and rejected.
  • Use a database unique constraint on the idempotency key for atomicity.
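
The server-side steps can be sketched with an in-memory store. `PaymentService` and `IdempotencyConflict` are hypothetical names, and the dictionary is a simplification: it is not atomic across processes, so a real service would rely on the database unique constraint mentioned above and would also expire old keys.

```python
import hashlib
import json


class IdempotencyConflict(Exception):
    """Same key reused with a different request body."""


class PaymentService:
    def __init__(self):
        self._store = {}     # idempotency_key -> (request_hash, response)
        self.executions = 0  # counts real executions, for illustration

    def post_payment(self, idempotency_key: str, amount: int) -> dict:
        request = json.dumps({"amount": amount}, sort_keys=True).encode()
        request_hash = hashlib.sha256(request).hexdigest()

        cached = self._store.get(idempotency_key)
        if cached is not None:
            stored_hash, response = cached
            if stored_hash != request_hash:
                raise IdempotencyConflict("key reused with different parameters")
            return response  # replay: no re-execution

        self.executions += 1  # the payment would be executed here
        response = {"status": "charged", "amount": amount}
        self._store[idempotency_key] = (request_hash, response)
        return response
```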

Distributed Systems Considerations

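  • Redis SET key value NX PX <ttl> for distributed locks with TTL; it sets the key and its expiry atomically, unlike the older SETNX (SET if Not eXists) followed by a separate EXPIRE.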
  • Redlock algorithm for fault-tolerant distributed locking across multiple Redis instances.
  • Database-level locking preferred over application-level when possible (closer to the data, harder to bypass).
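  • Event sourcing: an append-only log removes in-place update races; concurrent appends to the same stream still need an ordering guard such as an expected-version check.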

17. Secure Multi-Tenant Design

Isolation Models

Strongest ──────────────────────────────────── Weakest
   │                                              │
   ▼                                              ▼
Separate        Separate      Shared Infra,     Shared
Infrastructure  Namespaces/   Separate DB/      Everything
(per tenant)    VPCs          Schema            (row-level)

| Model | Isolation | Cost | Complexity | Use Case |
|---|---|---|---|---|
| Separate infrastructure | Highest | Highest | Moderate | Regulated industries, government |
| Separate namespaces/VPCs | High | High | High | Enterprise SaaS |
| Shared infra, separate DB/schema | Medium | Medium | Medium | Standard SaaS |
| Shared everything (row-level) | Lowest | Lowest | Low | Consumer apps, cost-sensitive |

Kubernetes Multi-Tenant Pattern

┌─────────────────────────────────────────┐
│  Cluster                                │
│  ┌─────────────────────────────────────┐│
│  │  Namespace: tenant-a               ││
│  │  ├── ResourceQuota (4 pods, 2 CPU) ││
│  │  ├── NetworkPolicy (deny all       ││
│  │  │   cross-namespace)              ││
│  │  ├── Pod Security: restricted      ││
│  │  └── RBAC: tenant-a-role           ││
│  └─────────────────────────────────────┘│
│  ┌─────────────────────────────────────┐│
│  │  Namespace: tenant-b               ││
│  │  ├── ResourceQuota (4 pods, 2 CPU) ││
│  │  ├── NetworkPolicy (deny all       ││
│  │  │   cross-namespace)              ││
│  │  ├── Pod Security: restricted      ││
│  │  └── RBAC: tenant-b-role           ││
│  └─────────────────────────────────────┘│
└─────────────────────────────────────────┘

Cross-Cutting Multi-Tenant Controls

| Layer | Control | Purpose |
|---|---|---|
| Identity | Tenant context in JWT claims | Every request carries tenant identity |
| API Gateway | Tenant-aware rate limiting | Per-tenant quotas prevent noisy neighbor |
| Application | Tenant filter on all queries | Prevent cross-tenant data access |
| Database | Row-level security policies | DB-enforced tenant isolation |
| Storage | Tenant-prefixed object keys + IAM | Prevent cross-tenant storage access |
| Encryption | Per-tenant encryption keys | Cryptographic isolation of data |
| Logging | Tenant ID in all log entries | Tenant-scoped audit trails |
| Network | Namespace/VPC isolation | Network-level blast radius containment |

Noisy Neighbor Prevention

  • Per-tenant resource quotas (CPU, memory, storage, API calls).
  • Per-tenant connection pool limits to shared databases.
  • Per-tenant queue depth limits for async processing.
  • Circuit breakers per tenant: if one tenant causes excessive errors, isolate them without affecting others.
  • Fair scheduling: weighted round-robin or priority queues preventing any single tenant from monopolizing shared resources.
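
Queue depth limits and fair scheduling can be combined in one structure. The sketch below uses plain (unweighted) round-robin as a simplification of the weighted variant mentioned above; `FairQueue` and its methods are hypothetical names.

```python
from collections import OrderedDict, deque


class FairQueue:
    """Per-tenant queues with a depth limit, drained round-robin."""

    def __init__(self, max_depth_per_tenant: int = 100):
        self.max_depth = max_depth_per_tenant
        self._queues = OrderedDict()  # tenant -> deque of jobs

    def put(self, tenant: str, job) -> bool:
        """Enqueue a job; return False when the tenant's queue is full."""
        queue = self._queues.setdefault(tenant, deque())
        if len(queue) >= self.max_depth:
            return False  # only the noisy tenant is pushed back
        queue.append(job)
        return True

    def get(self):
        """Return (tenant, job) from the next tenant with work, or None."""
        for tenant in list(self._queues):
            queue = self._queues[tenant]
            if queue:
                # Served tenants go to the back of the rotation.
                self._queues.move_to_end(tenant)
                return tenant, queue.popleft()
        return None
```

A tenant that floods the queue fills only its own slot in the rotation, so other tenants' jobs are still picked up on the next pass.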

Data Isolation Verification

  • Automated tests that attempt cross-tenant data access (should fail).
  • SQL query audit: every query touching tenant data must include tenant filter (static analysis or query interceptor).
  • Penetration testing specifically targeting tenant boundary bypass (IDOR, parameter tampering, JWT manipulation).
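
An automated cross-tenant test can be as small as the following sketch. The `invoices` table, `TenantRepository`, and `seed_database` are assumptions for illustration; the repository binds the tenant filter once so individual queries cannot omit it.

```python
import sqlite3


def seed_database() -> sqlite3.Connection:
    """Create an in-memory table holding one invoice for each of two tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices (id INTEGER PRIMARY KEY, tenant_id TEXT, total INTEGER)"
    )
    conn.executemany(
        "INSERT INTO invoices (id, tenant_id, total) VALUES (?, ?, ?)",
        [(1, "tenant-a", 100), (2, "tenant-b", 250)],
    )
    conn.commit()
    return conn


class TenantRepository:
    """Data access object bound to exactly one tenant."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str):
        self._conn = conn
        self._tenant_id = tenant_id

    def get_invoice(self, invoice_id: int):
        # The tenant filter is part of every query issued by this class.
        return self._conn.execute(
            "SELECT id, tenant_id, total FROM invoices"
            " WHERE id = ? AND tenant_id = ?",
            (invoice_id, self._tenant_id),
        ).fetchone()
```

The isolation test then requests another tenant's invoice ID through a repository bound to `tenant-a` and asserts that nothing comes back.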

18. CSS Security

Attack Surface

  • Reconnaissance via CSS selectors: descriptive class names (.addUser, .deleteUser, .adminPanel) reveal application features to unauthenticated attackers examining global CSS files.
  • CSS injection: if attacker-controlled content enters stylesheets, it can enable data exfiltration via background-image URLs, clickjacking via element repositioning, and UI redressing.
  • Third-party stylesheet risk: externally hosted CSS can be modified to inject malicious styles.

Defenses

  1. Role-based CSS isolation: segregate stylesheets by access level. Server-side access controls on CSS file delivery. Log suspicious CSS file access.
  2. CSS obfuscation: replace descriptive selectors with generated names using CSS Modules, JSS (minify option), or build-time obfuscation. Use framework classes (Bootstrap, Tailwind) to reduce custom selectors.
  3. CSP for styles: style-src 'self' or style-src 'nonce-{RANDOM}' to prevent inline style injection and restrict stylesheet sources.
  4. Subresource Integrity (SRI): <link rel="stylesheet" href="..." integrity="sha256-..." crossorigin="anonymous"> for third-party stylesheets.
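
Defenses 3 and 4 both need a generated value: a per-response nonce and an integrity hash. A minimal sketch follows; `csp_style_policy` and `sri_hash` are hypothetical helper names, and the choice of 16 random bytes and SHA-384 as defaults is an assumption.

```python
import base64
import hashlib
import secrets


def csp_style_policy():
    """Return (nonce, policy) for a fresh, unpredictable per-response nonce."""
    nonce = base64.b64encode(secrets.token_bytes(16)).decode()
    return nonce, f"style-src 'self' 'nonce-{nonce}'"


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return the value for an integrity attribute, e.g. 'sha384-...'."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError("SRI supports sha256, sha384, and sha512")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode()}"
```

The nonce must be regenerated for every response and echoed in the matching style tag's nonce attribute; the SRI value is computed once from the exact bytes of the third-party stylesheet and must be updated whenever that file legitimately changes.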

19. Architectural Decision Framework

Security Architecture Review Checklist

When designing or reviewing a system, evaluate each area:

□ Authentication
  ├── How are users/services identified?
  ├── Token lifecycle (issuance, validation, revocation, rotation)?
  └── MFA requirements?

□ Authorization
  ├── Where are access decisions made (gateway, service, DB)?
  ├── RBAC vs ABAC vs ReBAC?
  └── Least privilege verification?

□ Network Security
  ├── Trust boundaries identified?
  ├── East-west encryption (mTLS)?
  ├── Egress filtering?
  └── Microsegmentation?

□ Data Protection
  ├── Encryption at rest and in transit?
  ├── Key management (rotation, access)?
  ├── Data classification applied?
  └── PII handling (GDPR Art. 25 Privacy by Design)?

□ Input Handling
  ├── Validation at every trust boundary?
  ├── Serialization security (XXE, deserialization)?
  └── File upload controls?

□ Resilience
  ├── Rate limiting at multiple layers?
  ├── Circuit breakers for downstream dependencies?
  ├── Timeout hierarchy (caller > callee)?
  ├── Graceful degradation plan?
  └── Resource limits (CPU, memory, connections)?

□ Observability
  ├── Security-relevant log coverage?
  ├── Correlation IDs across services?
  ├── Alerting on auth failures, policy violations?
  └── Audit trail for sensitive operations?

□ Supply Chain
  ├── Dependency scanning in CI/CD?
  ├── Image signing and verification?
  ├── SBOM generation?
  └── Third-party resource integrity (SRI)?

□ Multi-Tenancy (if applicable)
  ├── Isolation model chosen and justified?
  ├── Tenant context propagation?
  ├── Cross-tenant access testing?
  └── Noisy neighbor prevention?

□ Blast Radius
  ├── What does compromise of component X give the attacker?
  ├── Can lateral movement be contained?
  ├── Are secrets scoped to minimum necessary?
  └── Is there a kill switch for compromised components?

Threat Modeling Integration

Every architecture decision should be validated through STRIDE analysis:

| Threat | Question |
|---|---|
| Spoofing | Can an attacker impersonate a user or service? |
| Tampering | Can data be modified in transit or at rest? |
| Repudiation | Can actions be denied without audit evidence? |
| Information Disclosure | What data leaks on compromise of each component? |
| Denial of Service | What happens under load or resource exhaustion? |
| Elevation of Privilege | Can a low-privilege actor escalate? |

Map each finding to mitigations, owners, and implementation status. Prioritize by blast radius and exploitability.


Summary: The Principal Architect's Mental Model

Security architecture is not a checklist bolted onto a design. It is a set of constraints that shape the design from inception:

  1. Identity is the perimeter — network location grants nothing; every request proves identity.
  2. Every boundary validates — gateway, service, database each enforce independently.
  3. Blast radius drives topology — segment by damage potential, not by convenience.
  4. Resilience is security — DoS, cascading failure, and resource exhaustion are attack vectors.
  5. Observability enables defense — you cannot defend what you cannot see.
  6. Least privilege is not optional — default-deny at every layer, justify every permission.
  7. Assume breach, design for containment — the question is not "if" but "when" and "how far."