Production-Ready Kubernetes Deployment Guide

This guide covers the complete journey from bare-metal hardware to a production-hardened Kubernetes platform. Every tool is selected for a specific operational purpose, with hardware requirements, benefits, trade-offs, and integration points clearly explained.

Table of Contents

  1. Hardware Requirements
  2. Cluster Provisioning Tools
  3. Networking Layer
  4. Load Balancing Layer
  5. Ingress Control
  6. Storage Layer
  7. Security & Identity
  8. Artifact Management
  9. GitOps & CI/CD
  10. Observability Stack
  11. Complete Deployment Sequence
  12. Operational Runbooks
  13. Architecture Diagram

1. Hardware Requirements

Before installing software, you need properly sized hardware. Kubernetes is resource-intensive, and every additional component (observability, storage, GitOps) adds overhead.

Control Plane Nodes

The control plane runs the API server, scheduler, controller manager, and etcd. It is the brain of the cluster. If the control plane fails, the cluster stops accepting changes (though running pods continue).
| Spec | Minimum | Production Recommended | Why It Matters |
|---|---|---|---|
| Nodes | 1 | 3 (HA) | etcd requires a quorum (majority). 3 nodes tolerate 1 failure. 5 nodes tolerate 2 failures. |
| CPU | 2 cores | 4–8 cores | API server handles all cluster operations; etcd is CPU-intensive during writes. |
| RAM | 4 GB | 16–32 GB | etcd stores the entire cluster state in memory. Insufficient RAM causes OOM kills and quorum loss. |
| Disk | 100 GB SSD | 200 GB+ NVMe SSD | etcd is latency-sensitive. Slow disks cause API server timeouts and scheduling delays. |
| Network | 1 Gbps | 10 Gbps | Control plane nodes constantly communicate with each other and all workers. |

Critical: etcd data must be on a dedicated disk or partition. Never share the etcd disk with log files or container storage. etcd persists every change to a write-ahead log on disk before acknowledging it, so I/O contention directly slows the whole control plane.
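
The quorum figures in the table follow from simple majority arithmetic. The sketch below is illustrative only; the function names are made up for this example and are not part of etcd or any tool.

```shell
# Quorum is a strict majority of members: floor(n / 2) + 1.
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Members that can fail while a quorum still remains.
etcd_fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for members in 1 2 3 4 5; do
  echo "members=$members quorum=$(etcd_quorum "$members") tolerates=$(etcd_fault_tolerance "$members")"
done
```

An even member count adds no fault tolerance over the odd count below it (4 members still tolerate only 1 failure), which is why 3 or 5 control plane nodes are the usual choices.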

Worker Nodes

Worker nodes run your applications (pods). They need resources proportional to your workload density.
| Spec | Minimum | Production Recommended | Why It Matters |
|---|---|---|---|
| Nodes | 1 | 3+ (N+1 redundancy) | You need at least one spare node to absorb load during maintenance or failures. |
| CPU | 4 cores | 16–64 cores | More cores = more pods per node. Kubernetes default limit is 110 pods per node. |
| RAM | 8 GB | 64–256 GB | Memory is the most common bottleneck. Each pod reservation reduces allocatable capacity. |
| Disk (OS) | 100 GB SSD | 200 GB SSD | Hosts container images, logs, and kubelet state. |
| Disk (Storage) | 500 GB | 2–10 TB raw per node | For Longhorn or local PVs. Must be unused raw disk or dedicated mount points. |
| Network | 1 Gbps | 10 Gbps | Pod-to-pod traffic, storage replication, and image pulls consume significant bandwidth. |

Load Balancer Nodes (HAProxy + Keepalived)

These sit outside the Kubernetes cluster and provide the external entry point.
| Spec | Minimum | Production Recommended | Why It Matters |
|---|---|---|---|
| Nodes | 1 | 2 (HA with Keepalived) | Single LB is a single point of failure. Keepalived provides VIP failover. |
| CPU | 2 cores | 4 cores | TLS termination and L7 routing are CPU-intensive at high throughput. |
| RAM | 4 GB | 8 GB | Connection tracking tables consume memory. |
| Network | 1 Gbps | 10 Gbps | All external traffic passes through these nodes. |

Observability Nodes

Mimir, Loki, and Tempo are resource-hungry. Depending on cluster size, these may run on dedicated nodes or the worker pool.
| Component | RAM | CPU | Disk | Notes |
|---|---|---|---|---|
| Mimir | 2 GB per million series | 2 cores | 100 GB local + object storage | TSDB head and WAL are memory-intensive. |
| Loki | 4–16 GB | 4–8 cores | 50 GB local + object storage | Query parallelism drives memory usage. |
| Tempo | 4–16 GB | 4–8 cores | 50 GB local + object storage | Trace ingestion rate determines memory. |
| Grafana | 512 MB–2 GB | 1–2 cores | 10 GB | Lightweight UI; increases with concurrent users. |

Rule of thumb: For a 50-node cluster with 1,000 pods, allocate 32 GB RAM and 8 cores for the observability stack.

Total Cluster Sizing Example

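| Role | Count | CPU (per node) | RAM (per node) | Disk (per node) | Purpose |
|---|---|---|---|---|---|
| Control Plane | 3 | 8 cores | 32 GB | 200 GB NVMe | API server, etcd, scheduler |
| Worker | 5 | 32 cores | 128 GB | 500 GB SSD + 4 TB raw | Application workloads, Longhorn storage |
| Load Balancer | 2 | 4 cores | 8 GB | 100 GB SSD | HAProxy + Keepalived |
| Total | 10 | 192 cores | 752 GB | ~23 TB | |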

2. Cluster Provisioning Tools

You cannot install Dex or Traefik without a cluster. The provisioning tool you choose determines how you bootstrap, upgrade, and lifecycle-manage the entire platform.

Tool Comparison

| Tool | Pros | Cons | Best For |
|---|---|---|---|
| Kubeadm | The official “standard”; very educational; maximum flexibility; works on any infrastructure | Hard to manage upgrades and scaling over time; manual etcd backup; no built-in node lifecycle management | Learning Kubernetes internals; one-off clusters; environments where you need full control |
| RKE2 (Rancher) | Built for government and security; FIPS 140-2 compliant; CIS-hardened by default; embedded etcd; automated upgrades; air-gapped support | Opinionated about how things are run; Rancher ecosystem lock-in; less community diversity than kubeadm | Regulated industries; security-first organizations; air-gapped environments; teams wanting “batteries included” |
| Cluster API (CAPI) | The “pro” way to manage Kubernetes; uses one Kubernetes cluster (management cluster) to manage other clusters (workload clusters); declarative infrastructure as code; multi-cloud abstraction | High learning curve; complex initial setup; requires deep understanding of Kubernetes primitives; provider-specific quirks | Platform teams managing 10+ clusters; multi-cloud or hybrid cloud; GitOps-driven infrastructure; organizations treating clusters as cattle, not pets |

Why These Tools Matter

| Concern | Without a Provisioning Tool | With a Provisioning Tool |
|---|---|---|
| Bootstrap | Manual OS installation, binary downloads, certificate generation, etcd cluster formation | Single command or manifest creates a working cluster |
| Upgrades | Manual coordination: drain nodes, upgrade packages, restart services, verify API compatibility | Rolling upgrades with automated health checks and rollback |
| Scaling | Manual VM provisioning, network configuration, kubeadm join commands | Declarative scaling: change a number in a manifest, the tool provisions and joins |
| Recovery | Manual etcd restore from backup, certificate regeneration, node re-provisioning | Automated node replacement, etcd snapshot restoration |
| Consistency | Snowflake clusters with different configurations | Identical clusters from the same template |

Kubeadm: The Foundation

Why it is required: Kubeadm is the official Kubernetes bootstrapping tool and the reference implementation for cluster bootstrap. Cluster API's kubeadm bootstrap provider builds directly on it, while RKE2 ships its own bootstrap mechanism. Understanding kubeadm means understanding how Kubernetes actually works.
Benefits:
  • Transparency: You see every certificate, static pod manifest, and kubeconfig file.
  • Portability: Works on bare metal, VMs, cloud instances, and Raspberry Pis.
  • Flexibility: Customize every API server flag, etcd parameter, and kubelet configuration.
Typical Workflow:
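# On the first control plane node
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --control-plane-endpoint=192.168.1.10
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
# Make the copied kubeconfig readable by the current (non-root) user
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# On worker nodes (token and hash are printed at the end of kubeadm init)
sudo kubeadm join 192.168.1.10:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>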
Operational Reality:
  • Upgrades require manual coordination: upgrade control plane nodes one by one, then workers.
  • etcd backups are your responsibility (etcdctl snapshot save).
  • Node replacement is manual: drain, delete, provision new VM, join.
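
Since etcd backups are left to the operator, a scheduled snapshot is worth having from day one. The following is a minimal sketch, assuming a kubeadm-style stacked etcd with certificates under /etc/kubernetes/pki/etcd and an etcdctl binary installed on the node (kubeadm does not install one on the host). The backup directory is a hypothetical choice.

```shell
# Paths are the kubeadm defaults for a stacked etcd member; adjust for other installers.
PKI_DIR=/etc/kubernetes/pki/etcd
BACKUP_DIR="${BACKUP_DIR:-/var/backups/etcd}"          # hypothetical backup location
SNAPSHOT="$BACKUP_DIR/etcd-$(date +%Y%m%d-%H%M%S).db"  # timestamped file name

if command -v etcdctl >/dev/null 2>&1 && [ -r "$PKI_DIR/server.key" ]; then
  mkdir -p "$BACKUP_DIR"
  # Authenticate with the etcd server certificate and write a point-in-time snapshot.
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert="$PKI_DIR/ca.crt" \
    --cert="$PKI_DIR/server.crt" \
    --key="$PKI_DIR/server.key" \
    snapshot save "$SNAPSHOT"
  ls -lh "$SNAPSHOT"
else
  echo "etcdctl or the etcd certificates are not available; run this as root on a control plane node." >&2
fi
```

Copy the resulting file off the node, because a snapshot stored only on the control plane disk is lost together with that node.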

RKE2: Security-First Distribution

Why it is required: When your organization operates under regulatory requirements (government, healthcare, finance), you need a distribution that is certified and hardened out of the box. RKE2 provides this without manual hardening scripts.
Benefits:
  • FIPS 140-2 Compliance: Uses FIPS-validated cryptographic modules. Required for US government workloads.
  • CIS Hardening: Applies Center for Internet Security benchmarks automatically.
  • Embedded etcd: No separate etcd cluster to manage. Simplifies backup and recovery.
  • Air-Gapped Support: Can be installed entirely from tarballs without internet access.
  • Automated Upgrades: Via Rancher’s system-upgrade-controller; plans upgrades across nodes.
Key Differences from Standard Kubernetes:
  • Uses containerd by default (no Docker dependency).
  • Runs etcd and the other control plane components as static pods that RKE2 itself launches and manages; there is no separate etcd cluster to install.
  • Configuration is via /etc/rancher/rke2/config.yaml (not flags).
Typical Workflow:
# Install RKE2 server (control plane)
curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_TYPE=server sh -
sudo systemctl enable rke2-server --now

# Install RKE2 agent (worker)
curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_TYPE=agent sh -
# Before starting the agent, set server and token in /etc/rancher/rke2/config.yaml
sudo systemctl enable rke2-agent --now
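
An agent can only join once its configuration file names the server to register with. The sketch below writes such a file; it assumes the server is reachable at the 192.168.1.10 address used elsewhere in this guide, and the token is a placeholder. It writes to a temporary directory by default so it can be tried safely; on a real node the target is /etc/rancher/rke2.

```shell
# Use /etc/rancher/rke2 on a real agent node; a temp dir keeps this sketch harmless.
CONFIG_DIR="${CONFIG_DIR:-$(mktemp -d)}"

cat > "$CONFIG_DIR/config.yaml" <<'EOF'
# Registration endpoint of an existing RKE2 server (supervisor port 9345).
server: https://192.168.1.10:9345
# Contents of /var/lib/rancher/rke2/server/node-token on the server node.
token: <node-token>
EOF

cat "$CONFIG_DIR/config.yaml"
```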

Cluster API (CAPI): The Professional Approach

Why it is required: When you manage tens or hundreds of clusters, manual provisioning does not scale. CAPI brings the Kubernetes declarative model (desired state, controllers, reconciliation) to cluster infrastructure itself.
Benefits:
  • Declarative Infrastructure: Define clusters as YAML manifests stored in Git.
  • GitOps Integration: ArgoCD or Flux can manage your cluster definitions.
  • Multi-Cloud Abstraction: The same Cluster and MachineDeployment model works across vSphere, AWS, Azure, and OpenStack; only the infrastructure-specific resources differ per provider.
  • Automated Lifecycle: Creation, scaling, upgrade, and deletion are all automated.
Architecture:
Management Cluster (Kubernetes)
├── CAPI Core Provider
├── Infrastructure Provider (vSphere, AWS, Azure)
├── Bootstrap Provider (kubeadm, RKE2)
└── Control Plane Provider
        │
        │  Creates & Manages
        ▼
Workload Cluster A (Production)
Workload Cluster B (Staging)
Workload Cluster C (DR)
Typical Workflow:
# 1. Create a management cluster (can be a single-node KinD cluster)
kind create cluster --name management

# 2. Initialize Cluster API with infrastructure provider
clusterctl init --infrastructure vsphere

# 3. Define workload cluster manifests and apply
kubectl apply -f workload-cluster.yaml

# 4. Fetch kubeconfig for workload cluster
clusterctl get kubeconfig workload-cluster > workload.kubeconfig

3. Networking Layer

Component: Cilium

Why it is required: The default Kubernetes networking (kube-proxy + iptables) is functional but scales poorly with large numbers of Services, is hard to inspect, and lacks security features. Cilium replaces this with eBPF, providing high-performance networking, zero-trust security, and deep observability in one component.
Benefits:

| Benefit | Explanation |
|---|---|
| eBPF Data Plane | Kernel-level packet processing bypasses iptables overhead. The gain grows with cluster size, because iptables rules are evaluated sequentially while eBPF uses hash-table lookups. |
| L3/L4/L7 Network Policies | Restrict traffic by IP, port, HTTP path, method, or headers. Default-deny policies prevent lateral movement. |
| Hubble Observability | See every network flow, DNS query, and policy verdict in real-time. No additional tools needed. |
| Cluster Mesh | Connect multiple Kubernetes clusters into a single flat network with service discovery. |
| Service Mesh (Sidecar-less) | mTLS, traffic management, and observability without injecting sidecars into every pod. |
| Bandwidth Manager | Optimizes TCP and UDP throughput for high-performance workloads. |
| NodePort Acceleration | Direct server return (DSR) reduces latency for LoadBalancer and NodePort services. |
Architecture:
Pod A (Namespace: frontend)
        ↓
Cilium CNI (eBPF program attached to pod's veth)
        ↓
Cilium Agent (DaemonSet on every node)
        ↓
Cilium Operator (manages IPAM, identity allocation)
        ↓
Linux Kernel (eBPF maps for connection tracking, policy enforcement)
        ↓
Pod B (Namespace: backend) or External Network
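Deployment Considerations:
  1. Kernel Requirements: Linux kernel 4.19+ is the floor for older Cilium releases (5.10+ recommended). Newer releases raise the minimum, so check the system requirements for the version you install.
  2. Kube-Proxy Replacement: Cilium can fully replace kube-proxy for better performance: --set kubeProxyReplacement=true (Cilium 1.14 and later; earlier releases used the value strict). The agent then needs the API server address via k8sServiceHost and k8sServicePort.
  3. Encryption: Enable WireGuard for pod-to-pod encryption: --set encryption.enabled=true --set encryption.type=wireguard. WireGuard requires kernel 5.6+ or a backported module.
  4. IPAM Mode: For VMware/vSphere, use cluster-pool (Cilium-managed pod CIDR).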
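
The considerations above can be collected into one Helm values file instead of a long list of --set flags. This is a sketch, not a tuned configuration: the API server address and pod CIDR are taken from the kubeadm example earlier in this guide, and the file path is a temporary file chosen for illustration.

```shell
VALUES_FILE="${VALUES_FILE:-$(mktemp)}"

cat > "$VALUES_FILE" <<'EOF'
# Replace kube-proxy entirely (Cilium 1.14+; older releases used "strict").
kubeProxyReplacement: true
# Without kube-proxy the agent must be told where the API server is.
k8sServiceHost: 192.168.1.10
k8sServicePort: 6443
ipam:
  mode: cluster-pool
  operator:
    clusterPoolIPv4PodCIDRList:
      - 10.244.0.0/16
encryption:
  enabled: true
  type: wireguard
hubble:
  relay:
    enabled: true
  ui:
    enabled: true
EOF

echo "Wrote Cilium values to $VALUES_FILE"
# Install with Helm once the file has been reviewed:
#   helm repo add cilium https://helm.cilium.io/
#   helm install cilium cilium/cilium --namespace kube-system -f "$VALUES_FILE"
```

Keeping the values in a file makes the Cilium configuration reviewable in Git, which fits the GitOps approach used later in this guide.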

4. Load Balancing Layer

External Load Balancing: HAProxy + Keepalived

Why it is required: In bare-metal or private cloud, there is no cloud provider to provision a load balancer in front of your cluster. You need a highly available entry point that distributes traffic across multiple Traefik instances and survives node failures.
Benefits:

| Benefit | Explanation |
|---|---|
| Layer 4 & 7 Balancing | TCP passthrough for TLS or HTTP termination with header inspection. |
| Health Checks | Automatically removes failed backends from the pool. No traffic sent to dead nodes. |
| VIP Failover | Keepalived moves the virtual IP to a standby node in < 3 seconds using VRRP. |
| SSL Offloading | HAProxy terminates TLS, reducing CPU load on Traefik and application pods. |
| Sticky Sessions | Session affinity for stateful applications that require consistent backend connections. |
| Statistics UI | Built-in web dashboard for real-time traffic monitoring and troubleshooting. |
Architecture:
Internet / Corporate Network
        ↓
    Virtual IP (VIP): 192.168.1.10
        ↓
    Keepalived (VRRP Master)
        ↓
    HAProxy (Active)
    ├─→ Traefik Node 1:30443
    ├─→ Traefik Node 2:30443
    └─→ Traefik Node 3:30443

    Keepalived (VRRP Backup: takes over if Master fails)
Hardware Placement: Run HAProxy + Keepalived on dedicated VMs or control plane nodes (not as pods — they must survive cluster failures).
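
To make the VIP and backend layout concrete, the sketch below writes a minimal Keepalived and HAProxy configuration for the MASTER node. The interface name (eth0), router ID, priorities, and the three Traefik node addresses (192.168.1.21 to 192.168.1.23) are assumptions for illustration; only the VIP and NodePort 30443 come from the diagram above. Files are written to a temporary directory rather than /etc.

```shell
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"

# Keepalived: hold the VIP while the local HAProxy process is alive.
cat > "$OUT_DIR/keepalived.conf" <<'EOF'
vrrp_script chk_haproxy {
    script "/usr/bin/killall -0 haproxy"
    interval 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    virtual_ipaddress {
        192.168.1.10
    }
    track_script {
        chk_haproxy
    }
}
EOF

# HAProxy: TCP passthrough on 443 to the Traefik NodePort on each node.
cat > "$OUT_DIR/haproxy.cfg" <<'EOF'
frontend https_in
    bind *:443
    mode tcp
    default_backend traefik_https

backend traefik_https
    mode tcp
    balance roundrobin
    server traefik1 192.168.1.21:30443 check
    server traefik2 192.168.1.22:30443 check
    server traefik3 192.168.1.23:30443 check
EOF

ls "$OUT_DIR"
```

On the second node the Keepalived file would use state BACKUP and a lower priority (for example 90). This variant passes TLS through to Traefik; terminating TLS in HAProxy instead, as described under SSL Offloading, requires mode http and a certificate on the bind line.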

Internal Load Balancing: MetalLB

Why it is required: Kubernetes Services of type LoadBalancer expect a cloud provider to provision an IP. On bare metal, this integration does not exist. Without MetalLB, you are limited to NodePort services with random high-numbered ports (30000–32767), which is unacceptable for production.
Benefits:

| Benefit | Explanation |
|---|---|
| Standard Service Types | Supports LoadBalancer services natively, just like AWS or GCP. |
| Predictable IPs | Admin-defined IP pools. Services get stable, known addresses. |
| Automatic Failover | If the node holding a LoadBalancer IP fails, MetalLB reassigns it to a healthy node. |
| BGP Integration | Announces IPs to corporate routers via BGP for true anycast distribution. |
| Shared IPs | Multiple services can share one IP on different ports, conserving address space. |

Without MetalLB vs. With MetalLB:

| Feature | Without MetalLB | With MetalLB |
|---|---|---|
| Service Type | Limited to NodePort (e.g., IP:32001) | Supports LoadBalancer (e.g., IP:443) |
| User Experience | Users must remember weird, high-numbered ports | Users use standard HTTP/HTTPS ports (80/443) |
| Failover | Manual intervention or DNS-based failover | Automatic IP migration between healthy nodes |
| IP Management | No central pool; ports randomly assigned | Admin-defined IP pools; predictable addresses |
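
An admin-defined pool in Layer 2 mode takes two small resources. The sketch below writes them to a file; the pool name and the address range 192.168.1.240-192.168.1.250 are hypothetical and must be replaced with addresses that are free on your network.

```shell
POOL_MANIFEST="${POOL_MANIFEST:-$(mktemp)}"

cat > "$POOL_MANIFEST" <<'EOF'
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: production-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250
---
# Announce addresses from the pool via ARP (Layer 2 mode).
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: production-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - production-pool
EOF

echo "Wrote MetalLB pool to $POOL_MANIFEST"
# kubectl apply -f "$POOL_MANIFEST"
```

For BGP mode, the L2Advertisement would be replaced by BGPPeer and BGPAdvertisement resources that match your router configuration.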

5. Ingress Control

Component: Traefik + Gateway API

Why it is required: Kubernetes needs an ingress controller to route external HTTP/HTTPS traffic to internal services. Traefik is modern, cloud-native, and natively supports the Gateway API, the successor to the Ingress resource, whose feature set is frozen.
Benefits:

| Benefit | Explanation |
|---|---|
| Gateway API Native | Role-based routing: platform team defines Gateway; app developers define HTTPRoute. No more annotation wars. |
| Middleware Chain | Rate limiting, circuit breakers, basic auth, OIDC, request/response rewriting, all declarative. |
| Traffic Splitting | Canary deployments, A/B testing, blue-green rollouts with percentage-based weights. |
| Service Mesh Integration | Works with Cilium service mesh for mTLS and L7 policies without sidecars. |
| Dashboard | Real-time routing visualization, health checks, and error rates. |
| Certificate Integration | Native integration with Cert-Manager for automatic TLS on Gateway listeners. |

Ingress vs. Gateway API:

| Aspect | Kubernetes Ingress | Gateway API |
|---|---|---|
| Resource Model | Single Ingress resource | Gateway + Route resources (separation of concerns) |
| Roles | One admin owns everything | Platform team owns Gateway; developers own Routes |
| Protocol Support | HTTP/HTTPS only | HTTP, HTTPS, TCP, TLS, UDP, gRPC |
| Extensibility | Annotations (messy and vendor-specific) | Typed references and parameters (clean and standardized) |
| Future | Frozen; no new features | Active development; recommended by SIG-Network |
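
The role split between Gateway and Route is easiest to see in a pair of manifests. In this sketch the platform team owns the Gateway in the ingress namespace, and an application team owns an HTTPRoute that sends 10% of traffic to a canary. The shop namespace, hostname, and Service names are hypothetical, and the listener port is an assumption that must match an entry point port configured in Traefik.

```shell
GATEWAY_MANIFEST="${GATEWAY_MANIFEST:-$(mktemp)}"

cat > "$GATEWAY_MANIFEST" <<'EOF'
# Owned by the platform team.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: external-gateway
  namespace: ingress
spec:
  gatewayClassName: traefik
  listeners:
    - name: web
      protocol: HTTP
      port: 8000          # assumption: Traefik's "web" entry point port
      allowedRoutes:
        namespaces:
          from: All
---
# Owned by the application team.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: shop
  namespace: shop
spec:
  parentRefs:
    - name: external-gateway
      namespace: ingress
  hostnames:
    - shop.apps.company.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: shop-stable
          port: 80
          weight: 90
        - name: shop-canary
          port: 80
          weight: 10
EOF

echo "Wrote Gateway and HTTPRoute to $GATEWAY_MANIFEST"
# kubectl apply -f "$GATEWAY_MANIFEST"
```

Shifting the canary forward is then a one-line change to the two weight fields in the application team's own repository, with no edits to the shared Gateway.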

6. Storage Layer

Component: Longhorn

Why it is required: Kubernetes pods are ephemeral. Without persistent storage, databases, message queues, and file stores lose all data on restart. Longhorn provides replicated, snapshot-capable, backup-ready block storage for stateful workloads.
Benefits:

| Benefit | Explanation |
|---|---|
| Replicated Volumes | Each volume is synchronously replicated across worker nodes (3 replicas by default, configurable per StorageClass). Survives node failure without data loss. |
| Snapshots & Backups | Point-in-time snapshots for quick recovery; incremental backups to S3/NFS for disaster recovery. |
| Thin Provisioning | Allocate storage on demand. No wasted reserved space. |
| Cross-AZ Recovery | Replicas distributed across availability zones for rack-aware HA. |
| UI Management | Built-in web UI for volume status, backup jobs, and node health monitoring. |
| Non-Disruptive Upgrades | Rolling upgrades of Longhorn components without volume downtime. |
Hardware Requirements:
  • Each worker node must have unused raw disk space or a dedicated mount point.
  • Open-iscsi must be installed on every node (apt install open-iscsi).
  • nfs-common required for NFS backup targets.
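
Replication is configured per StorageClass, and workloads then request volumes through an ordinary PersistentVolumeClaim. The sketch below writes both; the class name, claim name, and 1Gi size are hypothetical values for a smoke test.

```shell
STORAGE_MANIFEST="${STORAGE_MANIFEST:-$(mktemp)}"

cat > "$STORAGE_MANIFEST" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-replicated
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"       # one replica on each of three nodes
  staleReplicaTimeout: "2880" # minutes before a failed replica is cleaned up
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-replicated
  resources:
    requests:
      storage: 1Gi
EOF

echo "Wrote StorageClass and PVC to $STORAGE_MANIFEST"
# kubectl apply -f "$STORAGE_MANIFEST"
```

With three replicas, a cluster needs at least three schedulable storage nodes for the volume to report healthy.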

7. Security & Identity

Component: Cert-Manager

Why it is required: TLS certificates expire. Manual certificate management in a dynamic Kubernetes environment is a guaranteed outage. Cert-Manager automates issuance, renewal, and injection of certificates from Let’s Encrypt, Vault, and private CAs.
Benefits:

| Benefit | Explanation |
|---|---|
| Automatic Issuance | Request certificates via Kubernetes resources (Issuer + Certificate). No manual CSR generation. |
| Auto-Renewal | Monitors expiry and, by default, renews once two-thirds of the certificate lifetime has elapsed (30 days before expiry for a 90-day Let’s Encrypt certificate). Zero-touch maintenance. |
| Let’s Encrypt Integration | Free, trusted certificates with HTTP-01 or DNS-01 challenge support. |
| Gateway/Ingress Integration | Automatically injects certificates into Traefik Gateway listeners and Ingress resources. |
| Multiple Issuers | Let’s Encrypt for public, Vault for internal, private CA for mTLS, all in one cluster. |
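
A staging issuer is the usual first step, because the Let’s Encrypt staging endpoint has generous rate limits and issues untrusted test certificates. The sketch below writes a ClusterIssuer using the HTTP-01 challenge; the issuer name, contact email, and secret name are hypothetical.

```shell
ISSUER_MANIFEST="${ISSUER_MANIFEST:-$(mktemp)}"

cat > "$ISSUER_MANIFEST" <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # Staging endpoint: certificates are not publicly trusted.
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: platform-team@company.com
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - http01:
          ingress:
            ingressClassName: traefik
EOF

echo "Wrote ClusterIssuer to $ISSUER_MANIFEST"
# kubectl apply -f "$ISSUER_MANIFEST"
```

Once a test certificate issues successfully, a production issuer differs only in its name and the server URL (https://acme-v02.api.letsencrypt.org/directory).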

Component: Dex (OIDC)

Why it is required: Kubernetes has no user database of its own; it only validates the tokens and certificates presented to it. Without an identity bridge, every user needs a manually distributed kubeconfig with embedded certificates. Dex connects Kubernetes to your existing corporate identity system (LDAP, Okta, Azure AD).
Benefits:

| Benefit | Explanation |
|---|---|
| SSO Integration | Users authenticate with existing corporate credentials. No separate Kubernetes passwords. |
| Group Mapping | LDAP groups or SAML roles map directly to Kubernetes RBAC groups, so access follows group membership. |
| Token Refresh | Handles refresh tokens so users do not re-authenticate every few hours. |
| Audit Trail | All authentication events flow through your central identity provider. |
| No User Management in K8s | Users are not Kubernetes objects. Onboarding/offboarding happens in LDAP/Okta. |
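
Group mapping has two halves: the API server must trust Dex as an OIDC issuer, and RBAC must bind the group claim to a role. The sketch below lists the relevant kube-apiserver flags as comments and writes a ClusterRoleBinding; the issuer URL, client ID, and group name k8s-admins are hypothetical.

```shell
# kube-apiserver flags that point the API server at Dex (values are hypothetical):
#   --oidc-issuer-url=https://dex.apps.company.com
#   --oidc-client-id=kubernetes
#   --oidc-username-claim=email
#   --oidc-groups-claim=groups

RBAC_MANIFEST="${RBAC_MANIFEST:-$(mktemp)}"

cat > "$RBAC_MANIFEST" <<'EOF'
# Grant cluster-admin to everyone whose token carries the k8s-admins group.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: oidc-cluster-admins
subjects:
  - kind: Group
    name: k8s-admins
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
EOF

echo "Wrote RBAC binding to $RBAC_MANIFEST"
# kubectl apply -f "$RBAC_MANIFEST"
```

Removing a user from the group in the identity provider revokes access as soon as their current token expires, with no change inside the cluster.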

8. Artifact Management

Component: Nexus Repository

Why it is required: Building containers and pulling dependencies from the public internet on every CI run is slow, unreliable, and insecure. Nexus provides a local cache and private host for all artifacts: Docker images, Helm charts, npm packages, Maven dependencies.
Benefits:

| Benefit | Explanation |
|---|---|
| Universal Format Support | Docker, Helm, npm, Maven, PyPI, NuGet, Raw, APT, YUM: one tool for everything. |
| Proxy Caching | Caches Docker Hub, Maven Central, npm registry. Survives external outages and avoids rate limits. |
| Private Hosting | Internal artifacts (proprietary libraries, base images) never leave your network. |
| Blob Storage Backend | S3-compatible storage for scalable, durable artifact storage. |
| Cleanup Policies | Automatic deletion of old artifacts based on age or download count. Prevents unbounded growth. |
| RBAC | Fine-grained permissions per repository and format. CI gets push access; developers get read access. |

9. GitOps & CI/CD

Component: GitLab

Why it is required: You need a single source of truth for code, configuration, and operational knowledge. GitLab provides repository hosting, CI/CD pipelines, issue tracking, and documentation in one platform.
Benefits:
  • Single Source of Truth: Code, manifests, runbooks, and issues all in one place.
  • CI/CD Native: .gitlab-ci.yml defines the entire build-test-deploy pipeline.
  • Container Registry: Built-in Docker registry (can mirror to Nexus).
  • Integration: Webhooks to ArgoCD, issue references in commits, merge request pipelines.

Component: ArgoCD

Why it is required: Manual kubectl apply is error-prone and un-auditable. ArgoCD ensures the live cluster state always matches the desired state stored in Git. It is the enforcement layer for GitOps.
Benefits:

| Benefit | Explanation |
|---|---|
| Declarative Sync | Git is the single source of truth. Drift is automatically detected and, with self-heal enabled, corrected. |
| ApplicationSets | Deploy the same application to multiple environments (dev/staging/prod) from one template. |
| RBAC Integration | Sync with Dex/OIDC for SSO; granular project-level permissions. |
| Rollback | Roll back to a previously synced revision from the UI or CLI; with Git as the source of truth, recovery amounts to reverting a commit. |
| Diff Visualization | Web UI shows exactly what will change before syncing. No surprises. |
| Resource Hooks | Pre-sync, post-sync, and sync-wave hooks for complex deployment ordering (database migration before app start). |
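
Declarative sync is itself configured declaratively, through an Application resource. The sketch below writes one with automated sync, pruning, and self-heal; the GitLab repository URL, path, and target namespace are hypothetical.

```shell
APP_MANIFEST="${APP_MANIFEST:-$(mktemp)}"

cat > "$APP_MANIFEST" <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: shop
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitlab.company.com/platform/shop-manifests.git
    targetRevision: main
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: shop
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual changes made in the cluster
    syncOptions:
      - CreateNamespace=true
EOF

echo "Wrote ArgoCD Application to $APP_MANIFEST"
# kubectl apply -f "$APP_MANIFEST"
```

Leaving out the automated block gives a manual-sync application, which some teams prefer for production so that every rollout is an explicit action.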

Component: GitLab Runners

Why it is required: CI/CD jobs need compute resources. GitLab Runners provide dynamic, scalable build execution as Kubernetes pods.
Benefits:
  • Autoscaling: Jobs spin up as pods and terminate after completion. No idle workers.
  • Kubernetes-Native: Runs inside the cluster it deploys to. No separate build farm needed.
  • Security: Each job runs in its own pod, which limits the blast radius of a compromised build.

10. Observability Stack

Component: LGTM Stack (Loki, Grafana, Tempo, Mimir)

Why it is required: Running a production cluster without observability is flying blind. You cannot debug what you cannot see. The LGTM stack provides unified metrics, logs, traces, and dashboards, all correlated.
Benefits:

| Component | Benefit | Why It Matters |
|---|---|---|
| Loki | Label-based log indexing | Indexes only labels, not full text, which typically makes it much cheaper to run than Elasticsearch. |
| Grafana | Unified visualization | One UI for metrics, logs, and traces. Click a metric spike → see logs → trace the request. |
| Tempo | Distributed tracing | Follow a request across 20 microservices. Find latency bottlenecks and failure points. |
| Mimir | Horizontally scalable metrics | Takes over storage and querying from a single Prometheus in large clusters; Prometheus or an agent still scrapes targets and remote-writes to Mimir. Object storage backend = years of retention. |

Additional Components:

| Component | Purpose | Why Required |
|---|---|---|
| Promtail | Log collection DaemonSet | Tails container logs from /var/log/pods/ and pushes to Loki. Required because Loki does not pull logs; something must push them. |
| OpenTelemetry Collector | Trace ingestion and processing | Receives traces from applications, batches them, transforms formats (Jaeger → Tempo), and forwards. Required for vendor-neutral instrumentation. |
| Alertmanager | Alert routing and grouping | Deduplicates alerts, groups by severity, routes to Slack/PagerDuty/email. Required because raw Prometheus alerts would spam channels. |
| Node Exporter | Hardware/OS metrics | Exposes CPU, memory, disk, network metrics from every node. Required for cluster capacity planning. |
| cAdvisor | Container metrics | Exposes per-container resource usage and performance. Required for pod-level resource optimization. |
| kube-state-metrics | Kubernetes object metrics | Exposes metrics about deployments, pods, nodes, PVCs (not just their resource usage). Required for cluster health dashboards. |

11. Complete Deployment Sequence

Phase 1: Infrastructure Provisioning

| Step | Action | Verification |
|---|---|---|
| 1 | Provision VMs: 3 control plane, 3+ workers, 2 LB nodes | ping all nodes; SSH access works |
| 2 | Install OS (Ubuntu 22.04 LTS or RHEL 9) | cat /etc/os-release |
| 3 | Configure networking: static IPs, DNS, NTP | ip addr, nslookup google.com, timedatectl |
| 4 | Prepare storage: mount raw disks for Longhorn | lsblk shows unused disks |
| 5 | Install prerequisites: containerd, open-iscsi, nfs-common | systemctl status containerd |

Phase 2: Cluster Bootstrap

| Step | Action | Verification |
|---|---|---|
| 6 | Choose provisioning tool (Kubeadm / RKE2 / CAPI) | Document the decision |
| 7 | Bootstrap control plane nodes | kubectl get nodes shows control planes Ready |
| 8 | Join worker nodes | All workers show Ready |
| 9 | Install Cilium CNI | Pods communicate across nodes; Hubble UI works |
| 10 | Verify CoreDNS | nslookup kubernetes.default from a pod |
| 11 | Install metrics-server | kubectl top nodes returns data |

Phase 3: Load Balancing Foundation

| Step | Action | Verification |
|---|---|---|
| 12 | Install MetalLB | kubectl get pods -n metallb-system all Running |
| 13 | Configure IPAddressPool and advertisement | kubectl get ipaddresspool -n metallb-system |
| 14 | Test LoadBalancer service | EXTERNAL-IP assigned and reachable |
| 15 | Configure HAProxy + Keepalived | VIP responds; failover works |
| 16 | Point DNS to VIP | nslookup apps.company.com resolves to VIP |

Phase 4: Storage Foundation

| Step | Action | Verification |
|---|---|---|
| 17 | Install Longhorn | kubectl get pods -n longhorn-system all Running |
| 18 | Verify StorageClass | kubectl get sc shows longhorn |
| 19 | Test PVC provisioning | Volume binds, pod mounts, data persists |

Phase 5: Security Layer

| Step | Action | Verification |
|---|---|---|
| 20 | Install Cert-Manager | kubectl get pods -n cert-manager all Running |
| 21 | Create ClusterIssuer (staging) | kubectl describe clusterissuer shows Ready |
| 22 | Test certificate issuance | Secret created successfully |
| 23 | Switch to production issuer | Valid Let’s Encrypt cert issued |

Phase 6: Ingress Control

| Step | Action | Verification |
|---|---|---|
| 24 | Install Traefik | kubectl get pods -n traefik all Running |
| 25 | Configure GatewayClass | kubectl get gatewayclass shows traefik |
| 26 | Test Gateway + HTTPRoute | Traffic reaches backend via standard ports |
| 27 | Integrate Cert-Manager with Gateway | HTTPS works with valid certificate |

Phase 7: Artifact Management

| Step | Action | Verification |
|---|---|---|
| 28 | Install Nexus Repository | Web UI accessible |
| 29 | Configure Docker registry | docker push and docker pull work |
| 30 | Configure Helm repository | helm push and helm install work |
| 31 | Configure proxy repositories | External packages cache successfully |

Phase 8: Identity Layer

| Step | Action | Verification |
|---|---|---|
| 32 | Install Dex | kubectl get pods -n dex all Running |
| 33 | Configure LDAP/Okta/Azure AD connector | Dex logs show successful binds |
| 34 | Configure API server OIDC flags | Control plane restarted |
| 35 | Test authentication | kubelogin obtains valid token |
| 36 | Configure RBAC bindings | Users in admin group have cluster-admin |

Phase 9: GitOps

| Step | Action | Verification |
|---|---|---|
| 37 | Install ArgoCD | kubectl get pods -n argocd all Running |
| 38 | Configure GitLab repository access | ArgoCD syncs from GitLab |
| 39 | Create first Application | App syncs; resources created |
| 40 | Configure ApplicationSets | Multi-environment deployment works |

Phase 10: Observability Stack

| Step | Action | Verification |
|---|---|---|
| 41 | Install Prometheus / Mimir | Targets page shows healthy scrapes |
| 42 | Install Node Exporter + cAdvisor + kube-state-metrics | Metrics appear in Prometheus |
| 43 | Install Loki + Promtail | Logs appear in Grafana Explore |
| 44 | Install Tempo + OpenTelemetry Collector | Traces appear in Grafana Explore |
| 45 | Install Grafana | UI accessible via HTTPS |
| 46 | Configure data sources and dashboards | Dashboards show live data |
| 47 | Configure Alertmanager | Test alerts reach Slack/PagerDuty |

Phase 11: CI/CD Integration

| Step | Action | Verification |
|---|---|---|
| 48 | Install GitLab Runners | Runners register; jobs execute as pods |
| 49 | Configure pipeline stages | All stages complete successfully |
| 50 | Test end-to-end GitOps flow | Git push → CI builds → updates manifest → ArgoCD syncs → app deploys |

12. Operational Runbooks

Runbook: Adding a New Worker Node

# 1. Provision VM with required specs (CPU, RAM, disk)
# 2. Install OS and prerequisites (containerd, open-iscsi)
# 3. Join the cluster (print a fresh join command on a control plane node with:
#    kubeadm token create --print-join-command)
sudo kubeadm join <control-plane-endpoint>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>

# 4. Verify the node is Ready, then label it
kubectl get nodes
kubectl label node <new-node> node-role.kubernetes.io/worker=worker

# 5. Cilium auto-detects and programs eBPF
# 6. Longhorn auto-detects and schedules replicas
kubectl get pods -n longhorn-system

Runbook: Replacing a Failed Node

# 1. Cordon the node
kubectl cordon <failed-node>

# 2. Drain the node (if it is already unreachable, add --timeout=120s so drain does not wait indefinitely)
kubectl drain <failed-node> --ignore-daemonsets --delete-emptydir-data

# 3. Remove from cluster
kubectl delete node <failed-node>

# 4. Longhorn rebuilds replicas; MetalLB reassigns IPs; HAProxy removes from pool
# 5. Provision replacement and join

Runbook: Certificate Expiry Emergency

# 1. Check certificate status
kubectl get certificates -A
kubectl describe certificate <name> -n <namespace>

# 2. Force re-issuance if stuck
kubectl delete certificaterequest <request-name> -n <namespace>

# 3. Verify renewal (prints the notBefore/notAfter dates of the stored certificate)
kubectl get secret <tls-secret> -n <namespace> -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -dates

Runbook: Storage Volume Degraded

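# 1. Check Longhorn volumes (the fully qualified name avoids clashes with other CRDs)
kubectl get volumes.longhorn.io -n longhorn-system

# 2. Identify under-replicated volumes and inspect their replicas
kubectl get replicas.longhorn.io -n longhorn-system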
# 3. If node is down, wait or delete failed replica
# 4. Monitor rebuild in Longhorn UI

Runbook: MetalLB IP Not Responding

# 1. Check speaker pods
kubectl get pods -n metallb-system
kubectl logs -n metallb-system -l component=speaker

# 2. Verify IPAddressPool
kubectl get ipaddresspool -n metallb-system -o yaml

# 3. Check service EXTERNAL-IP
kubectl get svc <service-name>

# 4. For Layer 2: Check ARP table
arp -a | grep <external-ip>

# 5. For BGP: Check router peering status

Runbook: Traefik Gateway Not Routing

# 1. Check Traefik pods
kubectl get pods -n traefik
kubectl logs -n traefik -l app.kubernetes.io/name=traefik

# 2. Verify GatewayClass
kubectl get gatewayclass traefik -o yaml

# 3. Check Gateway status
kubectl get gateway external-gateway -n ingress -o yaml

# 4. Verify HTTPRoute parentRefs
kubectl get httproute <route-name> -n <namespace> -o yaml

# 5. Check backend endpoints
kubectl get endpoints <service-name> -n <namespace>

Runbook: Cilium Network Policy Blocking Traffic

# 1. Check Hubble for drops
hubble observe --verdict DROPPED --namespace <namespace>

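# 2. Check policy rules
kubectl get ciliumnetworkpolicies -n <namespace> -o yaml

# 3. Verify pod labels
kubectl get pods -n <namespace> --show-labels

# 4. Temporarily remove the policy for debugging (save a copy first so it can be restored)
kubectl get ciliumnetworkpolicy <policy-name> -n <namespace> -o yaml > <policy-name>.backup.yaml
kubectl delete ciliumnetworkpolicy <policy-name> -n <namespace>

# 5. Check Cilium agent logs
kubectl logs -n kube-system -l k8s-app=cilium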

Runbook: Observability Stack Down

# 1. Check component health
kubectl get pods -n observability

# 2. Check resources
kubectl top pods -n observability

# 3. Check PVC status
kubectl get pvc -n observability

# 4. If OOMKilled, increase limits
kubectl patch statefulset mimir -n observability --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/memory", "value":"8Gi"}]'

# 5. Check Prometheus targets
kubectl port-forward svc/prometheus -n observability 9090:9090

Runbook: GitOps Sync Failure

# 1. Check application status
argocd app get <app-name>

# 2. View sync errors
argocd app sync <app-name> --dry-run

# 3. Check for resource conflicts
kubectl get events -n <app-namespace>

# 4. Check for drift
argocd app diff <app-name>

# 5. Force sync if needed
argocd app sync <app-name> --force

13. Architecture Diagram

Below is the complete platform architecture showing how all components integrate:
┌─────────────────────────────────────────────────────────────────────────────────────┐
│                              EXTERNAL USERS / INTERNET                                │
└─────────────────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────────────────┐
│                         LOAD BALANCING LAYER (External)                             │
│  ┌─────────────────┐         ┌─────────────────┐                                    │
│  │  HAProxy Node 1 │◄───────►│  HAProxy Node 2 │  (Active/Backup via Keepalived)   │
│  │  + Keepalived   │  VRRP   │  + Keepalived   │  VIP: 192.168.1.10                │
│  │  (MASTER)       │         │  (BACKUP)       │                                    │
│  └────────┬────────┘         └────────┬────────┘                                    │
│           │                           │                                             │
│           └──────────────┬────────────┘                                             │
│                          ▼                                                          │
│              ┌─────────────────────┐                                                  │
│              │   Virtual IP (VIP)  │  ← Single entry point, automatic failover      │
│              │   192.168.1.10      │                                                  │
│              └─────────────────────┘                                                  │
└─────────────────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────────────────┐
│                         INGRESS & SERVICE MESH LAYER                                │
│  ┌─────────────────────────────────────────────────────────────────────────────┐   │
│  │                        Traefik Ingress Controller                            │   │
│  │  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐                     │   │
│  │  │  Gateway    │    │  HTTPRoute   │    │ Middleware  │  (Rate limit, auth) │   │
│  │  │  (Platform) │    │  (App Team)  │    │  (Shared)   │                     │   │
│  │  └─────────────┘    └─────────────┘    └─────────────┘                     │   │
│  └─────────────────────────────────────────────────────────────────────────────┘   │
│                                          │                                          │
│  ┌─────────────────────────────────────────────────────────────────────────────┐   │
│  │                        Cilium Service Mesh (eBPF)                            │   │
│  │  • mTLS between services    • L7 policies    • Traffic management           │   │
│  └─────────────────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────────────────┐
│                         KUBERNETES CLUSTER (VMware VMs)                             │
│                                                                                     │
│  ┌─────────────────────────────────────────────────────────────────────────────┐   │
│  │                     CONTROL PLANE (3 VMs, HA)                                │   │
│  │  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐                     │   │
│  │  │  CP Node 1  │◄──►│  CP Node 2  │◄──►│  CP Node 3  │                     │   │
│  │  │  API Server │ etcd│  API Server │ etcd│  API Server │ etcd                │   │
│  │  │  Scheduler  │Quorum│  Scheduler │Quorum│  Scheduler │Quorum               │   │
│  │  │  Controller │    │  Controller │    │  Controller │                     │   │
│  │  └─────────────┘    └─────────────┘    └─────────────┘                     │   │
│  └─────────────────────────────────────────────────────────────────────────────┘   │
│                                          │                                          │
│  ┌─────────────────────────────────────────────────────────────────────────────┐   │
│  │                     Cilium CNI (eBPF Data Plane)                           │   │
│  │  • High-performance networking  • Network policies  • Hubble observability    │   │
│  │  • Cluster mesh  • Bandwidth manager  • NodePort acceleration               │   │
│  └─────────────────────────────────────────────────────────────────────────────┘   │
│                                          │                                          │
│  ┌─────────────────────────────────────────────────────────────────────────────┐   │
│  │                     WORKER NODES (N+1 VMs)                                   │   │
│  │  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐                     │   │
│  │  │ Worker 1    │    │ Worker 2    │    │ Worker N    │                     │   │
│  │  │ ┌─────────┐ │    │ ┌─────────┐ │    │ ┌─────────┐ │                     │   │
│  │  │ │ Kubelet │ │    │ │ Kubelet │ │    │ │ Kubelet │ │                     │   │
│  │  │ │Containerd│ │    │ │Containerd│ │    │ │Containerd│ │                     │   │
│  │  │ │  Pod    │ │    │ │  Pod    │ │    │ │  Pod    │ │                     │   │
│  │  │ │  Pod    │ │    │ │  Pod    │ │    │ │  Pod    │ │                     │   │
│  │  │ └─────────┘ │    │ └─────────┘ │    │ └─────────┘ │                     │   │
│  │  │ Longhorn    │◄──►│ Longhorn    │◄──►│ Longhorn    │  (Replicated storage)│   │
│  │  │ (Local disk)│    │ (Local disk)│    │ (Local disk)│                     │   │
│  │  └─────────────┘    └─────────────┘    └─────────────┘                     │   │
│  └─────────────────────────────────────────────────────────────────────────────┘   │
│                                                                                     │
│  ┌─────────────────────────────────────────────────────────────────────────────┐   │
│  │                     INTERNAL LOAD BALANCING (MetalLB)                        │   │
│  │  • LoadBalancer services on bare metal  • BGP or Layer 2 announcement        │   │
│  │  • Automatic failover  • Shared IPs across services                          │   │
│  └─────────────────────────────────────────────────────────────────────────────┘   │
│                                                                                     │
│  ┌─────────────────────────────────────────────────────────────────────────────┐   │
│  │                     GITOPS & CI/CD                                           │   │
│  │  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐                     │   │
│  │  │   ArgoCD    │    │ GitLab      │    │   Nexus     │                     │   │
│  │  │  (GitOps)   │◄──►│  (Source)   │◄──►│ (Artifacts) │                     │   │
│  │  │  Syncs Git  │    │  CI/CD      │    │ Docker/Helm │                     │   │
│  │  │  to Cluster │    │  Pipelines  │    │ npm/Maven   │                     │   │
│  │  └─────────────┘    └─────────────┘    └─────────────┘                     │   │
│  └─────────────────────────────────────────────────────────────────────────────┘   │
│                                                                                     │
│  ┌─────────────────────────────────────────────────────────────────────────────┐   │
│  │                     SECURITY & IDENTITY                                      │   │
│  │  ┌─────────────┐    ┌─────────────┐                                        │   │
│  │  │ Dex (OIDC)  │    │ Cert-Manager│                                        │   │
│  │  │  • LDAP     │    │  • Let's    │                                        │   │
│  │  │  • Okta     │    │    Encrypt  │                                        │   │
│  │  │  • Azure AD │    │  • Vault    │                                        │   │
│  │  │  • Groups   │    │  • Auto     │                                        │   │
│  │  │    mapping  │    │    renewal  │                                        │   │
│  │  └─────────────┘    └─────────────┘                                        │   │
│  └─────────────────────────────────────────────────────────────────────────────┘   │
│                                                                                     │
│  ┌─────────────────────────────────────────────────────────────────────────────┐   │
│  │                     OBSERVABILITY (LGTM Stack)                               │   │
│  │                                                                               │   │
│  │  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐   │   │
│  │  │   Grafana   │◄──►│   Mimir     │◄──►│    Loki     │◄──►│   Tempo     │   │   │
│  │  │  (Unified   │    │  (Metrics)  │    │  (Logs)     │    │  (Traces)   │   │   │
│  │  │   Dashboard)│    │             │    │             │    │             │   │   │
│  │  └─────────────┘    └─────┬───────┘    └─────┬───────┘    └─────┬───────┘   │   │
│  │                           │                  │                  │           │   │
│  │                    ┌──────┴──────┐    ┌──────┴──────┐    ┌──────┴──────┐   │   │
│  │                    │  Prometheus │    │  Promtail   │    │ OpenTelemetry│   │   │
│  │                    │   Agent     │    │  (DaemonSet)│    │  Collector   │   │   │
│  │                    │  (Scraping) │    │  (Log ship) │    │  (Ingestion) │   │   │
│  │                    └─────────────┘    └─────────────┘    └─────────────┘   │   │
│  │                                                                               │   │
│  │  Supporting: Node Exporter, cAdvisor, kube-state-metrics, Alertmanager       │   │
│  └─────────────────────────────────────────────────────────────────────────────┘   │
│                                                                                     │
└─────────────────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────────────────┐
│                         VMWARE vSPHERE ENVIRONMENT                                    │
│                                                                                     │
│  ┌─────────────────────────────────────────────────────────────────────────────┐   │
│  │                     vCENTER MANAGEMENT                                         │   │
│  │  • VM provisioning  • vMotion/DRS  • Resource pools  • Templates             │   │
│  └─────────────────────────────────────────────────────────────────────────────┘   │
│           │                              │                              │           │
│           ▼                              ▼                              ▼           │
│  ┌─────────────────┐            ┌─────────────────┐            ┌─────────────────┐   │
│  │  ESXi Host 1    │◄──vMotion──►│  ESXi Host 2    │◄──vMotion──►│  ESXi Host N    │   │
│  │  (3-5 nodes)    │    DRS     │  (3-5 nodes)    │    DRS     │  (3-5 nodes)    │   │
│  │  • CPU/RAM/Disk │            │  • CPU/RAM/Disk │            │  • CPU/RAM/Disk │   │
│  │  • VM Anti-     │            │  • VM Anti-     │            │  • VM Anti-     │   │
│  │    Affinity     │            │    Affinity     │            │    Affinity     │   │
│  │  • HA/FT        │            │  • HA/FT        │            │  • HA/FT        │   │
│  └─────────────────┘            └─────────────────┘            └─────────────────┘   │
│                                                                                     │
│  Per-host sizing: 128GB+ RAM, shared storage (vSAN/VMFS), 10Gbps network           │
└─────────────────────────────────────────────────────────────────────────────────────┘
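
The control plane box in the diagram shows three members forming an etcd quorum. The arithmetic behind that choice can be sketched as follows; the function names are illustrative only.

```shell
#!/bin/sh
# Majority quorum for an etcd cluster of n members: floor(n/2) + 1.
etcd_quorum() { echo $(( $1 / 2 + 1 )); }

# Members that can fail while a quorum still remains: n - quorum.
etcd_fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 3 5; do
  echo "members=$n quorum=$(etcd_quorum "$n") tolerates=$(etcd_fault_tolerance "$n")"
done
```

An even member count adds no resilience: four members need a quorum of three and still tolerate only one failure, which is why odd sizes are the usual choice.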

Data Flow Summary

| Step | Action | Component |
|---|---|---|
| 1 | Developer pushes code to GitLab | GitLab |
| 2 | GitLab CI builds the image, runs tests, and scans for vulnerabilities | GitLab Runners |
| 3 | CI pushes the Docker image to Nexus | Nexus Repository |
| 4 | CI updates the deployment manifest (image tag) in Git | GitLab |
| 5 | ArgoCD detects the Git change and syncs it to the cluster | ArgoCD |
| 6 | Traefik routes external traffic to the new pods | Traefik + Gateway API |
| 7 | Cilium enforces network policies and encrypts traffic | Cilium |
| 8 | MetalLB assigns a stable IP to the service | MetalLB |
| 9 | HAProxy + Keepalived distributes traffic from the internet | HAProxy + Keepalived |
| 10 | Longhorn persists data with 3-way replication | Longhorn |
| 11 | LGTM stack collects metrics, logs, and traces | Mimir, Loki, Tempo, Grafana |
| 12 | Cert-Manager issues and renews TLS certificates | Cert-Manager |
| 13 | Dex authenticates users against corporate LDAP | Dex |
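
Step 4 is the hand-off between CI and GitOps: the pipeline changes only the image tag in Git, and ArgoCD does the rest. A minimal sketch of that edit is shown below, assuming the manifest contains a plain `image: <name>:<tag>` line. `set_image_tag`, the file path, and the image name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the tag of one image reference in a manifest file.
# $1 = manifest path, $2 = image name without tag, $3 = new tag.
set_image_tag() {
  manifest="$1"; image="$2"; tag="$3"
  # Match "image: <image>:<old-tag>" and replace only the tag.
  # -i.bak keeps the call portable across GNU and BSD sed; the backup is removed afterwards.
  sed -i.bak -E "s|(image:[[:space:]]*${image}):[^[:space:]\"']+|\1:${tag}|" "$manifest" \
    && rm -f "${manifest}.bak"
}

# Usage inside a CI job (paths and names are placeholders):
#   set_image_tag deploy/api.yaml nexus.example.internal/demo/api "$CI_COMMIT_SHORT_SHA"
#   git commit -am "Update api image tag" && git push
```

Tools such as Kustomize or Helm values files can serve the same purpose; the plain `sed` form is used here only to keep the sketch dependency-free.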

Summary Matrix

| Layer | Component | Why Required | Managed By |
|---|---|---|---|
| Provisioning | Kubeadm / RKE2 / CAPI | Cluster existence, upgrades, node lifecycle | Platform / Infrastructure Team |
| Hardware | VMware vSphere + ESXi | Virtualization, HA, vMotion, resource management | Infrastructure Team |
| Networking | Cilium | eBPF performance, security policies, observability | Platform Team |
| External LB | HAProxy + Keepalived | VIP failover, Layer 4/7 load balancing | Infrastructure Team |
| Internal LB | MetalLB | Kubernetes LoadBalancer services on bare metal | Platform / Network Team |
| Ingress | Traefik + Gateway API | HTTP routing, TLS termination, traffic splitting | Platform Team |
| Storage | Longhorn | Stateful workloads, data durability, backups | Platform Team |
| Security | Cert-Manager | Automated TLS lifecycle | Platform Team |
| Identity | Dex + RBAC | SSO, audit, team isolation | Security / Platform Team |
| Artifacts | Nexus Repository | Docker images, Helm charts, package caching | DevOps / Platform Team |
| GitOps | ArgoCD | Declarative deployments, drift correction | Platform Team |
| CI/CD | GitLab + Runners | Build automation, security scanning | DevOps Team |
| Observability | LGTM Stack | MTTR reduction, SLO compliance | Platform / SRE Team |

Further Reading