Chapter 50: Kubernetes for AI Services

Docker gives you portable containers. Kubernetes orchestrates them—handling deployment, scaling, networking, and self-healing automatically. When your agent container crashes, Kubernetes restarts it. When traffic spikes, Kubernetes scales it. When you push a new version, Kubernetes rolls it out safely.

This chapter teaches Kubernetes comprehensively, covering everything from core primitives to production-ready patterns for AI services. You'll learn the declarative model that makes Kubernetes powerful, understand how the control plane manages desired state, and master the kubectl commands that make operations automated and repeatable.

By the end, your containerized FastAPI agent from Chapter 49 will be running on a Kubernetes cluster with proper health checks, resource management, security policies, and automated scaling—all without manual intervention.

What You'll Learn

By the end of this chapter, you'll be able to:

Understand Kubernetes architecture: Control plane, worker nodes, and the declarative model
Deploy containers on Kubernetes: Pods, Deployments, ReplicaSets, Jobs, CronJobs, and lifecycle management
Use advanced Pod patterns: Init containers for setup, sidecars for logging and monitoring
Organize workloads: Namespaces for isolation, resource quotas for multi-tenancy
Expose services: ClusterIP, NodePort, LoadBalancer, Ingress for HTTP routing
Configure applications: ConfigMaps for configuration, Secrets for sensitive data
Persist data: Persistent Volumes, Claims, StorageClasses, StatefulSets for stateful workloads
Manage resources: CPU/memory requests and limits, Horizontal Pod Autoscaler
Secure deployments: RBAC, SecurityContext, NetworkPolicy, Pod Security Standards
Monitor health: Liveness, readiness, and startup probes
Use kubectl-ai: AI-assisted manifest generation and cluster operations
Package with Helm: Charts, templates, and release management
Deploy agents at scale: Rolling updates, horizontal scaling, self-healing patterns

Chapter Structure

Lesson	Title	Layer	Focus
1	Kubernetes Architecture & the Declarative Model	L1	Control plane, workers, desired vs observed state
2	Setting Up Minikube	L1	Installation, cluster creation, kubectl context
3	Pods: The Atomic Unit	L1	Pod anatomy, YAML, lifecycle, multi-container
4	Deployments: Self-Healing at Scale	L1	ReplicaSets, rolling updates, rollbacks
5	Services & Networking	L1	ClusterIP, NodePort, LoadBalancer, DNS
6	Init Containers: Preparing Your Environment	L1	Initialization patterns, dependency setup
7	Sidecar Containers: Your Agent's Best Friend	L1	Native sidecars (K8s 1.28+), logging, metrics
8	Namespaces: Virtual Clusters	L1	Isolation, ResourceQuotas, LimitRanges
9	Ingress: Exposing Your Agent to the World	L1	Path/host routing, TLS, annotations
10	Service Discovery Deep Dive	L1	CoreDNS, FQDN, headless services
11	ConfigMaps & Secrets	L1	Configuration injection, security notes
12	Persistent Storage: PV and PVC	L1	Storage lifecycle, access modes, StorageClass
13	StatefulSets: When Your Agent Needs Identity	L1	Stable identity, volumeClaimTemplates
14	Resource Management & Debugging	L1	Requests/limits, QoS, kubectl debug
15	Horizontal Pod Autoscaler	L1	Metrics-server, CPU/memory scaling, behavior
16	RBAC: Securing Agent Deployments	L1	ServiceAccount, Role, RoleBinding
17	Kubernetes Security for AI Services	L1	SecurityContext, NetworkPolicy, Pod Security
18	Health Checks & Probes	L1	Liveness, readiness, startup probes
19	Jobs & CronJobs: Batch Workloads	L1	One-time tasks, scheduled jobs, parallelism
20	AI-Assisted K8s with kubectl-ai	L2	Natural language to manifests, debugging
21	Helm Charts for AI Agent Packaging	L1	Charts, templates, releases
22	Capstone: Deploy Your Agent to Kubernetes	L4	Spec-driven deployment, full validation
23	Building the Kubernetes Deployment Skill	L3	Persona + Questions + Principles

4-Layer Teaching Progression

This chapter follows the 4-Layer Teaching Method:

Lessons 1-19, 21 (Layer 1): Build mental models of Kubernetes concepts manually before AI assistance. This includes core primitives (Pods, Deployments, Services), advanced patterns (init containers, sidecars, StatefulSets), networking (Ingress, service discovery), storage, security (RBAC, NetworkPolicy), autoscaling, batch workloads (Jobs/CronJobs), and Helm packaging.
Lesson 20 (Layer 2): Collaborate with kubectl-ai using Three Roles to translate natural language into cluster operations. By this point, you have deep Kubernetes knowledge to evaluate AI-generated manifests critically.
Lesson 22 (Layer 4): Apply all lessons in a spec-driven capstone project. Deploy your Part 6 FastAPI agent with proper configuration, security, health checks, and resource management.
Lesson 23 (Layer 3): Create reusable intelligence—a Kubernetes deployment skill that compounds across cloud-native projects.

Prerequisites

Chapter 49 completion: A containerized FastAPI agent pushed to a registry—this is the container you'll deploy to Kubernetes
Docker familiarity: You should understand images, containers, and how to run containers locally
Basic command-line comfort: You should be comfortable running commands in a terminal and navigating file systems
No Kubernetes experience required: Lesson 1 explains cluster architecture and the declarative model from scratch

Your Part 6 Agent: The Thread Through This Chapter

Throughout this chapter, we deploy your Part 6 FastAPI agent to Kubernetes:

Lessons 1-19: Learn Kubernetes concepts using your agent as the running example—from basic Pod deployment to production-ready configurations with security, autoscaling, and health checks
Lesson 20: Use kubectl-ai to generate deployment manifests and troubleshoot cluster issues collaboratively
Lesson 21: Package your agent as a Helm chart for repeatable, environment-specific deployments
Lesson 22 (Capstone): Deploy your containerized Part 6 agent to a Kubernetes cluster with all production patterns applied
Lesson 23: Create a reusable skill for future Kubernetes deployment work

By the end, your agent runs on an orchestrated cluster with automatic scaling, self-healing, proper security, and rolling updates.

Looking Ahead

This chapter produces a deployed agent and a reusable Kubernetes deployment skill. Chapter 51 (Kafka for AI Events) builds on this foundation with event-driven architectures for AI agent communication.

What You'll Learn​

Chapter Structure​

4-Layer Teaching Progression​

Prerequisites​

Your Part 6 Agent: The Thread Through This Chapter​

Looking Ahead​