The Server
Your request arrives at its destination
After traversing the entire stack, your request hits the origin — load balancer → reverse proxy → application server → business logic → database → response.
How It Works
The server side is where your request finally produces a result. In a typical production setup, the request first hits a load balancer (AWS ALB, Cloudflare, HAProxy) that distributes traffic across multiple server instances using algorithms like round-robin, least-connections, or consistent hashing. The load balancer may also terminate TLS, offloading encryption from your application servers.
Behind the load balancer, a reverse proxy (Nginx, Caddy, Envoy) routes the request to the correct application based on the URL path or hostname. Your application code runs in a runtime (Node.js, Go, Python, Java, Rust) — it parses the request, validates authentication, executes business logic, queries databases or caches, and assembles a response. The entire server-side processing should ideally complete in under 100 ms for API endpoints and under 200 ms for HTML page generation.
The Signal Flow
Client → Load Balancer → Reverse Proxy → Application Server → Business Logic → Database → Response
Key Concepts
Load Balancing
Distributing requests across multiple servers for reliability and throughput. Layer 4 (TCP) balancers route by IP/port; Layer 7 (HTTP) balancers can route by URL path, headers, or cookies. Health checks remove failed servers automatically.
Reverse Proxy
An intermediary server that accepts client connections and forwards them to backend applications. It handles TLS termination, request routing, rate limiting, caching static assets, and buffering slow clients.
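Of those responsibilities, rate limiting is the easiest to sketch. A common approach (one of several; the capacity and refill numbers here are invented for illustration) is a token bucket per client:

```python
import time

# Token-bucket rate limiter sketch: each client gets a bucket that refills
# at a steady rate; a request spends one token or is rejected.
class TokenBucket:
    def __init__(self, capacity=10, refill_per_sec=5, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.clock = clock          # injectable for testing
        self.last = clock()

    def allow(self):
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the proxy would answer 429 Too Many Requests
```

Allowing short bursts up to `capacity` while enforcing a steady average rate is what distinguishes this from a fixed-window counter.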
Connection Pooling
Applications maintain pools of pre-established database connections. Opening a new PostgreSQL connection takes ~50 ms; reusing a pooled connection takes ~0.1 ms. Connection pools (PgBouncer, built-in) are essential for performance.
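The core of a pool is small: pre-open N connections, hand them out, and take them back. A minimal sketch, where `connect` stands in for the expensive ~50 ms operation of opening a real database connection:

```python
import queue

# Minimal connection pool: connections are created once up front and
# recycled, so the per-request cost is a queue operation, not a handshake.
class Pool:
    def __init__(self, connect, size=5):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(connect())  # pay the connection cost once, at startup

    def acquire(self):
        return self._q.get()  # blocks if all connections are checked out

    def release(self, conn):
        self._q.put(conn)
```

Blocking in `acquire` when the pool is exhausted is deliberate: it applies backpressure instead of opening unbounded connections and overwhelming the database.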
Time to First Byte (TTFB)
The time from the client sending the request to receiving the first byte of the response. It includes all server-side processing plus network latency. Target: under 200 ms for most web applications.
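A back-of-envelope budget shows how the 200 ms target decomposes. The individual numbers below are illustrative assumptions, not measurements from the text (which gives only the 100 ms server-side and 200 ms overall targets):

```python
# Rough TTFB budget in milliseconds. Each entry is an assumed typical
# value; only the totals relate to the targets stated above.
budget_ms = {
    "dns_lookup": 20,
    "tcp_handshake": 30,       # one network round trip
    "tls_handshake": 30,       # TLS 1.3 adds one more round trip
    "server_processing": 90,   # within the <100 ms server-side target
    "first_byte_transit": 15,  # one-way latency for the response's first byte
}
ttfb_ms = sum(budget_ms.values())
assert ttfb_ms <= 200  # fits the document's 200 ms target
```

The split makes the point that even a fast server can blow the budget if the network legs (DNS, TCP, TLS) are slow — which is why load balancers terminate TLS close to the user.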
Deep Dive
The request lifecycle in a web framework
Most web frameworks process requests through a middleware pipeline. Each middleware (logging, auth, rate limiting, CORS) wraps the next, forming an onion-like structure. The request passes inward through each layer, hits the route handler (your business logic), and the response passes outward through the same layers in reverse. Error handling middleware catches exceptions and converts them to appropriate HTTP error responses. This pattern (Express, Koa, Gin, Axum, Django, Rails) is universal.
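The onion structure described above can be sketched as functions that wrap the next handler. The middleware names and request shape here are illustrative, not any specific framework's API:

```python
# Each middleware takes the next handler and returns a wrapped handler.
# The request flows inward; the response flows back out in reverse order.
def logging_middleware(next_handler):
    def handler(request):
        print(f"--> {request['path']}")        # inbound pass
        response = next_handler(request)
        print(f"<-- {response['status']}")     # outbound pass
        return response
    return handler

def auth_middleware(next_handler):
    def handler(request):
        if request.get("user") is None:
            return {"status": 401, "body": "unauthorized"}  # short-circuit
        return next_handler(request)
    return handler

def route_handler(request):
    # Innermost layer: the business logic.
    return {"status": 200, "body": f"hello {request['user']}"}

# Compose outermost-first, as Express- or Koa-style frameworks do.
app = logging_middleware(auth_middleware(route_handler))
```

Note how `auth_middleware` can short-circuit: an unauthenticated request never reaches the route handler, yet still passes back out through the logging layer.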
Database query patterns
The database is usually the bottleneck. SQL queries go through: parsing → planning (query optimizer examines indexes and statistics to choose the fastest execution plan) → execution → result serialization. An ORM (Prisma, Drizzle, SQLAlchemy) generates SQL from your code. The key performance lever is indexes: a query scanning 1M rows without an index takes seconds; the same query with an index takes milliseconds. N+1 queries (one query per list item) are the most common ORM performance antipattern.
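The N+1 antipattern and its fix can be shown concretely with an in-memory SQLite database; the schema and rows below are invented for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO posts VALUES (1, 1, 'Engines'), (2, 1, 'Notes'), (3, 2, 'COBOL');
""")

# N+1: one query for the list, then one query per row -- N extra round trips,
# which is exactly what a naive ORM loop generates.
authors = db.execute("SELECT id, name FROM authors").fetchall()
n_plus_1 = [
    (name, db.execute(
        "SELECT title FROM posts WHERE author_id = ?", (aid,)).fetchall())
    for aid, name in authors
]

# Fix: a single JOIN fetches the same data in one round trip. An index on
# posts.author_id would keep this fast as the table grows.
joined = db.execute("""
    SELECT a.name, p.title FROM authors a
    JOIN posts p ON p.author_id = a.id
""").fetchall()
```

With 2 authors the difference is 3 queries versus 1; with a thousand list items it is 1001 versus 1, which is why ORMs expose eager-loading options to emit the JOIN for you.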