DNS Load Balancing: How It Works

DNS load balancing uses multiple IP addresses for a single domain to distribute traffic.

DNS load balancing is a technique that distributes incoming network traffic across multiple servers by returning different IP addresses in response to DNS queries. It is one of the oldest and most widely used methods for scaling web services, improving availability, and reducing the impact of individual server failures, all without requiring any special software on the client side.

What Is DNS Load Balancing

DNS load balancing works by configuring multiple DNS records for the same domain name, each pointing to a different server. When a client looks up the domain, the DNS resolver returns one or more of these IP addresses. Different clients receive different addresses, spreading their connections across the available pool of servers rather than sending all traffic to a single machine.

The simplest form of DNS load balancing is round robin DNS, where the DNS server cycles through a list of IP addresses and returns them in rotation. The first client receives the first IP, the second client receives the second IP, the third receives the third, and the cycle repeats. This distributes requests roughly evenly across the server pool, but without any knowledge of how busy each server is or whether any of them are healthy.

DNS load balancing differs from hardware and software load balancers in a fundamental way. A traditional load balancer sits in the network path and actively proxies every connection through itself, making intelligent routing decisions in real time based on server health, active connection counts, and response times. DNS load balancing operates one step earlier, at the name resolution stage, before any connection to a server is established. Once a client has resolved a domain to an IP address, its connection goes directly to that server without any intermediary involved in the ongoing exchange.

How DNS Load Balancing Works

The mechanics of DNS load balancing rely on the fact that a single domain name can have multiple A records, each pointing to a different IP address. When a resolver queries the authoritative nameserver for the domain, the server returns all or a subset of these addresses. The client then connects to one of the returned addresses, typically the first one listed.

1. Multiple A records are configured for the same domain, each pointing to a different server IP address.
2. A client device queries its DNS resolver for the domain name.
3. The resolver queries the authoritative nameserver for the domain.
4. The authoritative nameserver may reorder the list of addresses based on the configured distribution method, then returns multiple IP addresses in its response.
5. The client receives the response from its resolver and typically connects to the first IP address in the list.
6. The next client that queries the same domain receives the addresses in a different order, so it connects to a different server.
7. Over many requests from many clients, traffic is distributed across all IP addresses in the record set.

Multiple A records for DNS load balancing:

techyall.com.    300    IN    A    203.0.113.10
techyall.com.    300    IN    A    203.0.113.11
techyall.com.    300    IN    A    203.0.113.12

; Client 1 receives: 203.0.113.10, 203.0.113.11, 203.0.113.12
; Client 2 receives: 203.0.113.11, 203.0.113.12, 203.0.113.10
; Client 3 receives: 203.0.113.12, 203.0.113.10, 203.0.113.11
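
The rotation shown in the comments above can be sketched in a few lines of Python. This is only an illustration of the ordering behaviour, not how any particular DNS server implements it; the helper name `rotated_answers` is hypothetical.

```python
from collections import deque

RECORDS = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]

def rotated_answers(records, queries):
    """Yield the full record set once per query, rotated left by one each time."""
    ring = deque(records)
    for _ in range(queries):
        yield list(ring)
        ring.rotate(-1)  # move the first address to the end for the next client

for n, answer in enumerate(rotated_answers(RECORDS, 3), start=1):
    print(f"Client {n} receives: {', '.join(answer)}")
```

Every client still receives all three addresses; only the order changes, which matters because most clients try the first address.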

The TTL value on these records is critically important in DNS load balancing. A short TTL such as 30 to 300 seconds means clients re-query DNS frequently, which allows the DNS server to redistribute traffic more dynamically and respond quickly to server failures by removing unhealthy addresses from the response. A long TTL means clients cache the address for longer, reducing DNS query volume but making it slower to redirect traffic away from a failed server.
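
To make the TTL trade-off concrete, here is a minimal sketch of a caching client or resolver. The `TtlCache` class, the injected clock, and the 300-second TTL are assumptions chosen for illustration; real resolvers are considerably more involved.

```python
class TtlCache:
    """Caches one answer per name until its TTL expires (clock is injectable)."""

    def __init__(self, lookup, clock):
        self._lookup = lookup    # function: name -> (address, ttl_seconds)
        self._clock = clock      # function: () -> current time in seconds
        self._entries = {}       # name -> (address, expires_at)

    def resolve(self, name):
        now = self._clock()
        cached = self._entries.get(name)
        if cached and now < cached[1]:
            return cached[0]     # still fresh: no new DNS query is made
        address, ttl = self._lookup(name)
        self._entries[name] = (address, now + ttl)
        return address

now = [0]                                   # fake clock, in seconds
current = {"address": "203.0.113.10"}       # what the DNS server would answer
cache = TtlCache(lambda name: (current["address"], 300), lambda: now[0])

first = cache.resolve("techyall.com")       # answer cached for 300 s
current["address"] = "203.0.113.11"         # operator removes .10 from DNS
now[0] = 299
stale = cache.resolve("techyall.com")       # cache still returns the old address
now[0] = 300
fresh = cache.resolve("techyall.com")       # TTL expired, new answer fetched
print(first, stale, fresh)
```

With a 30-second TTL the stale window in this sketch would shrink accordingly, at the cost of ten times as many upstream queries.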

DNS Load Balancing Methods

Several different distribution strategies can be applied at the DNS layer, ranging from simple rotation through geographically aware routing to health-checked failover. Modern DNS load balancing services support multiple methods that can be combined to build sophisticated traffic management policies.

Method	How It Works	Best For	Limitation
Round Robin	Cycles through IP addresses in order, returning each one in rotation to successive clients	Simple even distribution across identical servers	No awareness of server health or load, sends traffic to unhealthy servers
Weighted Round Robin	Assigns a weight to each server so higher-capacity servers receive a proportionally larger share of traffic	Mixed server pools where some machines have more capacity than others	Still no health awareness, weights must be set manually
Geolocation Routing	Returns the IP address of the server geographically closest to the client based on the resolver's location	Global services that want to serve users from nearby infrastructure for lower latency	Resolver location may not match user location, especially with public resolvers
Latency-Based Routing	Returns the IP address of the server that has the lowest measured latency to the querying resolver	Performance-sensitive global services where minimising response time is critical	Requires ongoing latency measurements between servers and resolver locations
Failover Routing	Returns a primary server address normally and falls back to a secondary address when health checks detect the primary is down	High availability setups where a backup server should take over automatically	Failover speed is limited by TTL duration and DNS propagation time
Health-Checked Round Robin	Combines round-robin distribution with active health checks that remove unhealthy servers from the rotation	Production services that need both even distribution and automatic removal of failed servers	More complex to configure, requires a managed DNS service with health check support
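
As an illustration of the weighted round robin row, the sketch below uses the "smooth" weighted round robin variant, which interleaves picks deterministically. This is a swapped-in technique chosen because it is easy to verify; a managed DNS service may instead pick at random in proportion to weight. The weights and the function name are assumptions.

```python
from collections import Counter

# Hypothetical weights: the first server has three times the capacity.
WEIGHTS = {"203.0.113.10": 3, "203.0.113.11": 1, "203.0.113.12": 1}

def weighted_rotation(weights, picks):
    """Smooth weighted round robin: choose `picks` addresses in proportion to weight."""
    current = {addr: 0 for addr in weights}
    total = sum(weights.values())
    order = []
    for _ in range(picks):
        for addr, weight in weights.items():
            current[addr] += weight              # every server gains its weight
        chosen = max(current, key=current.get)   # highest running score wins
        current[chosen] -= total                 # winner pays back the total
        order.append(chosen)
    return order

order = weighted_rotation(WEIGHTS, 5)
print(order)
print(Counter(order))
```

Over one full cycle of five picks, each address is chosen exactly as many times as its weight.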

Round Robin DNS in Detail

Round robin DNS is the simplest and most widely deployed form of DNS load balancing. It requires no special DNS software beyond the ability to add multiple A records for the same hostname. The DNS server returns the full list of IP addresses with each query but rotates the order so different clients receive the same addresses in a different sequence.

Because most clients connect to the first IP address in the list they receive, rotating the order effectively distributes connections across all servers over time. Over a large number of clients and queries, round robin DNS produces a roughly even distribution, assuming all clients behave the same way and the TTL is short enough that caching does not concentrate too many clients on the same address for extended periods.
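
The caveat about caching can be illustrated with a deliberately simplified model: within one TTL window, each resolver holds a single rotated answer and every client behind it uses the first address. The resolver names and client counts below are invented for the example.

```python
from collections import Counter

SERVERS = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]
# Hypothetical resolvers and the number of clients behind each one.
RESOLVER_CLIENTS = {"isp-a": 5000, "isp-b": 300, "office": 20}

def load_per_server(servers, resolver_clients):
    """Each resolver caches one rotated answer; its clients all use the first address."""
    load = Counter({server: 0 for server in servers})
    for i, clients in enumerate(resolver_clients.values()):
        first_address = servers[i % len(servers)]  # head of the i-th rotated answer
        load[first_address] += clients
    return load

print(load_per_server(SERVERS, RESOLVER_CLIENTS))
```

The rotation is perfectly even per query, yet in this model one server carries 5000 clients and another only 20, because the rotation happens per resolver rather than per client.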

The key weakness of round robin DNS is its complete lack of health awareness. If one of the servers in the pool goes down, the DNS server continues returning its IP address in the rotation. Clients that receive that address will attempt to connect and fail, experiencing errors or timeouts until their cached address expires and they receive a different one on the next query. This makes plain round robin DNS unsuitable for any service where availability is critical, unless combined with health checks at the DNS layer or with a separate monitoring system that updates the DNS records when failures are detected.
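
A separate monitoring step of the kind described above might look like the following sketch. The TCP probe on port 80, the two-second timeout, and the decision to fail open (return the whole pool when every probe fails) are all assumptions; the function names are hypothetical.

```python
import socket

def tcp_probe(address, port=80, timeout=2.0):
    """Return True if a TCP connection to address:port succeeds within the timeout."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False

def healthy_answers(pool, probe=tcp_probe):
    """Keep only addresses that pass the probe; fail open if none pass."""
    alive = [addr for addr in pool if probe(addr)]
    return alive or list(pool)

pool = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]
# A stand-in probe that pretends .11 is down, so the example runs offline.
print(healthy_answers(pool, probe=lambda addr: addr != "203.0.113.11"))
```

Failing open is a design choice: if the monitor itself is broken, serving possibly dead addresses is usually preferable to serving none.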

Geolocation and Latency-Based Routing

For globally distributed services serving users across multiple continents, geolocation and latency-based routing extend DNS load balancing beyond simple distribution into intelligent traffic steering. Instead of distributing traffic evenly across all servers regardless of where the user is located, these methods route each user to the server or data centre that is most appropriate for their geographic location.

Geolocation routing works by mapping the IP address of the DNS resolver making the query to a geographic location. If the resolver is in Europe, the DNS server returns the IP address of the European data centre. If the resolver is in Asia, it returns the Asian data centre's address. This reduces latency for users by ensuring they connect to a server that is physically nearby, minimising the number of network hops and the speed-of-light delay inherent in long-distance connections.
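
In code, the resolver-to-region mapping could be sketched as a prefix lookup. The prefix table, region names, and data centre addresses below are hypothetical and use documentation-only IP ranges; real services rely on large, regularly updated geolocation databases.

```python
import ipaddress

# Hypothetical mapping from resolver prefixes to regions.
REGION_BY_PREFIX = {
    ipaddress.ip_network("192.0.2.0/24"): "europe",
    ipaddress.ip_network("198.51.100.0/24"): "asia",
}
# Hypothetical data centre address per region, plus a fallback.
DATACENTRE_IP = {
    "europe": "203.0.113.10",
    "asia": "203.0.113.11",
    "default": "203.0.113.12",
}

def answer_for_resolver(resolver_ip):
    """Return the data centre address for the region the resolver's IP maps to."""
    addr = ipaddress.ip_address(resolver_ip)
    for prefix, region in REGION_BY_PREFIX.items():
        if addr in prefix:
            return DATACENTRE_IP[region]
    return DATACENTRE_IP["default"]  # unknown location: use the fallback

print(answer_for_resolver("192.0.2.53"))
print(answer_for_resolver("198.51.100.7"))
```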

A limitation of geolocation-based routing is that the authoritative DNS server sees the resolver's IP address rather than the end user's IP address. When a user in Tokyo uses a public resolver such as Google's 8.8.8.8, which is served from Google data centres around the world, the resolver's IP may map to a location that does not match the user's actual location. The EDNS Client Subnet extension addresses this by allowing resolvers to include a truncated portion of the client's IP address in DNS queries, giving the authoritative server better information about where the actual user is located. Many major DNS providers and resolvers support this extension, though it introduces privacy trade-offs and some resolvers deliberately omit it for that reason.
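
The "portion of the client's IP address" is a truncated prefix rather than the full address. A minimal sketch of that truncation follows; the 24-bit and 56-bit defaults reflect the lengths recommended in RFC 7871 but should be treated as assumptions here, and `client_subnet` is a hypothetical helper, not part of any DNS library.

```python
import ipaddress

def client_subnet(client_ip, v4_bits=24, v6_bits=56):
    """Truncate a client address to the prefix a resolver might send upstream."""
    addr = ipaddress.ip_address(client_ip)
    bits = v4_bits if addr.version == 4 else v6_bits
    # strict=False zeroes the host bits instead of raising an error.
    return ipaddress.ip_network(f"{addr}/{bits}", strict=False)

print(client_subnet("198.51.100.77"))
print(client_subnet("2001:db8:1234:5678::1"))
```

The authoritative server learns roughly where the user is without seeing the exact address, which is the privacy compromise the extension makes.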

DNS Failover

DNS failover is a specific application of DNS load balancing focused on high availability rather than even traffic distribution. In a failover configuration, a primary server handles all traffic under normal conditions. A health monitoring system continuously checks whether the primary server is responding correctly. If the health check fails, the DNS record is automatically updated to remove the primary server's IP address and return only the secondary server's address, redirecting all new client connections to the backup.
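
The failover decision itself is simple and can be sketched as below. The `FailoverRecord` class and the threshold of three consecutive failed checks are assumptions for illustration; managed DNS services expose their own settings for this.

```python
class FailoverRecord:
    """Answers with the primary until it fails `threshold` health checks in a row."""

    def __init__(self, primary, secondary, threshold=3):
        self.primary = primary
        self.secondary = secondary
        self.threshold = threshold
        self.consecutive_failures = 0

    def record_check(self, passed):
        # A passing check resets the counter; a failing one increments it.
        self.consecutive_failures = 0 if passed else self.consecutive_failures + 1

    def answer(self):
        failed_over = self.consecutive_failures >= self.threshold
        return self.secondary if failed_over else self.primary

record = FailoverRecord("203.0.113.10", "203.0.113.11")
for passed in (False, False, False):
    record.record_check(passed)
print(record.answer())  # the secondary, after three failed checks in a row
```

Requiring several consecutive failures avoids flapping between servers on a single dropped probe, at the cost of slower detection.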

The speed at which DNS failover takes effect is constrained by TTL values and by how long resolvers and clients cache the old answer. When the primary server fails and the DNS record is updated, clients that have already cached the old record will continue attempting to reach the failed server until their cached record expires. A short TTL of 60 to 300 seconds means most clients will receive the updated record quickly after a failure. A long TTL of 3600 seconds or more means clients may continue trying to reach the failed server for up to an hour before their cache expires and they query DNS again.
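
A rough upper bound on failover time can be worked out by adding detection time to the TTL. The model below is a simplification under stated assumptions: detection takes a fixed number of failed checks at a fixed interval, and caches honour the TTL exactly. The 30-second interval and threshold of 3 are invented example values.

```python
def worst_case_failover_seconds(check_interval, failure_threshold, ttl):
    """Upper bound: time to detect the failure plus one full TTL of stale caching."""
    detection = check_interval * failure_threshold
    return detection + ttl

# 30 s checks, 3 failures needed, compared across a short and a long TTL.
print(worst_case_failover_seconds(30, 3, 60))     # 150 seconds
print(worst_case_failover_seconds(30, 3, 3600))   # 3690 seconds
```

In this model the TTL dominates as soon as it exceeds the detection time, which is why lowering it is usually the first tuning step.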

This TTL-based delay means DNS failover is not suitable for applications that need failover within seconds. Hardware load balancers and anycast routing provide faster failover at the cost of more infrastructure. DNS failover is well suited to scenarios where a few minutes of recovery time is acceptable, such as recovering from a data centre outage by redirecting traffic to a backup region.

DNS Load Balancing vs Hardware Load Balancers

DNS load balancing and hardware or software load balancers solve overlapping problems but operate differently and are suited to different scenarios. Understanding the trade-offs helps when deciding which approach fits a given architecture.

Feature	DNS Load Balancing	Hardware / Software Load Balancer
Where it operates	At the DNS resolution stage, before any connection is made	In the network path, actively proxying every connection
Health awareness	Basic with managed DNS health checks, none with plain round robin	Real-time, removes unhealthy backends immediately
Failover speed	Limited by TTL and DNS propagation, typically minutes	Near-instant, typically sub-second
Session persistence	Not reliable, clients may switch servers once a cached answer expires	Sticky sessions can pin a client to one backend
Geographic routing	Native, routes users to the nearest data centre	Requires anycast or global server load balancing solutions
Cost and complexity	Low, often included in managed DNS services	Higher, requires dedicated infrastructure or cloud load balancer
Scalability	Easily distributes traffic across global infrastructure	Can become a bottleneck without sufficient capacity
Visibility into traffic	None, connections go directly to backend servers	Full, can inspect, modify, and log all traffic

In practice, DNS load balancing and hardware load balancers are frequently used together rather than as alternatives. DNS load balancing distributes traffic across multiple data centres or regions at the global level. Within each data centre or region, a hardware or software load balancer distributes traffic across the individual server pool. This two-tier approach combines the geographic distribution and simplicity of DNS load balancing with the real-time health awareness and session management of a dedicated load balancer.
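
The two tiers can be sketched as two small functions. Everything here is hypothetical: the region names, the private backend addresses, the connection counts, the default region, and the choice of least connections as the in-region policy.

```python
# Hypothetical regions: backend address -> current active connections.
REGIONS = {
    "europe": {"10.0.1.10": 12, "10.0.1.11": 4},
    "asia": {"10.0.2.10": 7, "10.0.2.11": 9},
}
DEFAULT_REGION = "europe"

def pick_region(client_region):
    """Tier 1 (DNS): steer to the client's region, or a default if unknown."""
    return client_region if client_region in REGIONS else DEFAULT_REGION

def pick_backend(region):
    """Tier 2 (in-region load balancer): choose the backend with fewest connections."""
    backends = REGIONS[region]
    return min(backends, key=backends.get)

def route(client_region):
    region = pick_region(client_region)
    return region, pick_backend(region)

print(route("asia"))
print(route("antarctica"))  # unknown region falls back to the default
```

The DNS tier decides slowly and coarsely (once per TTL, per region), while the in-region tier decides per connection using live state.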

DNS Load Balancing with Anycast

Anycast is a network routing technique that assigns the same IP address to multiple servers in different locations. When a client sends a packet to an anycast address, the network automatically routes it to the topologically nearest server advertising that address. Anycast is not DNS load balancing itself, but it is closely related and often used alongside it to achieve fast global routing and inherent failover.

Cloudflare's 1.1.1.1 DNS resolver and Google's 8.8.8.8 are both anycast addresses. They are served by hundreds of servers in data centres around the world, all advertising the same IP address. A user in Sydney is automatically routed to a nearby Cloudflare or Google data centre without any DNS trickery, because the network routing infrastructure itself handles the geographic distribution based on BGP route announcements.

Anycast provides very fast failover because it operates at the routing layer rather than the DNS layer. If a data centre goes offline, BGP route announcements from that location stop, and traffic automatically flows to the next nearest location, typically within seconds once routes reconverge. This is considerably faster than DNS-based failover, which is constrained by TTL expiry. For latency-sensitive services that need global distribution, combining anycast IP addresses with DNS load balancing provides both fast geographic routing and flexible traffic management.

Managed DNS Services for Load Balancing

Implementing DNS load balancing beyond simple round robin requires a managed DNS provider that supports health checks, geographic routing, weighted distribution, and failover policies. Several major providers offer these capabilities as part of their DNS products.

Amazon Route 53 is one of the most feature-rich DNS load balancing services available, offering health checks, latency-based routing, geolocation routing, weighted routing, and failover routing as distinct routing policies that can be combined in complex configurations. Route 53 integrates deeply with other AWS services, making it a natural choice for infrastructure hosted on AWS. Cloudflare's DNS service offers load balancing as a separate product with health checks, geographic steering, and traffic weighting through a clean interface. NS1 (now part of IBM) offers advanced traffic management with fine-grained control over routing policies, particularly for enterprise use cases with complex requirements. Dyn, long another option in this space, was acquired by Oracle, which has since retired the standalone Dyn managed DNS service.
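
As a hedged illustration of what a weighted policy looks like in practice, the sketch below builds a change batch in the shape that Route 53's `ChangeResourceRecordSets` API accepts. The set identifiers, weights, and hosted zone ID are hypothetical, and the helper function is not part of any AWS SDK; consult the Route 53 API reference before relying on it.

```python
def weighted_record_change(name, address, set_id, weight, ttl=60, health_check_id=None):
    """Build one UPSERT entry for a weighted A record in the ChangeBatch shape."""
    record_set = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": set_id,  # distinguishes records sharing a name and type
        "Weight": weight,
        "TTL": ttl,
        "ResourceRecords": [{"Value": address}],
    }
    if health_check_id is not None:
        record_set["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record_set}

change_batch = {
    "Changes": [
        weighted_record_change("techyall.com.", "203.0.113.10", "server-a", 3),
        weighted_record_change("techyall.com.", "203.0.113.11", "server-b", 1),
    ]
}
print(change_batch)

# With boto3 installed and credentials configured, this could be submitted as:
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="ZEXAMPLE", ChangeBatch=change_batch)  # zone ID is hypothetical
```

Here `server-a` would receive roughly three quarters of the answers, and attaching a health check ID to each record is what lets the service drop an unhealthy address from responses.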

Frequently Asked Questions

Does DNS load balancing guarantee even traffic distribution?
No. DNS load balancing provides approximate distribution across servers but cannot guarantee perfectly even traffic. Several factors disrupt even distribution. Resolvers cache DNS responses and serve many clients from a single cached answer, meaning all clients behind that resolver connect to the same server until the cache expires. Some clients ignore the order of returned IP addresses. Long TTL values cause sustained concentration of traffic on specific servers. For precise even distribution with real-time balancing, a dedicated load balancer operating at the connection level is required in addition to or instead of DNS-only balancing.
What happens when a server goes down with plain round robin DNS?
With plain round robin DNS and no health checking, the DNS server continues returning the failed server's IP address in the rotation. Clients that receive that address will attempt to connect and experience connection failures or timeouts. They will only stop receiving the failed address when it is manually removed from the DNS records or when a monitoring system detects the failure and removes it automatically. This is the primary limitation of plain round robin DNS and the main reason managed DNS services with built-in health checks are preferred for production services.
Why do TTL values matter so much for DNS load balancing?
TTL controls how long clients and resolvers cache a DNS response before querying again. In DNS load balancing, short TTLs allow faster redistribution of traffic and quicker failover when a server fails, because clients re-query DNS more frequently and pick up changes sooner. Long TTLs reduce DNS query volume and improve performance but make the system slower to respond to changes. For load-balanced services, TTLs of 30 to 300 seconds are common, trading some additional DNS query load for faster reaction to changes. The TTL is the primary lever for balancing responsiveness against DNS infrastructure load.
Can DNS load balancing maintain user sessions across requests?
Not reliably. Because DNS caching means a client may resolve to the same server for the duration of the TTL, there is some natural session affinity within a caching window. However, this is not a reliable session persistence mechanism. After the TTL expires, the client may receive a different IP address and connect to a different server, losing any session state stored locally on the previous server. Applications that require session persistence across requests should store session state in a shared external store such as a database or cache, or use a load balancer that supports sticky sessions, rather than relying on DNS to route the same client to the same server consistently.
What is the difference between DNS load balancing and a CDN?
A CDN, or Content Delivery Network, uses DNS load balancing as one of its core mechanisms to route users to the nearest edge server. When you query a CDN-backed domain, the DNS response directs you to a nearby edge location using geolocation or latency-based routing. However, a CDN goes far beyond DNS load balancing by also caching content at edge locations, terminating TLS connections at the edge, compressing responses, and protecting against DDoS attacks. DNS load balancing is a component of how CDNs work, but a CDN is a complete content delivery platform rather than just a DNS routing mechanism.
Which is better for high availability: DNS failover or a load balancer?
For the fastest possible failover, a dedicated load balancer is better. Load balancers monitor backend health continuously and remove failed servers from rotation within seconds. DNS failover is limited by TTL expiry, meaning clients may continue attempting a failed server for minutes after the failure is detected. For most applications where a few minutes of recovery time is acceptable and geographic distribution is needed, DNS failover is cost-effective and operationally simple. For applications where near-instant failover is critical, a load balancer within each region combined with DNS failover between regions provides both fast local failover and geographic resilience.

Conclusion

DNS load balancing is a foundational technique for distributing traffic across multiple servers and data centres at global scale, using the DNS resolution process itself as the traffic steering mechanism. From simple round robin distribution across identical servers to sophisticated geolocation routing, latency-based steering, and health-checked failover, DNS load balancing provides a flexible and low-cost layer of traffic management that operates before any connection to a server is established. Its limitations around session persistence, failover speed, and distribution precision mean it works best as one layer in a multi-tier architecture, typically combined with dedicated load balancers within each region for real-time health management and connection-level control. Understanding how TTL values, health checks, and routing policies interact gives you the knowledge to design DNS load balancing configurations that are both performant and resilient.
