Packet Loss and Retransmission: How TCP Handles It
Packet loss occurs when data packets fail to reach their destination. TCP automatically detects and retransmits lost packets to ensure complete and accurate data delivery.
Packet loss happens when data travelling across a network never arrives at its destination. TCP has built-in mechanisms to detect this and automatically resend missing data, ensuring nothing is permanently lost. Understanding how these mechanisms work explains a great deal about why network performance degrades under congestion and why some protocols handle loss better than others.
What Is Packet Loss
When data travels from one device to another across a network, it is broken into small units called packets. Each packet travels independently through the network, potentially taking different routes before being reassembled at the destination. If a packet is dropped by a congested router, corrupted in transit, or simply never arrives, that is packet loss.
Packet loss is measured as a percentage of the total packets sent. Mild packet loss under 1% is barely noticeable in most applications because protocols like TCP are designed to detect and recover from it automatically. High packet loss at 5% or more causes visible and frustrating problems including slow page load times, choppy video calls, broken audio, and significant lag in online games where timing is critical.
It is important to understand that packet loss is not always a sign of a broken network. Some level of loss is a normal and expected part of how the internet manages congestion. Routers intentionally drop packets when their queues fill up as a signal to senders to slow down. The protocols built on top of the network layer, particularly TCP, are designed with this reality in mind and handle it gracefully.
Common Causes of Packet Loss
Packet loss can originate at many different points along the path between sender and receiver. Identifying the cause requires narrowing down where in the network the loss is occurring, which tools like traceroute and ping can help with.
| Cause | What Happens |
|---|---|
| Network Congestion | Router queues fill beyond capacity and excess packets are deliberately dropped to manage load and signal senders to slow down |
| Hardware Failure | Faulty cables, worn network switches, or malfunctioning network interface cards silently drop packets without any error signal |
| Wireless Interference | Wi-Fi signals disrupted by physical obstacles, distance from the access point, or competing signals from nearby devices and networks |
| Software Bugs | Driver or firmware issues in network hardware cause packets to be dropped or corrupted before they are transmitted |
| Firewall Rules | Misconfigured firewalls silently discard packets matching a rule without sending any notification to the sender |
| TTL Expiry | Every packet carries a Time-to-Live counter that decrements at each hop. Packets that reach zero are dropped by routers to prevent them looping indefinitely |
| Buffer Bloat | Oversized router buffers cause packets to queue for so long that they effectively arrive too late to be useful, which upper-layer protocols treat similarly to loss |
How TCP Detects Packet Loss
TCP is a reliable transport protocol, meaning it guarantees that data sent from one end will be received correctly and in order at the other end. To provide this guarantee, TCP needs a way to detect when something goes wrong. It uses two complementary mechanisms for this.
- Acknowledgements (ACKs): Every time the receiver successfully receives a segment of data, it sends an acknowledgement back to the sender. The acknowledgement contains a sequence number telling the sender how much data has been received so far. If the sender does not receive an ACK within a calculated timeout period called the Retransmission Timeout (RTO), it assumes the packet was lost and resends it.
- Duplicate ACKs: If packets arrive out of order, the receiver sends a duplicate ACK for the last segment it received in sequence, signalling that there is a gap. Receiving three duplicate ACKs is a strong signal that a specific segment was lost. Rather than waiting for the full timeout to expire, TCP uses this signal to trigger fast retransmission of the missing segment immediately.
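The duplicate-ACK behaviour can be sketched with a toy receiver. This is a hypothetical simplification that numbers whole segments rather than counting bytes the way real TCP ACKs do, but the cumulative-ACK logic is the same:

```python
# A toy receiver that issues cumulative ACKs, duplicating the last ACK
# whenever a segment arrives out of order. (Simplification: one segment
# = one sequence number; real TCP acknowledges byte offsets.)

def receiver_acks(arrivals):
    """Given the order segments arrive in, return the ACK sent for each."""
    expected = 0          # next in-order sequence number we are waiting for
    buffered = set()      # out-of-order segments held until the gap fills
    acks = []
    for seq in arrivals:
        if seq == expected:
            expected += 1
            while expected in buffered:   # deliver any buffered run
                buffered.discard(expected)
                expected += 1
        else:
            buffered.add(seq)
        acks.append(expected)  # cumulative ACK: next segment expected
    return acks

# Segment 2 is lost; segments 3, 4 and 5 each trigger a duplicate ACK
# for 2, and the third duplicate is the fast-retransmit signal.
print(receiver_acks([0, 1, 3, 4, 5]))  # [1, 2, 2, 2, 2]
```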
The Retransmission Timeout is not a fixed value. TCP calculates it dynamically based on the measured round-trip time between the sender and receiver, adjusting it continuously as network conditions change. On a low-latency local network the RTO might be a few milliseconds. On a high-latency satellite connection it could be several seconds. This adaptive calculation helps TCP behave appropriately across a wide range of network conditions.
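The adaptive calculation can be sketched following the smoothing scheme standardised in RFC 6298. The `update_rto` helper is illustrative rather than any stack's actual code; the 200 ms floor mirrors Linux, while the RFC itself suggests a 1 second minimum:

```python
# Adaptive RTO in the style of RFC 6298: track a smoothed RTT (SRTT) and a
# smoothed deviation (RTTVAR), then set RTO = SRTT + 4 * RTTVAR. The gains
# alpha = 1/8 and beta = 1/4 are the RFC's recommended values.

def update_rto(srtt, rttvar, sample, alpha=1/8, beta=1/4, floor=0.2):
    if srtt is None:                 # first measurement seeds both estimators
        srtt, rttvar = sample, sample / 2
    else:
        rttvar = (1 - beta) * rttvar + beta * abs(srtt - sample)
        srtt = (1 - alpha) * srtt + alpha * sample
    return srtt, rttvar, max(srtt + 4 * rttvar, floor)

# Feed in RTT samples (seconds) and watch the RTO adapt to a sudden spike.
srtt = rttvar = None
for sample in [0.100, 0.120, 0.110, 0.300]:
    srtt, rttvar, rto = update_rto(srtt, rttvar, sample)
    print(f"sample={sample:.3f}s  srtt={srtt:.3f}s  rto={rto:.3f}s")
```

Because RTTVAR feeds into the timeout, a jittery path gets a generous RTO while a steady one gets a tight RTO, which is exactly the adaptive behaviour described above.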
How TCP Retransmission Works
TCP supports several different retransmission strategies depending on how loss is detected and how much information is available about which specific segments are missing.
| Mechanism | How It Triggers | Speed |
|---|---|---|
| Timeout Retransmission | No ACK is received within the Retransmission Timeout period after a segment is sent | Slow, must wait for the full RTO to expire before retransmitting |
| Fast Retransmission | Three duplicate ACKs are received, indicating a specific segment is missing | Fast, retransmits the missing segment immediately without waiting for the timeout |
| Selective ACK (SACK) | The receiver tells the sender exactly which segments it has received and which are still missing | Efficient, only the specific missing segments are resent rather than everything from the lost segment onward |
SACK, or Selective Acknowledgement, is a TCP extension that is supported by most modern operating systems and network stacks. Without SACK, when a loss is detected TCP must conservatively resend all data from the lost segment forward, because it has no way of knowing which later segments the receiver may have already buffered. With SACK, the receiver can precisely report gaps in the data it has received, allowing the sender to retransmit only what is actually missing and avoid sending data the receiver already has.
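The difference SACK makes can be sketched as a gap computation on the sender side. This is a toy model over segment numbers (real SACK blocks carry byte ranges in a TCP option, typically at most three or four blocks at a time):

```python
# With SACK the receiver reports the ranges it already holds, so the
# sender retransmits only the gaps rather than everything past the loss.

def segments_to_retransmit(cum_ack, sacked_ranges, highest_sent):
    """Return the segment numbers the receiver is still missing."""
    have = set(range(cum_ack))        # everything below the cumulative ACK
    for lo, hi in sacked_ranges:      # plus each SACKed range [lo, hi)
        have.update(range(lo, hi))
    return [s for s in range(highest_sent) if s not in have]

# Cumulative ACK covers segments 0-4; SACK blocks report 7-9 and 12-14.
# Without SACK the sender would conservatively resend everything from 5 on.
print(segments_to_retransmit(5, [(7, 10), (12, 15)], 15))  # [5, 6, 10, 11]
```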
TCP Congestion Control
When TCP detects packet loss, it does not simply retransmit the missing data and continue at the same rate. It also reduces its sending rate to avoid making congestion worse. This behaviour is called congestion control, and it is one of the most important features that keeps the internet functioning under heavy load.
The core idea is that packet loss is treated as a signal of congestion. If packets are being dropped, it means some router along the path is overwhelmed. Continuing to send at the same rate would add to the problem and make recovery slower. By reducing the sending rate when loss is detected, TCP gives the network a chance to recover.
- Slow Start: When a new connection is established or after a timeout retransmission, TCP begins with a small congestion window and doubles it with every round-trip until either loss is detected or a threshold is reached. Despite the name, the exponential growth means capacity is found quickly in favourable conditions.
- Congestion Avoidance: Once the slow start threshold is reached, TCP switches to linear growth, increasing the window by one segment per round-trip. This more conservative growth continues until loss is detected again.
- Fast Recovery: After a fast retransmit triggered by duplicate ACKs, TCP reduces its window but does not reset all the way back to slow start. Instead it enters fast recovery, allowing it to continue sending new data while waiting for the retransmitted segment to be acknowledged.
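The three phases can be traced with a simplified Reno-style simulation. The `cwnd_trace` helper and its parameters are invented for illustration (real fast recovery also inflates the window per duplicate ACK, which is omitted here):

```python
# A toy trace of the congestion window (in segments) through slow start,
# congestion avoidance, and a loss handled Reno-style: halve the window,
# reset the slow-start threshold, and continue growing linearly.

def cwnd_trace(rounds, loss_at, ssthresh=64):
    cwnd, trace = 1, []
    for rtt in range(rounds):
        trace.append(cwnd)
        if rtt == loss_at:                 # fast retransmit + fast recovery
            ssthresh = max(cwnd // 2, 2)   # multiplicative decrease
            cwnd = ssthresh                # resume from the halved window
        elif cwnd < ssthresh:
            cwnd *= 2                      # slow start: exponential growth
        else:
            cwnd += 1                      # congestion avoidance: linear growth
    return trace

# Exponential growth to 64, linear to 66, then the loss at round 8 halves it.
print(cwnd_trace(12, loss_at=8))
# [1, 2, 4, 8, 16, 32, 64, 65, 66, 33, 34, 35]
```

The trace makes the asymmetry visible: the window collapses in a single round-trip but needs many round-trips of linear growth to climb back, which is why repeated loss events cut throughput so sharply.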
Impact on Web Performance
Even a small amount of packet loss can have a disproportionately large effect on web performance because of how TCP's recovery mechanisms interact with the protocols built on top of it.
- A timeout retransmission pauses data delivery for the entire duration of the RTO, which can be several hundred milliseconds or more on high-latency connections
- Each loss event triggers congestion control, which cuts the sending rate and requires many subsequent round-trips to build it back up to where it was
- Head-of-line blocking in HTTP/1.1 means a single lost packet stalls every other request waiting behind it in the connection queue
- HTTP/2 multiplexes multiple streams over a single TCP connection, which improves many things but makes head-of-line blocking worse when loss occurs, because a single lost packet freezes all streams on that connection simultaneously
- HTTP/3 solves this by running over QUIC, which is built on UDP and implements its own per-stream retransmission. A lost packet only blocks the stream it belongs to, leaving all other streams unaffected
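The contrast in the last two points can be sketched with a toy delivery model; the stream names and packet layout below are invented for illustration:

```python
# Toy comparison of head-of-line blocking. Six packets carry three streams
# ("a", "b", "c") and packet 2 is lost. A single ordered TCP stream delivers
# nothing past the gap; per-stream ordering (QUIC-style) only stalls the
# stream that owned the lost packet.

packets = [(0, "a"), (1, "b"), (2, "a"), (3, "c"), (4, "b"), (5, "a")]
lost = {2}

# TCP-style: one ordered byte stream, so delivery stops at the first gap.
tcp_delivered = []
for num, stream in packets:
    if num in lost:
        break
    tcp_delivered.append((num, stream))

# QUIC-style: streams are ordered independently, so only stream "a"
# waits for the retransmission of packet 2.
quic_delivered, blocked = [], set()
for num, stream in packets:
    if num in lost:
        blocked.add(stream)
    elif stream not in blocked:
        quic_delivered.append((num, stream))

print("TCP delivers: ", tcp_delivered)   # everything stalls after packet 1
print("QUIC delivers:", quic_delivered)  # streams "b" and "c" keep flowing
```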
How to Measure Packet Loss
The simplest way to test for packet loss is with the ping command, which sends a series of small packets to a target host and reports how many came back. Asking for a large number of packets gives a more statistically reliable measurement than Windows' default of four (on Linux and macOS, ping runs until interrupted unless you specify a count).
ping -c 50 google.com   # Linux and macOS
ping -n 50 google.com   # Windows
# Look for "X% packet loss" in the summary
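If you want to script the check, a small helper can pull the percentage out of the summary line. The sample strings below mimic typical Linux and Windows phrasings and are illustrative only; feed the function your own ping output:

```python
import re

def parse_loss(ping_output):
    """Extract the loss percentage from a ping summary, or None if absent."""
    # Covers both "4% packet loss" (Linux/macOS) and "(4% loss)" (Windows).
    match = re.search(r"(\d+(?:\.\d+)?)% (?:packet )?loss", ping_output)
    return float(match.group(1)) if match else None

linux = "50 packets transmitted, 48 received, 4% packet loss, time 49092ms"
windows = "Packets: Sent = 50, Received = 48, Lost = 2 (4% loss),"
print(parse_loss(linux), parse_loss(windows))  # 4.0 4.0
```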
If ping shows no loss but you still suspect a problem on a specific path, traceroute (or tracert on Windows) can help identify which hop along the route is dropping packets. Running ping against each hop individually can help isolate whether loss is occurring at a specific router or further down the path. Keep in mind that some routers intentionally deprioritise ping responses and may show apparent loss even when real traffic passes through them cleanly.
Frequently Asked Questions
- Does UDP retransmit lost packets?
No. UDP is a connectionless protocol with no built-in reliability mechanisms. Packets that are lost are gone permanently as far as UDP is concerned. Applications that use UDP and need reliability must implement their own loss detection and retransmission logic at the application layer. This is exactly what QUIC does, giving it the benefits of UDP's flexibility while adding reliability on a per-stream basis.
- What is an acceptable packet loss rate?
Under 1% is generally acceptable for most applications, with TCP handling recovery transparently. Voice and video calls begin to degrade noticeably above 2 to 3% because real-time audio and video cannot wait for retransmitted packets and must instead conceal the gaps. File transfers using TCP can tolerate higher loss rates without corrupting data, but each loss event slows the transfer and extends completion time.
- Can packet loss be caused by ISP throttling?
Yes. ISPs can shape or throttle specific types of traffic, which may result in packet loss for certain destinations or protocols. This is sometimes visible as loss that appears only for specific applications or at specific times of day. A VPN can sometimes reveal whether throttling is the cause, because it changes how the traffic appears to the ISP's classification systems.
- Why does packet loss feel worse on video calls than on file downloads?
File downloads use TCP, which retransmits lost packets automatically. The download takes longer but completes correctly. Video calls typically run over UDP-based protocols where retransmission is not possible within the time constraints of real-time communication. A lost packet means a missing audio or video frame, which is directly audible or visible as a glitch. The real-time nature of the application is what makes loss feel more severe.
Latency is the time it takes for a packet to travel from sender to receiver. High latency makes everything feel slow but data still arrives intact. Packet loss means data does not arrive at all and must be retransmitted. Both affect performance but in different ways. High latency increases the time it takes for each round-trip, which slows TCP's ability to build up its sending window. Packet loss triggers retransmission and congestion control, which cuts throughput directly. A connection can have high latency and no loss, low latency and significant loss, or both simultaneously.
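The application-layer reliability described in the UDP answer above can be sketched as a stop-and-wait loop. Every name here is invented for illustration, and loss is injected deterministically; a real implementation would use sockets and timers:

```python
# A minimal stop-and-wait reliability layer of the kind an application must
# build for itself on top of UDP: number each message, wait for an ACK, and
# retransmit when the (simulated) timeout fires.

def lossy_send(seq, drop_on_attempt, attempt):
    """Pretend to put a datagram on the wire; return True if it arrived."""
    return attempt not in drop_on_attempt.get(seq, set())

def reliable_transfer(messages, drop_on_attempt, max_retries=5):
    delivered, attempts_log = [], []
    for seq, msg in enumerate(messages):
        for attempt in range(max_retries):
            attempts_log.append((seq, attempt))
            if lossy_send(seq, drop_on_attempt, attempt):
                delivered.append(msg)   # receiver got it and ACKed
                break                   # move on to the next message
            # no ACK before the timeout: fall through and retransmit
    return delivered, attempts_log

# Message 1 is dropped on its first two attempts and arrives on the third.
msgs, log = reliable_transfer(["a", "b", "c"], {1: {0, 1}})
print(msgs)   # ['a', 'b', 'c']
print(log)    # message 1 needed attempts 0, 1 and 2
```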
Conclusion
Packet loss is a normal and expected part of how networks operate under load. TCP's retransmission and congestion control mechanisms handle it automatically, detecting missing segments through acknowledgements and duplicate ACKs, resending what was lost, and reducing the sending rate to give the network room to recover. The cost of this reliability is speed, particularly when timeout retransmissions are involved or when congestion control cuts throughput significantly after a loss event. Understanding how these mechanisms work helps diagnose slow connections, explains why protocols like HTTP/3 over QUIC were designed to reduce the impact of loss at the transport layer, and gives context for why even small amounts of packet loss can have outsized effects on web performance. See also TCP handshake and how routing works.
