eBPF Networking: High-Performance Kernel Networking
eBPF (Extended Berkeley Packet Filter) is a technology that allows running sandboxed programs in the Linux kernel without changing kernel source code or loading kernel modules. Originally designed for packet filtering (tcpdump), eBPF has evolved into a general-purpose execution engine enabling programmable packet processing, high-performance observability, networking, security, and application tracing. eBPF programs attach to kernel events (network packets, system calls, function entry/exit) and execute with near-native performance thanks to just-in-time compilation.
To understand eBPF networking properly, it helps to be familiar with Linux kernel fundamentals, network protocols, and observability concepts.
┌─────────────────────────────────────────────────────────────────────────┐
│ eBPF Networking Architecture │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ Userspace Kernel Space │
│ ┌─────────────────┐ ┌─────────────────────┐ │
│ │ eBPF Program │ │ eBPF Verifier │ │
│ │ Source Code │ │ (Safety checks) │ │
│ │ (C/Rust) │ └──────────┬──────────┘ │
│ └────────┬────────┘ │ │
│ │ ▼ │
│ ┌────────▼────────┐ ┌─────────────────────┐ │
│ │ LLVM/Clang │ │ JIT Compiler │ │
│ │ (Compile to │ │ (Native machine │ │
│ │ eBPF bytecode)│ │ code) │ │
│ └────────┬────────┘ └──────────┬──────────┘ │
│ │ │ │
│ ┌────────▼────────┐ ┌──────────▼──────────┐ │
│ │ bpf() syscall │─────────────────────────┤ eBPF Program │ │
│ │ (Load program) │ │ (Running in │ │
│ └─────────────────┘ │ kernel context) │ │
│ └──────────┬──────────┘ │
│ │ │
│ ┌──────────▼──────────┐ │
│ │ Attachment Point │ │
│ │ (XDP, TC, Socket, │ │
│ │ Tracepoint, etc) │ │
│ └─────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
What Is eBPF?
eBPF is a technology that allows running sandboxed programs in the Linux kernel safely and efficiently. It extends the original Berkeley Packet Filter (BPF) used in tcpdump. eBPF programs are event-driven: they run when specific kernel events occur, such as network packets arriving, system calls being invoked, or kernel functions being entered or exited. A verifier ensures programs are safe (no infinite loops, no out-of-bounds memory access, no kernel crashes) before they are loaded into the kernel, and a JIT compiler converts eBPF bytecode to native machine code for near-native performance.
- Event-Driven: Programs attach to kernel hooks, run when event occurs, and are removed when no longer needed.
- Sandboxed: Verifier rejects unsafe programs, preventing kernel crashes or security violations.
- Efficient: JIT compilation to native code, in-kernel execution (no userspace context switch), maps for data sharing between eBPF and userspace.
- Programmable: Extensible without kernel changes or reboots, safe hot-loading of new functionality.
Why eBPF Matters for Networking
Traditional kernel networking is fixed and hard to extend. eBPF enables programmable packet processing at native kernel speed without custom kernel modules, which risk system stability.
- Performance: eBPF programs run at native speed (JIT-compiled), bypass userspace for packet processing, and eliminate context switches. XDP (eXpress Data Path) processes packets at earliest point in NIC driver (before kernel network stack).
- Programmability: Custom packet processing logic: load balancing, firewall, tunneling encap/decap, protocol parsing, and header modification.
- Observability: Kernel-level metrics without instrumentation overhead: packet drops, latency, errors, custom counters. No application modification required.
- Safety: Verifier ensures memory safety, loop safety, and stack safety. Cannot crash kernel, safe to run in production.
- Dynamic Updates: Attach/detach eBPF programs without restarting services or rebooting. Replace running programs atomically, and rollback if needed.
| Hook Point | Layer | Description |
|---|---|---|
| XDP (eXpress Data Path) | NIC driver | Earliest packet processing (before skb allocation) |
| TC (Traffic Control) | Network | Packet processing before/after routing |
| Socket Filter | Transport | Per-socket packet filtering |
| Sockops | Socket | Socket operation hooks |
| Cgroup | Cgroup | Network policy per cgroup |
| kprobe/tracepoint | Kernel | Kernel function entry/exit (generic) |
eBPF Networking Hook Points
XDP (eXpress Data Path)
XDP is the highest-performance eBPF hook, executing at the NIC driver layer before the kernel network stack allocates a socket buffer (skb). XDP programs process raw packets and return one of several actions: XDP_DROP (discard the packet without further processing), XDP_PASS (continue to the kernel network stack), XDP_TX (transmit back out the same NIC), or XDP_REDIRECT (redirect to another NIC or CPU).
- Use Cases: DDoS mitigation (drop attack packets at line rate), load balancing (direct packets to backend servers via XDP_REDIRECT), custom packet filtering, early drop of invalid packets.
- Performance: Up to 10-20 million packets per second per core, bypassing kernel stack overhead, minimal CPU usage.
TC (Traffic Control) Ingress/Egress
TC hooks run later in the kernel network stack, after the packet has been converted to an skb (socket buffer). They offer more features than XDP (packet size modification, tunnel encapsulation) and are slower than XDP but still fast.
- Use Cases: Traffic shaping (rate limiting), tunneling (VXLAN, Geneve encap/decap), header modification (NAT, rewriting).
Socket Filters
Classic BPF filters for per-socket filtering (what tcpdump uses). Extended eBPF supports cgroup socket filters for container network policies, socket lookup for load balancing decisions, and sockops for socket operation hooks.
Cgroup Hooks
eBPF attaches to cgroups (container groups) for network policy enforcement, including per-cgroup firewalls, egress bandwidth limiting, and connect/sendmsg/recvmsg hooks.
| Action | Description | Use Case |
|---|---|---|
| XDP_DROP | Drop packet (no further processing) | DDoS mitigation, filtering |
| XDP_PASS | Continue to kernel network stack | Legitimate traffic |
| XDP_TX | Transmit back out the same NIC | Packet reflector |
| XDP_REDIRECT | Redirect to another NIC or CPU | Load balancing |
| XDP_ABORTED | Program error (drop, but error logged) | Debugging |
Example DDoS defense: an XDP program counts packets per source IP, drops any that exceed a rate limit (XDP_DROP), and passes the rest (XDP_PASS).
eBPF Data Structures (Maps)
eBPF maps are key-value data structures shared between eBPF programs and userspace, enabling stateful packet processing, metrics aggregation, and dynamic configuration updates.
| Map Type | Description | Use Case |
|---|---|---|
| Hash Table | Generic key-value store | Flow tables, connection tracking |
| Array | Integer-indexed, fixed size | Configuration arrays, statistics |
| Per-CPU Hash/Array | Each CPU has own copy | Performance counters without locking |
| LRU Hash | Least Recently Used eviction | Connection tracking (finite table) |
| Stack/Queue | LIFO/FIFO for data passing | Message passing |
| Ring Buffer | Efficient data streaming to userspace | Event export, flow logs |
eBPF Networking Tools and Frameworks
Cilium
Cilium is the most popular eBPF-based networking solution for Kubernetes. It provides an eBPF-powered CNI, transparent encryption, service load balancing, network policy enforcement (L3-L7), and observability (Hubble). Cilium replaces kube-proxy for service routing and outperforms iptables. Covered in the Cilium guide.
Calico eBPF
Calico added an eBPF data plane as an alternative to iptables. Benefits include lower latency, higher throughput, and better scalability for large clusters.
Falco
Runtime security tool using eBPF for kernel-level event monitoring. Detects anomalous process execution, file system changes, and network connections.
Katran
Facebook's eBPF-based load balancer (used at edge). XDP-based, high performance (millions of packets per second), consistent hashing for load distribution.
| Tool | Primary Use | eBPF Hooks Used |
|---|---|---|
| Cilium | Kubernetes CNI, security, LB | XDP, TC, Socket, Cgroup |
| Calico eBPF | Kubernetes networking (iptables replacement) | XDP, TC |
| Falco | Runtime security | Tracepoints, kprobes |
| Katran | L4 load balancer | XDP |
| Pixie | Kubernetes observability | kprobes, uprobes |
| Hubble | Kubernetes network observability | eBPF (via Cilium) |
| Merbridge | Service mesh acceleration | Sockops, redir |
eBPF Networking Use Cases
Kubernetes Networking (CNI)
eBPF replaces iptables for service load balancing, pod-to-pod encryption, network policies (L3-L7 including HTTP, gRPC, Kafka), and observability with Hubble for flow logs, latency metrics, and service dependency graphs.
DDoS Mitigation
XDP programs drop attack packets at line rate before kernel stack, using rate limiting per IP address, SYN flood protection (tracking half-open connections), and dropping invalid packets.
Network Observability
eBPF gathers network metrics with low overhead: packet drops (where, why, which packet), latency per connection (trace TCP RTT), TCP retransmits and out-of-order packets, and per-application network traffic.
Service Load Balancing
eBPF load balancers (Katran, Cilium) support consistent hashing (keep client-to-backend affinity), direct server return (bypass load balancer for return traffic), and health checking with automatic backend removal.
Network Security
eBPF can drop traffic based on packet signatures (XDP), block container network access via cgroup hooks, rate-limit traffic per IP or per pod (XDP or TC), and transparently encrypt pod-to-pod traffic (e.g., WireGuard).
| Technology | Throughput (Mpps) | Latency (ns) | CPU Overhead |
|---|---|---|---|
| Linux iptables | 5 | 500 | Moderate |
| Linux TC | 8 | 400 | Low |
| eBPF TC | 15 | 200 | Very Low |
| eBPF XDP | 25 | 100 | Minimal |
| DPDK | 30 | 60 | Minimal (kernel bypass) |
Note: numbers are approximate and vary by hardware and configuration.
XDP stays in the kernel (no bypass) and needs no application changes; DPDK bypasses the kernel entirely and requires applications to be rewritten for it.
eBPF Networking Anti-Patterns
- Complex Logic in XDP: XDP has strict limitations: no loops (unless bounded), limited stack size (512 bytes), no kernel helper calls for many operations. Move complex processing to TC or higher layers.
- Not Considering Tail Calls: Single eBPF program limited to 1M instructions (may be insufficient for complex processing). Use tail calls to chain multiple programs.
- Ignoring Verifier Constraints: eBPF verifier rejects programs with unreachable code, invalid memory access, potential infinite loops. Write simple, verifier-friendly code. Test program loading before deployment.
- No Map Size Planning: Maps have fixed size at creation. Runaway map growth leads to ENOMEM failures and packet drops. Monitor map fullness, set max entries appropriately.
- Overusing Global Variables: eBPF global variables are per-program (not per-CPU), require synchronization for writes. Use per-CPU maps for counters instead.
- No Fallback Path: eBPF program may fail to load (unsupported kernel version, insufficient memory, verifier rejection). Design fallback path to traditional networking (iptables, standard stack).
| Constraint | Limit / Rule |
|---|---|
| Program Size | Max 1 million instructions |
| Stack Size | Max 512 bytes |
| Loop Support | Bounded loops only (since kernel 5.3); must have a verifiable upper bound |
| Function Calls | Limited to helper functions (no arbitrary kernel calls) |
| Pointer Access | Must be bounds-checked |
| Arithmetic | No division by a variable (divisor must be constant) |
| Backward Jumps | Restricted (prevents infinite loops) |
eBPF Networking Best Practices
- Start Simple: Begin with simple eBPF programs (drop specific port, count packets). Gradually add complexity after understanding verifier constraints.
- Use CO-RE (Compile Once, Run Everywhere): Kernel versions differ across hosts, making eBPF binary compatibility challenging. CO-RE uses BTF (BPF Type Format) for relocations. Libbpf and Cilium's ebpf Go library support CO-RE.
- Monitor eBPF Errors: Trace eBPF program errors: check trace_pipe (/sys/kernel/debug/tracing/trace_pipe) for verifier failures, monitor map ENOMEM events, watch for XDP program exceptions (XDP_ABORTED).
- Prefer XDP for Simple Fast Path, TC for Complex: XDP limitations: cannot modify packet size, limited helpers, no routing decision. TC supports more features but slower. Offload complexity to TC after XDP fast path.
- Use Per-CPU Maps for Counters: Per-CPU maps avoid atomic operations (no locking), higher throughput for statistics counters. Use BPF_MAP_TYPE_PERCPU_HASH or BPF_MAP_TYPE_PERCPU_ARRAY.
- Test eBPF Programs in Containers: Isolate eBPF development environment to avoid kernel crashes during testing. Use vagrant, libvirt, or Docker with privileged mode for eBPF program loading. Test across target kernel versions, especially after kernel upgrades.
- Leverage Existing Libraries: libbpf for C eBPF programs (low-level), cilium/ebpf for Go (userspace loader), bpftrace for quick scripts (not production), BCC for Python prototyping.
| Tool | Language | Use Case |
|---|---|---|
| libbpf | C | Low-level eBPF + CO-RE (production) |
| cilium/ebpf | Go | Loading eBPF from Go apps (userspace) |
| bpftrace | awk-like | Quick scripts for debugging/observability |
| BCC (bpf tools) | Python/C | Prototyping, existing tools (trace, execsnoop) |
| bpftool | CLI | Inspect loaded eBPF programs, maps, debugging |
bpftrace one-liner: `bpftrace -e 'kprobe:tcp_v4_connect { printf("connect %s\n", comm); }'`
eBPF vs Traditional Networking
| Aspect | Traditional (iptables/TC) | eBPF Networking |
|---|---|---|
| Performance | Good | Excellent (near-native) |
| Programmability | Limited (fixed actions) | Full (custom logic) |
| Safety | Kernel modules risky | Verifier-enforced safe |
| Dynamic Updates | Slow (rule enumeration) | Atomic, fast |
| Observability | Limited logs | Rich metrics from eBPF |
| Learning Curve | Moderate (iptables) | Steep (eBPF + verifier) |
Frequently Asked Questions
- What is the difference between eBPF and DPDK?
eBPF runs in the kernel (sandboxed), is safe, and needs no application changes; its latency is slightly higher than DPDK's. DPDK bypasses the kernel entirely and runs in userspace, giving lower latency and extremely high throughput, but it requires applications to be rewritten for DPDK and forgoes the kernel networking stack and its security controls.
- Does eBPF require kernel recompilation?
No. eBPF is built into modern Linux kernels (4.1+ for basics, 5.x+ for advanced features). Programs load dynamically via the bpf() syscall; no kernel recompilation or reboot is required.
- Is eBPF safe for production?
Yes, provided the program passes the verifier, which ensures memory safety, bounded loops, and no kernel crashes. Production eBPF is used by Facebook (load balancing, DDoS protection at scale), Google, Netflix, and Cloudflare (DDoS mitigation).
- What Linux kernel version do I need for eBPF?
Basic eBPF: 4.1+. BPF maps and helper functions: 4.4+. XDP: 4.8+. BTF (CO-RE): 5.2+. Full feature set: 5.8+. Use a recent 5.x kernel for the best eBPF support.
- Can eBPF replace iptables?
Yes, in many cases. eBPF offers higher performance, better scalability, and programmability; Cilium and Calico eBPF already replace kube-proxy's iptables rules, and legacy iptables setups can be migrated to eBPF for modern workloads.
- What should I learn next after eBPF networking?
After mastering eBPF networking, explore Cilium for Kubernetes networking, eBPF observability (Hubble, Pixie), XDP for high-performance packet processing, bpftrace for dynamic tracing, eBPF runtime security (Falco, Tetragon), and Cilium Service Mesh.
