eBPF Networking: High-Performance Kernel Networking

eBPF (Extended Berkeley Packet Filter) is a revolutionary technology that allows running sandboxed programs in the Linux kernel without changing kernel source code or loading kernel modules. Originally designed for packet filtering (tcpdump), eBPF has evolved into a general-purpose execution engine enabling programmable packet processing, high-performance observability, networking, security, and application tracing. eBPF programs attach to kernel events (network packets, system calls, function entry/exit) and execute with near-native performance thanks to just-in-time compilation.

To understand eBPF networking properly, it helps to be familiar with Linux kernel fundamentals, network protocols, and observability concepts.

eBPF networking architecture:
┌─────────────────────────────────────────────────────────────────────────┐
│                         eBPF Networking Architecture                      │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   Userspace                                   Kernel Space              │
│   ┌─────────────────┐                         ┌─────────────────────┐   │
│   │  eBPF Program   │                         │   eBPF Verifier     │   │
│   │  Source Code    │                         │   (Safety checks)   │   │
│   │  (C/Rust)       │                         └──────────┬──────────┘   │
│   └────────┬────────┘                                    │              │
│            │                                              ▼              │
│   ┌────────▼────────┐                         ┌─────────────────────┐   │
│   │  LLVM/Clang     │                         │   JIT Compiler      │   │
│   │  (Compile to    │                         │   (Native machine   │   │
│   │   eBPF bytecode)│                         │    code)            │   │
│   └────────┬────────┘                         └──────────┬──────────┘   │
│            │                                              │              │
│   ┌────────▼────────┐                         ┌──────────▼──────────┐   │
│   │  bpf() syscall  │─────────────────────────┤   eBPF Program      │   │
│   │  (Load program) │                         │   (Running in      │   │
│   └─────────────────┘                         │    kernel context)  │   │
│                                                └──────────┬──────────┘   │
│                                                           │              │
│                                                ┌──────────▼──────────┐   │
│                                                │  Attachment Point   │   │
│                                                │  (XDP, TC, Socket,  │   │
│                                                │   Tracepoint, etc)  │   │
│                                                └─────────────────────┘   │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

What Is eBPF?

eBPF is a technology that allows running sandboxed programs in the Linux kernel safely and efficiently. It extends the original Berkeley Packet Filter (BPF) used by tcpdump. eBPF programs are event-driven, running when specific kernel events occur: network packets arriving, system calls being invoked, or kernel functions being entered or exited. eBPF includes a verifier that ensures programs are safe (no infinite loops, no out-of-bounds memory access, no kernel crashes) before they are loaded into the kernel, and a JIT compiler converts eBPF bytecode to native machine code for near-native performance.

  • Event-Driven: Programs attach to kernel hooks, run when the event occurs, and can be detached when no longer needed.
  • Sandboxed: Verifier rejects unsafe programs, preventing kernel crashes or security violations.
  • Efficient: JIT compilation to native code, in-kernel execution (no userspace context switch), maps for data sharing between eBPF and userspace.
  • Programmable: Extensible without kernel changes or reboots, safe hot-loading of new functionality.
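
As a rough illustration of this lifecycle, the sketch below uses libbpf to open a compiled object file, load it (which triggers the verifier and JIT), and attach it to a network interface. It assumes libbpf 1.0+ behavior; the file name xdp_prog.o, program name xdp_filter, and interface index are placeholders, and error handling is abbreviated.

  /* Minimal libbpf loader sketch. Assumes a pre-compiled object file
   * "xdp_prog.o" containing an XDP program named "xdp_filter"; all
   * names and the interface index are illustrative. */
  #include <stdio.h>
  #include <bpf/libbpf.h>

  int main(void)
  {
      struct bpf_object *obj = bpf_object__open_file("xdp_prog.o", NULL);
      if (!obj)
          return 1;

      if (bpf_object__load(obj))              /* verifier + JIT run here */
          return 1;

      struct bpf_program *prog =
          bpf_object__find_program_by_name(obj, "xdp_filter");
      if (!prog)
          return 1;

      struct bpf_link *link = bpf_program__attach_xdp(prog, 2 /* ifindex */);
      if (!link)
          return 1;

      printf("attached; press Enter to detach\n");
      getchar();

      bpf_link__destroy(link);                /* detach */
      bpf_object__close(obj);
      return 0;
  }

In practice the interface index would come from if_nametoindex(), and the kernel-side program would be compiled separately with clang -O2 -target bpf.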

Why eBPF Matters for Networking

Traditional kernel networking is fixed-function and hard to extend. eBPF enables programmable packet processing at native kernel speed without custom kernel modules, which put system stability at risk.

  • Performance: eBPF programs run at native speed (JIT-compiled), bypass userspace for packet processing, and eliminate context switches. XDP (eXpress Data Path) processes packets at earliest point in NIC driver (before kernel network stack).
  • Programmability: Custom packet processing logic: load balancing, firewall, tunneling encap/decap, protocol parsing, and header modification.
  • Observability: Kernel-level metrics without instrumentation overhead: packet drops, latency, errors, custom counters. No application modification required.
  • Safety: Verifier ensures memory safety, loop safety, and stack safety. Cannot crash kernel, safe to run in production.
  • Dynamic Updates: Attach/detach eBPF programs without restarting services or rebooting. Replace running programs atomically, and rollback if needed.

eBPF networking hook points:
Hook Point              Layer       Description
─────────────────────────────────────────────────────────────────────────────
XDP (eXpress Data Path)  NIC driver  Earliest packet processing (before skb)
TC (Traffic Control)     Network     Packets before/after routing (ingress/egress)
Socket Filter            Transport   Per-socket packet filtering
Sockops                  Socket      Socket operation hooks
Cgroup                   Cgroup      Network policy per cgroup
kprobe/tracepoint        Kernel      Kernel function entry/exit (generic)

eBPF Networking Hook Points

XDP (eXpress Data Path)

XDP is the highest-performance eBPF hook, executing in the NIC driver before the kernel network stack allocates a socket buffer (skb). XDP programs process raw packets and return an action: XDP_DROP (discard the packet without further processing), XDP_PASS (continue to the kernel network stack), XDP_TX (transmit back out the same NIC), or XDP_REDIRECT (redirect to another NIC or CPU).

  • Use Cases: DDoS mitigation (drop attack packets at line rate), load balancing (direct packets to backend servers via XDP_REDIRECT), custom packet filtering, early drop of invalid packets.
  • Performance: Up to 10-20 million packets per second per core, bypassing kernel stack overhead, minimal CPU usage.
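
A minimal XDP sketch illustrating these actions, assuming we want to drop UDP traffic to port 9999 at the driver (the port and program name are arbitrary):

  /* drop_udp_9999.c - minimal XDP sketch: drop UDP packets to port 9999,
   * pass everything else. Port and program name are illustrative. */
  #include <linux/bpf.h>
  #include <linux/if_ether.h>
  #include <linux/ip.h>
  #include <linux/in.h>
  #include <linux/udp.h>
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_endian.h>

  SEC("xdp")
  int xdp_drop_udp(struct xdp_md *ctx)
  {
      void *data     = (void *)(long)ctx->data;
      void *data_end = (void *)(long)ctx->data_end;

      struct ethhdr *eth = data;
      if ((void *)(eth + 1) > data_end)        /* bounds check for the verifier */
          return XDP_PASS;
      if (eth->h_proto != bpf_htons(ETH_P_IP))
          return XDP_PASS;

      struct iphdr *ip = (void *)(eth + 1);
      if ((void *)(ip + 1) > data_end)
          return XDP_PASS;
      if (ip->protocol != IPPROTO_UDP)
          return XDP_PASS;

      struct udphdr *udp = (void *)ip + ip->ihl * 4;
      if ((void *)(udp + 1) > data_end)
          return XDP_PASS;

      if (udp->dest == bpf_htons(9999))
          return XDP_DROP;                     /* discard before skb allocation */

      return XDP_PASS;
  }

  char LICENSE[] SEC("license") = "GPL";

On a recent kernel this can be compiled with clang -O2 -g -target bpf -c drop_udp_9999.c -o drop_udp_9999.o and attached with ip link set dev eth0 xdp obj drop_udp_9999.o sec xdp (interface name illustrative).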

TC (Traffic Control) Ingress/Egress

TC hooks run later in the kernel network stack, after the packet has been converted to an skb (socket buffer). They offer more features than XDP (packet size modification, tunnel encap/decap) and are slower than XDP but still fast.

  • Use Cases: Traffic shaping (rate limiting), tunneling (VXLAN, Geneve encap/decap), header modification (NAT, rewriting).

Socket Filters

Classic BPF filters for per-socket filtering (what tcpdump uses). Extended eBPF supports cgroup socket filters for container network policies, socket lookup for load balancing decisions, and sockops for socket operation hooks.

Cgroup Hooks

eBPF attaches to cgroups (control groups) for network policy enforcement, including per-cgroup firewalling, egress bandwidth limiting, and connect/sendmsg/recvmsg hooks.

XDP return codes:
Action          Description                              Use Case
─────────────────────────────────────────────────────────────────────────────
XDP_DROP        Drop packet (no further processing)       DDoS mitigation, filtering
XDP_PASS        Continue to kernel network stack          Legitimate traffic
XDP_TX          Transmit back out same NIC                Packet reflector
XDP_REDIRECT    Redirect to another NIC or CPU            Load balancing
XDP_ABORTED     Program error (drop, but error logged)    Debugging

Example DDoS defense: an XDP program counts packets per source IP, drops any source exceeding the rate limit (XDP_DROP), and passes the rest (XDP_PASS).
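
A simplified sketch of that idea, counting packets per source IP in an LRU hash map and dropping sources past an arbitrary threshold (a production rate limiter would also track time windows or tokens):

  /* Simplified per-source-IP packet counter for the DDoS example above.
   * Counts packets and drops a source after an arbitrary threshold. */
  #include <linux/bpf.h>
  #include <linux/if_ether.h>
  #include <linux/ip.h>
  #include <linux/in.h>
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_endian.h>

  struct {
      __uint(type, BPF_MAP_TYPE_LRU_HASH);     /* old entries evicted automatically */
      __uint(max_entries, 65536);
      __type(key, __u32);                      /* source IPv4 address */
      __type(value, __u64);                    /* packets seen */
  } pkt_count SEC(".maps");

  SEC("xdp")
  int xdp_rate_limit(struct xdp_md *ctx)
  {
      void *data     = (void *)(long)ctx->data;
      void *data_end = (void *)(long)ctx->data_end;

      struct ethhdr *eth = data;
      if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
          return XDP_PASS;

      struct iphdr *ip = (void *)(eth + 1);
      if ((void *)(ip + 1) > data_end)
          return XDP_PASS;

      __u32 saddr = ip->saddr;
      __u64 one = 1;
      __u64 *cnt = bpf_map_lookup_elem(&pkt_count, &saddr);
      if (!cnt) {
          bpf_map_update_elem(&pkt_count, &saddr, &one, BPF_ANY);
          return XDP_PASS;
      }
      __sync_fetch_and_add(cnt, 1);            /* atomic add across CPUs */

      if (*cnt > 100000)                       /* arbitrary threshold */
          return XDP_DROP;
      return XDP_PASS;
  }

  char LICENSE[] SEC("license") = "GPL";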

eBPF Data Structures (Maps)

eBPF maps are key-value data structures shared between eBPF programs and userspace, enabling stateful packet processing, metrics aggregation, and dynamic configuration updates.

eBPF map types:
Map Type            Description                      Use Case
─────────────────────────────────────────────────────────────────────────────
Hash Table          Generic key-value store          Flow tables, connection tracking
Array               Integer-indexed, fixed size      Configuration arrays, statistics
Per-CPU Hash/Array  Each CPU has its own copy        Performance counters without locking
LRU Hash            Least Recently Used eviction     Connection tracking (finite table)
Stack/Queue         LIFO/FIFO for data passing       Message passing
Ring Buffer         Efficient data streaming         Logging, events to userspace
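
In libbpf-style C, maps are declared as structures in a special .maps section; a few of the types above might be declared like this (key/value layouts and sizes are illustrative):

  /* Illustrative libbpf-style map declarations for some of the types above. */
  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  struct flow_key {
      __u32 saddr, daddr;
      __u16 sport, dport;
      __u8  proto;
  };

  struct {
      __uint(type, BPF_MAP_TYPE_HASH);          /* flow table / connection tracking */
      __uint(max_entries, 65536);
      __type(key, struct flow_key);
      __type(value, __u64);
  } flows SEC(".maps");

  struct {
      __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);  /* lock-free statistics counters */
      __uint(max_entries, 16);
      __type(key, __u32);
      __type(value, __u64);
  } stats SEC(".maps");

  struct {
      __uint(type, BPF_MAP_TYPE_RINGBUF);       /* stream events to userspace */
      __uint(max_entries, 1 << 20);             /* ring size in bytes */
  } events SEC(".maps");

Userspace can then read or update these maps through libbpf helpers such as bpf_map__lookup_elem, or consume ring buffer events with ring_buffer__poll.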

eBPF Networking Tools and Frameworks

Cilium

Cilium is the most popular eBPF-based networking solution for Kubernetes. It provides eBPF-powered CNI, transparent encryption, service load balancing, network policy enforcement (L3-L7), and observability (Hubble). Cilium replaces kube-proxy for service routing and outperforms iptables. Covered in Cilium guide.

Calico eBPF

Calico added an eBPF dataplane as an alternative to iptables. Benefits include lower latency, higher throughput, and better scalability for large clusters.

Falco

Runtime security tool using eBPF for kernel-level event monitoring. Detects anomalous process execution, file system changes, and network connections.

Katran

Facebook's eBPF-based load balancer (used at edge). XDP-based, high performance (millions of packets per second), consistent hashing for load distribution.

eBPF networking tools ecosystem:
Tool            Primary Use                      eBPF Hooks Used
─────────────────────────────────────────────────────────────────────────────
Cilium          Kubernetes CNI, security, LB      XDP, TC, Socket, Cgroup
Calico eBPF     K8s CNI, iptables replacement     XDP, TC
Falco           Runtime security                  Tracepoints, kprobes
Katran          L4 load balancer                  XDP
Pixie           Kubernetes observability          kprobes, uprobes
Hubble          Kubernetes network observability  eBPF (via Cilium)
Merbridge       Service mesh acceleration         Sockops, redir

eBPF Networking Use Cases

Kubernetes Networking (CNI)

eBPF replaces iptables for service load balancing, pod-to-pod encryption, network policies (L3-L7 including HTTP, gRPC, Kafka), and observability with Hubble for flow logs, latency metrics, and service dependency graphs.

DDoS Mitigation

XDP programs drop attack packets at line rate before kernel stack, using rate limiting per IP address, SYN flood protection (tracking half-open connections), and dropping invalid packets.

Network Observability

eBPF gathers network metrics with low overhead: packet drops (where, why, which packet), latency per connection (trace TCP RTT), TCP retransmits and out-of-order packets, and per-application network traffic.
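
As an illustration, a small kprobe sketch could count TCP retransmits in a per-CPU counter without touching any application (tcp_retransmit_skb is a kernel function; the map layout is illustrative):

  /* Count TCP retransmits by hooking the kernel's tcp_retransmit_skb(). */
  #include <linux/bpf.h>
  #include <linux/ptrace.h>
  #include <bpf/bpf_helpers.h>

  struct {
      __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);  /* one slot, one copy per CPU */
      __uint(max_entries, 1);
      __type(key, __u32);
      __type(value, __u64);
  } retransmits SEC(".maps");

  SEC("kprobe/tcp_retransmit_skb")
  int count_retransmit(struct pt_regs *ctx)
  {
      __u32 key = 0;
      __u64 *val = bpf_map_lookup_elem(&retransmits, &key);
      if (val)
          (*val)++;                             /* per-CPU slot, no atomics needed */
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";

A bpftrace equivalent for quick exploration might be: bpftrace -e 'kprobe:tcp_retransmit_skb { @retransmits = count(); }'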

Service Load Balancing

eBPF load balancers (Katran, Cilium) support consistent hashing (keep client-to-backend affinity), direct server return (bypass load balancer for return traffic), and health checking with automatic backend removal.

Network Security

eBPF can drop traffic based on packet signatures (XDP), block container network access via cgroup hooks, rate-limit traffic per IP or per pod (XDP or TC), and transparently encrypt pod-to-pod traffic (e.g., Cilium's WireGuard integration).

Performance comparison:
Technology              Throughput (Mpps)   Latency (ns)   CPU Overhead
─────────────────────────────────────────────────────────────────────────────
Linux iptables          5                   500            Moderate
Linux TC                8                   400            Low
eBPF TC                 15                  200            Very Low
eBPF XDP                25                  100            Minimal
DPDK                    30                  60             Minimal (kernel bypass)

Note: numbers are approximate and vary by hardware and configuration.
XDP stays in the kernel (no bypass) and needs no application changes, unlike DPDK, which bypasses the kernel and requires application rewrites.

eBPF Networking Anti-Patterns

  • Complex Logic in XDP: XDP has strict limitations: no loops (unless bounded), limited stack size (512 bytes), no kernel helper calls for many operations. Move complex processing to TC or higher layers.
  • Not Considering Tail Calls: A single eBPF program is limited to 1M instructions, which may be insufficient for complex processing. Use tail calls to chain multiple programs (a sketch follows this list).
  • Ignoring Verifier Constraints: eBPF verifier rejects programs with unreachable code, invalid memory access, potential infinite loops. Write simple, verifier-friendly code. Test program loading before deployment.
  • No Map Size Planning: Maps have fixed size at creation. Runaway map growth leads to ENOMEM failures and packet drops. Monitor map fullness, set max entries appropriately.
  • Overusing Global Variables: eBPF global variables are per-program (not per-CPU), require synchronization for writes. Use per-CPU maps for counters instead.
  • No Fallback Path: eBPF program may fail to load (unsupported kernel version, insufficient memory, verifier rejection). Design fallback path to traditional networking (iptables, standard stack).
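
A minimal tail-call sketch, assuming userspace populates index 0 of the program array with the fd of the next stage (all names are illustrative):

  /* Tail-call sketch: an XDP entry program hands off to a second program
   * stored in a BPF_MAP_TYPE_PROG_ARRAY; userspace must fill slot 0. */
  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  struct {
      __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
      __uint(max_entries, 4);
      __uint(key_size, sizeof(__u32));
      __uint(value_size, sizeof(__u32));
  } jump_table SEC(".maps");

  SEC("xdp")
  int xdp_entry(struct xdp_md *ctx)
  {
      /* ... parse headers, pick the next processing stage ... */
      bpf_tail_call(ctx, &jump_table, 0);   /* on success, never returns */
      return XDP_PASS;                      /* fallback if slot 0 is empty */
  }

  char LICENSE[] SEC("license") = "GPL";
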
eBPF program verification constraints:
Constraint              Limit / Rule
─────────────────────────────────────────────────────────────────────────────
Program Size            Max 1 million instructions
Stack Size              Max 512 bytes
Loop Support            Bounded loops only (since kernel 5.3)
                         Must have verifiable upper bound
Function Calls          Limited to helper functions (no arbitrary kernel calls)
Pointer Access          Must be bounds-checked
Arithmetic              No division by variable (must be constant)
Backward Jumps          Restricted (prevents infinite loops)

eBPF Networking Best Practices

  • Start Simple: Begin with simple eBPF programs (drop specific port, count packets). Gradually add complexity after understanding verifier constraints.
  • Use CO-RE (Compile Once, Run Everywhere): Kernel versions differ across hosts, making eBPF binary compatibility challenging. CO-RE uses BTF (BPF Type Format) for relocations. Libbpf and Cilium's ebpf Go library support CO-RE (a sketch follows this list).
  • Monitor eBPF Errors: Trace eBPF program errors: check trace_pipe (/sys/kernel/debug/tracing/trace_pipe) for verifier failures, monitor map ENOMEM events, watch for XDP program exceptions (XDP_ABORTED).
  • Prefer XDP for Simple Fast Path, TC for Complex: XDP limitations: cannot modify packet size, limited helpers, no routing decision. TC supports more features but slower. Offload complexity to TC after XDP fast path.
  • Use Per-CPU Maps for Counters: Per-CPU maps avoid atomic operations (no locking), higher throughput for statistics counters. Use BPF_MAP_TYPE_PERCPU_HASH or BPF_MAP_TYPE_PERCPU_ARRAY.
  • Test eBPF Programs in Containers: Isolate eBPF development environment to avoid kernel crashes during testing. Use vagrant, libvirt, or Docker with privileged mode for eBPF program loading. Test across target kernel versions, especially after kernel upgrades.
  • Leverage Existing Libraries: libbpf for C eBPF programs (low-level), cilium/ebpf for Go (userspace loader), bpftrace for quick scripts (not production), BCC for Python prototyping.
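
A small CO-RE sketch, reading a socket field through a BTF relocation so the same object file runs across kernel versions. It assumes a vmlinux.h generated with bpftool btf dump file /sys/kernel/btf/vmlinux format c; the probe target and field are illustrative.

  /* CO-RE sketch: field offsets are resolved at load time against the
   * running kernel's BTF. Build with -D__TARGET_ARCH_x86 (or your arch). */
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>
  #include <bpf/bpf_core_read.h>

  SEC("kprobe/tcp_v4_connect")
  int BPF_KPROBE(trace_connect, struct sock *sk)
  {
      /* skc_dport is read via a CO-RE relocation, not a hard-coded offset;
       * the value is in network byte order. */
      __u16 dport = BPF_CORE_READ(sk, __sk_common.skc_dport);

      bpf_printk("tcp_v4_connect dport (be16): %u", dport);
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";

bpf_printk output appears in /sys/kernel/debug/tracing/trace_pipe, which matches the monitoring advice above.
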
eBPF development tools:
Tool            Language    Use Case
─────────────────────────────────────────────────────────────────────────────
libbpf          C           Low-level eBPF + CO-RE (production)
cilium/ebpf     Go          Loading eBPF from Go apps (userspace)
bpftrace        awk-like    Quick scripts for debugging/observability
BCC (bpf tools) Python/C    Prototyping, existing tools (trace, execsnoop)
bpftool         CLI         Inspect loaded eBPF programs, maps, debugging
bpftrace one-liner: bpftrace -e 'kprobe:tcp_v4_connect { printf("connect %s\n", comm); }'

eBPF vs Traditional Networking

Aspect              Traditional (iptables/TC)     eBPF Networking
─────────────────────────────────────────────────────────────────────────────
Performance         Good                          Excellent (near-native)
Programmability     Limited (fixed actions)       Full (custom logic)
Safety              Kernel modules risky          Verifier-enforced safe
Dynamic Updates     Slow (rule enumeration)       Atomic, fast
Observability       Limited logs                  Rich metrics from eBPF
Learning Curve      Moderate (iptables)           Steep (eBPF + verifier)

Frequently Asked Questions

  1. What is the difference between eBPF and DPDK?
    eBPF runs inside the kernel (sandboxed), is safe, and needs no application changes, at somewhat higher latency than DPDK. DPDK bypasses the kernel entirely and runs in userspace with lower latency and extremely high throughput, but it requires applications to be rewritten against DPDK and loses the kernel networking stack and its security features.
  2. Does eBPF require kernel recompilation?
    No. eBPF is built into modern Linux kernels (4.1+ basic, 5.x+ advanced features). eBPF programs load dynamically via bpf() syscall. Kernel recompilation or reboot is not required.
  3. Is eBPF safe for production?
    Yes, if the program passes the verifier. The verifier ensures memory safety, bounded loops, and no kernel crashes. Production eBPF is used by Facebook (load balancing, DDoS protection at scale), Google, Netflix, and Cloudflare (DDoS mitigation).
  4. What Linux kernel version do I need for eBPF?
    Basic eBPF features: 4.1+, BPF maps, helper functions: 4.4+. XDP support: 4.8+. BTF (CO-RE): 5.2+. Full feature set: 5.8+. Use recent kernel (5.x) for best eBPF support.
  5. Can eBPF replace iptables?
    Yes, in many cases. eBPF provides higher performance, better scalability, and programmability. Cilium and Calico eBPF replace kube-proxy's iptables rules. Legacy iptables setups can be migrated to eBPF for modern workloads.
  6. What should I learn next after eBPF networking?
    After mastering eBPF networking, explore Cilium for Kubernetes networking, eBPF for observability (Hubble, Pixie), XDP for high-performance packet processing, bpftrace for dynamic tracing, eBPF for runtime security (Falco, Tetragon), and Cilium Service Mesh.