Why your 3 AM nightmare keeps happening
Picture this. It’s 3:17 AM. Your phone is buzzing. Slack is on fire. Users are furious. Your dashboards? All green.
Been there. Last winter I spent four hours chasing a “phantom” latency spike that cost my team thousands in lost sales. The culprit? A single Python microservice doing batch uploads every 15 minutes. Traditional tools showed normal traffic patterns. Meanwhile, our checkout flow crawled.
Here’s what they don’t tell you: 60% of network latency hides where most tools can’t see it. A USENIX study proved this. One bad process. That’s all it takes.
The tools aren’t broken. They’re blind.
I used to love iftop. Thought it was magic. Then I realized…
It shows interfaces. Not processes. It’s like having a city’s traffic report when you need to know which specific driver keeps blocking the bridge.
What happens next is predictable:
- We restart services randomly
- Add more servers (expensive guesswork)
- Watch users leave when 3-second delays become minutes
Sound familiar?
Meet your new best friend: bpftool
Think of eBPF as X-ray vision for your network. bpftool is the remote control.
Best part? It’s already on your Linux box. Free. No vendors. No sales calls.
Let’s set this up in 10 minutes
Step 1: Check your kernel
grep bpf /proc/filesystems
See nodev bpf? You’re golden. If not, grab coffee and update your kernel.
Step 2: Install bpftool
# Ubuntu/Debian
sudo apt install linux-tools-common linux-tools-$(uname -r)
sudo apt install clang llvm libbpf-dev   # compiler + headers for Step 3
# RHEL/CentOS
sudo yum install bpftool
sudo yum install clang llvm libbpf-devel
Step 3: The actual magic
Create latency.c:
// latency.c: time each tcp_sendmsg() call, per process.
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

// Entry timestamps keyed by PID. One slot per PID, so overlapping sends
// from threads of the same process can overwrite each other.
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 4096);
    __type(key, u32);
    __type(value, u64);
} latency_map SEC(".maps");

SEC("kprobe/tcp_sendmsg")
int BPF_KPROBE(tcp_sendmsg_entry)
{
    u64 start_time = bpf_ktime_get_ns();
    u32 pid = bpf_get_current_pid_tgid() >> 32;

    bpf_map_update_elem(&latency_map, &pid, &start_time, BPF_ANY);
    return 0;
}

SEC("kretprobe/tcp_sendmsg")
int BPF_KRETPROBE(tcp_sendmsg_exit)
{
    u64 end_time = bpf_ktime_get_ns();
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 *start_time = bpf_map_lookup_elem(&latency_map, &pid);

    if (start_time) {
        u64 latency = end_time - *start_time;
        bpf_printk("PID %d latency: %llu ns", pid, latency);
        bpf_map_delete_elem(&latency_map, &pid);
    }
    return 0;
}

// kprobes and bpf_printk require a GPL-compatible license.
char LICENSE[] SEC("license") = "GPL";
Step 4: Compile and run
# Generate vmlinux.h from the running kernel (needs CONFIG_DEBUG_INFO_BTF)
bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
clang -O2 -g -target bpf -I. -c latency.c -o latency.o
# Load both programs, pin them, and auto-attach the kprobe and kretprobe
# (the autoattach keyword needs a reasonably recent bpftool)
sudo bpftool prog loadall latency.o /sys/fs/bpf/latency autoattach
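Before moving on, it's worth a quick sanity check that both programs actually loaded and attached (exact output varies by kernel and bpftool version):
sudo bpftool prog show | grep tcp_sendmsg
sudo bpftool link show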
Step 5: Watch the culprits
sudo cat /sys/kernel/debug/tracing/trace_pipe
You’ll see lines like:
PID 2341 latency: 125000 ns
PID 5678 latency: 890000 ns
That's 0.125 ms and 0.89 ms, tagged by the process that made the call (trace_pipe also prefixes each line with the task name and a timestamp).
Making it actually useful
Raw numbers are nice. Context is better.
I pipe this to a simple script that:
- Maps PIDs to service names
- Tracks 95th percentile latency
- Alerts when any service hits 5ms+
The difference? Instead of guessing, I know exactly which container needs attention.
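Here's a minimal sketch of that kind of consumer, kept in C to match the rest of the post. Everything in it is illustrative: the latency_watch.c name, the 1,000-sample window, and the 5 ms threshold are my assumptions, it resolves PIDs to process names via /proc rather than to real service names, and it computes one global p95 instead of one per service.

/* latency_watch.c (hypothetical): consume the probe's trace_pipe output.
 * Usage: sudo cat /sys/kernel/debug/tracing/trace_pipe | ./latency_watch
 * Build: cc -O2 -o latency_watch latency_watch.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define WINDOW   1000                 /* samples per percentile report */
#define ALERT_NS (5ULL * 1000 * 1000) /* flag anything over 5 ms */

static int cmp_u64(const void *a, const void *b)
{
    unsigned long long x = *(const unsigned long long *)a;
    unsigned long long y = *(const unsigned long long *)b;
    return (x > y) - (x < y);
}

/* Best-effort PID -> name lookup; the process may already be gone. */
static void pid_name(int pid, char *buf, size_t len)
{
    char path[64];
    snprintf(path, sizeof(path), "/proc/%d/comm", pid);
    FILE *f = fopen(path, "r");
    if (!f || !fgets(buf, (int)len, f))
        snprintf(buf, len, "pid-%d", pid);
    if (f)
        fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
}

int main(void)
{
    static unsigned long long samples[WINDOW];
    size_t n = 0;
    char line[512];

    while (fgets(line, sizeof(line), stdin)) {
        /* trace_pipe prefixes each line; match the bpf_printk part only. */
        char *msg = strstr(line, "PID ");
        int pid;
        unsigned long long ns;
        if (!msg || sscanf(msg, "PID %d latency: %llu ns", &pid, &ns) != 2)
            continue;

        char name[64];
        pid_name(pid, name, sizeof(name));

        if (ns >= ALERT_NS)
            printf("ALERT %s (pid %d): %.2f ms\n", name, pid, ns / 1e6);

        samples[n++] = ns;
        if (n == WINDOW) {            /* one p95 report per full window */
            qsort(samples, n, sizeof(samples[0]), cmp_u64);
            printf("p95 over last %d sends: %.2f ms\n",
                   WINDOW, samples[(size_t)(WINDOW * 0.95)] / 1e6);
            n = 0;
        }
    }
    return 0;
}

In practice you'd map PIDs to container or service names through your orchestrator's metadata instead of /proc, and push the alerts somewhere noisier than stdout.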
Real-world wins
Last month, this caught a logging service that spiked to 200ms every 5 minutes. Turned out someone enabled debug mode. Fixed in 30 seconds.
Another time, it exposed a Redis client that wasn’t pooling connections. Saved us from a $12k/month over-provision.
Beyond the basics
Once you’re comfortable:
- Swap tcp_sendmsg for udp_sendmsg to catch UDP issues
- Add a BPF_MAP_TYPE_PERCPU_ARRAY for better performance at scale (see the sketch below)
- Set latency thresholds to reduce noise
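On the per-CPU point: one common pattern, and this is an assumption about where you'd take it rather than anything from the setup above, is to swap the bpf_printk for a per-CPU latency histogram that user space dumps on its own schedule. A sketch that could sit alongside the existing map in latency.c (needs a kernel new enough for bounded loops, roughly 5.3+):

// Per-CPU histogram: 64 log2 buckets; each CPU updates its own copy,
// so there is no cross-CPU contention on the hot path.
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 64);
    __type(key, u32);
    __type(value, u64);
} lat_hist SEC(".maps");

static __always_inline void record_latency(u64 ns)
{
    // Bucket n holds latencies of roughly [2^n, 2^(n+1)) nanoseconds.
    u32 bucket = 0;
    while (ns > 1 && bucket < 63) {
        ns >>= 1;
        bucket++;
    }
    u64 *count = bpf_map_lookup_elem(&lat_hist, &bucket);
    if (count)
        (*count)++;    // per-CPU copy, so a plain increment is fine
}

Call record_latency(latency) from the kretprobe in place of the bpf_printk, and read the buckets with sudo bpftool map dump name lat_hist; bpftool prints each CPU's copy, so sum them per bucket when you aggregate.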
Remember: Every millisecond you save is a millisecond your users don’t wait.
Your turn
Try this on a test server first. Run a few curl commands. Watch the output. Then imagine having this running 24/7.
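For a first test, something as dumb as this in one terminal while trace_pipe is open in another will do (the URL is just an example):
curl -s https://example.com > /dev/null
curl -s https://example.com > /dev/null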
The 3 AM calls? They become 3 PM coffee breaks.
Questions? Hit me up. I’ve got the scars to prove this works.







