Sanitized Input

Docker and netns

For the most part, living in a container world is awesome. Things are safer and more predictable. But sometimes you get stuck on one of those 'weird' problems. It worked in dev, it worked in your test environment, it works in production... except for that one cluster. What's going on?!

If you've been around for a while, you may have gotten used to Linux tools that let you dig into how a particular application is behaving. This usually doesn't solve the problem, but it does get you from 'something somewhere must be different' to 'it's a problem with X'.

One of the tools in my toolbox is ip netns. If you are unfamiliar with the command, I encourage you to read the man page. The super short description is that it lets you manage network namespaces and execute commands inside them. Since each Docker container normally gets its own network namespace, you can see the utility of stepping into a container's network namespace without needing to exec into the container (where there may not be any tools, or even a shell).

But I don't mean to teach you ip netns itself. This post shows how, with a bit of effort, you can use it to troubleshoot your Docker containers.

The Problem

Let's pretend that I have a Kubernetes pod that uses an Envoy sidecar and something has gone sideways. After ruling out the easy stuff, we still don't know exactly what is going on, although we assume it must be some interaction between the Envoy sidecar and the application. Unfortunately, the communication between these two never leaves the pod; everything we see leaving the host comes from the Envoy sidecar. Wouldn't it be nice to use tools like tcpdump and netstat to help narrow down the issue? Unfortunately, those tools aren't present in our container. In fact, our container doesn't even have a shell.

A perfect use case for ip netns! There is just one problem...

```
chris@lab-worker-01:~$ sudo ip netns ls
chris@lab-worker-01:~$
```

We don't see any network namespaces with the tool!

A Solution

The reason we don't see any of our container network namespaces is that ip netns only lists namespaces that have an entry in /var/run/netns, and Docker does not put its namespaces there. In a few steps we can correct that.
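
If you want to convince yourself that the namespaces really do exist even though ip netns ls shows nothing, here is a minimal sketch that collects the distinct network namespaces referenced under /proc. The function name and the optional proc_dir argument are mine, added purely for illustration; lsns -t net from util-linux should give you a nicer view of the same information.

```shell
# Print every distinct network namespace that some process is using.
# proc_dir is a hypothetical parameter so the function can be pointed
# at a fake tree; by default it reads the real /proc.
list_net_namespaces() {
    local proc_dir="${1:-/proc}"
    local link
    for link in "$proc_dir"/[0-9]*/ns/net; do
        # Links that belong to other users are unreadable without root;
        # skip anything we cannot read instead of failing.
        readlink "$link" 2>/dev/null || true
    done | sort -u
}
```

Run it as root (for example from a sudo -i shell) to see every namespace. Each line of output looks like net:[<inode>], and more than one line means something on the host is running in its own network namespace.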

First we need to identify some information about the Docker container we are interested in. In my case, the container is managed by Kubernetes, so it makes sense to use the pause container: every container in a pod shares the pause container's network namespace.

```
chris@lab-worker-01:~$ docker ps | grep sanitized | grep pause
36760ba729fd   k8s.gcr.io/pause:3.1                "/pause"                 2 hours ago    Up 2 hours              k8s_POD_sanitized-589d79c675-96494_sanitized_55743fe1-ff75-4ce5-81ed-963e3a1d1297_0
chris@lab-worker-01:~$
```

Now we can use docker inspect to get the pause container's full ID and its PID.

```
chris@lab-worker-01:~$ docker inspect 36760ba729fd | grep -E "Id|Pid"
    "Id": "36760ba729fde97c4edd051637c18d89efde149e0d65386eee13eb26dd187333",
    "Pid": 2726664,
    "PidMode": "",
    "PidsLimit": null,
chris@lab-worker-01:~$
```
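
If you would rather not grep through JSON, the same two values can be pulled out with Docker's Go-template formatting. This is a rough sketch, not the one true way: the function names are made up, and pause_container_for assumes the k8s_POD_<pod>_... container naming seen in the docker ps output above, which may differ on your cluster.

```shell
# Print the short ID of the pause container for a pod.
# Assumption: pause containers are named k8s_POD_<pod name>_...
pause_container_for() {
    docker ps --quiet --filter "name=k8s_POD_$1"
}

# Print "<full container id> <host pid>" for one container.
container_id_and_pid() {
    docker inspect --format '{{.Id}} {{.State.Pid}}' "$1"
}
```

For example, set -- $(container_id_and_pid 36760ba729fd) leaves the full ID in $1 and the PID in $2.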

ip netns looks in /var/run/netns for network namespaces, so let's create that directory if it doesn't already exist (mkdir -p does nothing when it does).

```
chris@lab-worker-01:~$ sudo mkdir -p /var/run/netns
chris@lab-worker-01:~$
```

We will use the container ID as a unique name for the namespace (any name would do). Let's create a file for it.

```
chris@lab-worker-01:~$ sudo touch /var/run/netns/36760ba729fde97c4edd051637c18d89efde149e0d65386eee13eb26dd187333
chris@lab-worker-01:~$
```

For the final step, we bind mount the container's network namespace onto the file we just created, which is where the ip netns command expects it. We can find the namespace in /proc using the PID we got from docker inspect. The path should be /proc/<pid>/ns/net.

```
chris@lab-worker-01:~$ sudo mount -o bind /proc/2726664/ns/net /var/run/netns/36760ba729fde97c4edd051637c18d89efde149e0d65386eee13eb26dd187333
chris@lab-worker-01:~$ sudo ip netns ls
36760ba729fde97c4edd051637c18d89efde149e0d65386eee13eb26dd187333 (id: 1)
chris@lab-worker-01:~$
```
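
If you find yourself doing this often, the mkdir, touch and mount steps fit in one small function. This is only a sketch of the steps above; netns_attach is a name I made up, and it assumes sudo is available. Newer iproute2 releases also document ip netns attach NAME PID, which appears to do the same thing in one step, so check the man page on your host first.

```shell
# Make the network namespace of PID visible to `ip netns` as NAME.
# Usage: netns_attach NAME PID
netns_attach() {
    local name="$1" pid="$2"
    if [ -z "$name" ] || [ -z "$pid" ]; then
        echo "usage: netns_attach NAME PID" >&2
        return 1
    fi
    # Create the directory ip netns reads, then an empty file to act
    # as the mount point, then bind mount the namespace onto it.
    sudo mkdir -p /var/run/netns &&
    sudo touch "/var/run/netns/$name" &&
    sudo mount -o bind "/proc/$pid/ns/net" "/var/run/netns/$name"
}
```

With the values from this post, that would be netns_attach 36760ba729fde97c4edd051637c18d89efde149e0d65386eee13eb26dd187333 2726664, although a shorter name works just as well.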

Success!!

Now you can run commands from the container's network namespace without being in the container! To prove that we are in the pod's network namespace, let's look at the IP information in this namespace.

```
chris@lab-worker-01:~$ sudo ip netns exec 36760ba729fde97c4edd051637c18d89efde149e0d65386eee13eb26dd187333 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
        valid_lft forever preferred_lft forever
3: eth0@if168: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue state UP group default
    link/ether 0e:64:dd:68:8a:45 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.60.248.186/32 brd 10.60.248.186 scope global eth0
        valid_lft forever preferred_lft forever
chris@lab-worker-01:~$
```

And now let's compare that to what Kubernetes tells us about the pod.

```
[kube-lab]chris@milliways:~$ kubectl get po -o wide -n sanitized sanitized-589d79c675-96494
NAME                         READY   STATUS    RESTARTS   AGE    IP              NODE            NOMINATED NODE   READINESS GATES
sanitized-589d79c675-96494   1/1     Running   0          152m   10.60.248.186   lab-worker-01   <none>           <none>
[kube-lab]chris@milliways:~$
```
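
Another quick sanity check: two paths refer to the same namespace when they resolve to the same device and inode. Here is a rough sketch of that comparison, assuming GNU stat (for the -c format option) and sudo; same_netns is a name I made up.

```shell
# Succeed if both paths resolve to the same device and inode, which for
# namespace files means they are the same namespace.
# Usage: same_netns PATH_A PATH_B
same_netns() {
    local a b
    # -L follows the /proc/<pid>/ns/net symlink to the namespace itself.
    a=$(sudo stat -L -c '%d:%i' "$1") || return 2
    b=$(sudo stat -L -c '%d:%i' "$2") || return 2
    [ "$a" = "$b" ]
}
```

Calling same_netns /proc/2726664/ns/net /var/run/netns/<name> should succeed for the bind mount created earlier, and fail if you compare against /proc/1/ns/net on a host where the pod has its own namespace.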

Now you can do all kinds of interesting things, like capturing packets on the container's loopback interface or looking at network stats with netstat.
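
To save some typing, a tiny wrapper around ip netns exec goes a long way. This is just a sketch: netns_run is a hypothetical helper name, and the commands in the comments are examples of host tools you might run, not a prescription.

```shell
# Run a host command inside a named network namespace.
# Usage: netns_run NAME COMMAND [ARGS...]
netns_run() {
    local name="$1"
    shift
    sudo ip netns exec "$name" "$@"
}

# Examples (NAME is whatever you called the namespace):
#   netns_run NAME tcpdump -i lo -nn    # capture traffic on the pod's loopback
#   netns_run NAME netstat -tlnp        # listening TCP sockets in the pod
```

The tools come from the host, so nothing needs to be installed in the container image.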

When you are done debugging, it's probably a good idea to clean up after yourself.

```
chris@lab-worker-01:~$ sudo umount /var/run/netns/36760ba729fde97c4edd051637c18d89efde149e0d65386eee13eb26dd187333
chris@lab-worker-01:~$
chris@lab-worker-01:~$ sudo rm /var/run/netns/36760ba729fde97c4edd051637c18d89efde149e0d65386eee13eb26dd187333
chris@lab-worker-01:~$ sudo ip netns ls
chris@lab-worker-01:~$
```
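
The cleanup can be wrapped up the same way. Again a minimal sketch with a made-up function name; the guard on an empty name is there so a typo cannot end up pointing umount and rm at /var/run/netns itself.

```shell
# Undo the bind mount and remove the placeholder file for NAME.
# Usage: netns_detach NAME
netns_detach() {
    local name="$1"
    if [ -z "$name" ]; then
        echo "usage: netns_detach NAME" >&2
        return 1
    fi
    # Only remove the file once the unmount has succeeded.
    sudo umount "/var/run/netns/$name" &&
    sudo rm "/var/run/netns/$name"
}
```

The container's namespace itself is untouched by this; only the extra name under /var/run/netns goes away.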

I hope you find this useful and that it saves you a bit of frustration!