Linux Network Namespaces
14 Mar 2023
Network namespaces entered the Linux kernel in version 2.6.24. They partition the use of the system resources associated with networking—network devices, addresses, ports, routes, firewall rules, etc.—into separate boxes, essentially virtualizing the network within a single running kernel instance¹.
Once upon a time, there was a patch set called “process containers”. The “container subsystem” allowed an administrator to group processes into hierarchies of “containers”; each hierarchy is managed by one or more “subsystems”. As of 2.6.23, virtualization was quite well supported on Linux, at least for the x86 architecture, but containers lagged a little behind. In 2.6.24, “containers” were renamed “control groups”. It makes sense to pair control groups with the management of the various namespaces and resource management in general to create a framework for a container implementation².
Current container implementations use network namespaces to give each container its own view of the network, untrammeled by processes outside of the container. The network namespace patches merged for 2.6.24 added a line to <linux/sched.h>:
#define CLONE_NEWNET 0x40000000 /* New network namespace */
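This flag can be passed to clone(2) or unshare(2) to place a process in a fresh network namespace. As a quick sketch of the effect, the unshare(1) utility from util-linux (assuming a reasonably recent version) can start a shell whose only interface is an isolated, downed loopback device; the exact output may differ slightly on your system:
$ sudo unshare --net bash    # bash now runs in a brand-new network namespace
# ip link show               # only an isolated loopback device is present, and it is DOWN
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# exit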
A central structure, defined in <include/net/net_namespace.h>, was used to keep track of all available network namespaces:
struct net {
    atomic_t count;          /* To decide when the network
                              * namespace should be freed
                              */
    atomic_t use_count;      /* To track references we
                              * destroy on demand
                              */
    struct list_head list;   /* list of network namespaces */
    struct work_struct work; /* work struct for freeing */

    struct proc_dir_entry *proc_net;
    struct proc_dir_entry *proc_net_stat;
    struct proc_dir_entry *proc_net_root;

    struct net_device *loopback_dev; /* The loopback */

    struct list_head dev_base_head;
    struct hlist_head *dev_name_head;
    struct hlist_head *dev_index_head;
};
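One user-visible consequence of the per-namespace proc_net entries is that /proc/net, which is a symbolic link to /proc/self/net, reflects the network namespace of the reading process. A small sketch, reusing unshare(1) from above:
$ sudo unshare --net cat /proc/net/dev   # the fresh namespace lists only lo
$ cat /proc/net/dev                      # the global namespace still lists eth0 as well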
Establish Communication Between Network Namespaces
Let’s first ssh into an Amazon EC2 instance and execute ip link show or ip link list to display the state of all network interfaces on the system:
$ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 12:57:06:38:46:eb brd ff:ff:ff:ff:ff:ff
where lo is the loopback interface and eth0 is the first Ethernet network interface. Now, let’s create two network namespaces:
$ sudo ip netns add ns-1
$ sudo ip netns add ns-2
$ ip netns list
ns-2
ns-1
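Under the hood, ip netns add creates a named handle for each namespace by bind-mounting it under /var/run/netns (or /run/netns on some distributions), which keeps the namespace alive even when no process is running inside it:
$ ls /var/run/netns
ns-1  ns-2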
Compare the ARP table of the global namespace with those of our manually created namespaces:
$ arp
Address HWtype HWaddress Flags Mask Iface
ip-172-31-80-1.ec2.inte ether 12:19:04:c5:6c:6d C eth0
$ sudo ip netns exec ns-1 arp # nothing
$ sudo ip netns exec ns-2 arp # nothing
To enable communication between these two network namespaces, we can make use of a virtual Ethernet (veth) device, which is a local Ethernet tunnel created in interconnected pairs. Let’s create a veth pair using the following command:
sudo ip link add veth-1 type veth peer name veth-2
Issue the ip link show command again to verify that the veth pair has been successfully created:
$ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 12:57:06:38:46:eb brd ff:ff:ff:ff:ff:ff
3: veth-2@veth-1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 6e:ee:6b:30:5f:bf brd ff:ff:ff:ff:ff:ff
4: veth-1@veth-2: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 92:0a:ed:2d:30:a7 brd ff:ff:ff:ff:ff:ff
The veth pair, at this moment, exists within the global network namespace. We need to place one end of the veth pair in ns-1 and the other end in ns-2:
sudo ip link set veth-1 netns ns-1
sudo ip link set veth-2 netns ns-2
Subsequently, the ip link show command should display only the original two network devices. To view the veth pair we have just created, execute the ip link show command inside the corresponding network namespace, for example:
$ sudo ip netns exec ns-1 ip link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: veth-1@if3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 92:0a:ed:2d:30:a7 brd ff:ff:ff:ff:ff:ff link-netns ns-2
Let’s assign each veth interface a unique IP address and bring it up:
sudo ip netns exec ns-1 ip addr add 10.0.0.10/16 dev veth-1 && sudo ip netns exec ns-1 ip link set dev veth-1 up
sudo ip netns exec ns-2 ip addr add 10.0.0.20/16 dev veth-2 && sudo ip netns exec ns-2 ip link set dev veth-2 up
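Before continuing, it may be worth double-checking the assignments and, optionally, bringing up the namespace-local loopback interfaces, which start out DOWN in a new namespace (as the earlier ip link show output inside ns-1 indicates):
$ sudo ip netns exec ns-1 ip -4 addr show dev veth-1   # expect 10.0.0.10/16 with state UP
$ sudo ip netns exec ns-1 ip link set dev lo up        # optional: enable loopback inside ns-1
$ sudo ip netns exec ns-2 ip link set dev lo up        # optional: enable loopback inside ns-2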
Now, we can open two separate terminals, one for each network namespace:
# Terminal 1
sudo ip netns exec ns-1 bash
# Terminal 2
sudo ip netns exec ns-2 bash
Execute the following command in Terminal 2 to capture ICMP traffic using tcpdump:
$ sudo ip netns exec ns-2 tcpdump -i veth-2 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth-2, link-type EN10MB (Ethernet), capture size 262144 bytes
Then, in Terminal 1, type:
ping 10.0.0.20
We should see in Terminal 2 the following output:
listening on veth-2, link-type EN10MB (Ethernet), capture size 262144 bytes
20:30:26.942779 IP 10.0.0.10 > ip-172-31-95-167.ec2.internal: ICMP echo request, id 3892, seq 1, length 64
20:30:26.942790 IP ip-172-31-95-167.ec2.internal > 10.0.0.10: ICMP echo reply, id 3892, seq 1, length 64
20:30:27.944142 IP 10.0.0.10 > ip-172-31-95-167.ec2.internal: ICMP echo request, id 3892, seq 2, length 64
20:30:27.944162 IP ip-172-31-95-167.ec2.internal > 10.0.0.10: ICMP echo reply, id 3892, seq 2, length 64
20:30:28.956516 IP 10.0.0.10 > ip-172-31-95-167.ec2.internal: ICMP echo request, id 3892, seq 3, length 64
20:30:28.956535 IP ip-172-31-95-167.ec2.internal > 10.0.0.10: ICMP echo reply, id 3892, seq 3, length 64
20:30:29.980557 IP 10.0.0.10 > ip-172-31-95-167.ec2.internal: ICMP echo request, id 3892, seq 4, length 64
20:30:29.980577 IP ip-172-31-95-167.ec2.internal > 10.0.0.10: ICMP echo reply, id 3892, seq 4, length 64
20:30:31.004562 IP 10.0.0.10 > ip-172-31-95-167.ec2.internal: ICMP echo request, id 3892, seq 5, length 64
20:30:31.004582 IP ip-172-31-95-167.ec2.internal > 10.0.0.10: ICMP echo reply, id 3892, seq 5, length 64
Now, let’s take a look at the current state of the ARP table in each of our network namespaces:
# ip --all netns exec arp
netns: ns-2
Address HWtype HWaddress Flags Mask Iface
10.0.0.10 ether a6:ae:c7:74:56:b8 C veth-2
netns: ns-1
Address HWtype HWaddress Flags Mask Iface
10.0.0.20 ether 06:22:d8:20:50:0b C veth-1
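Meanwhile, the ARP table of the global namespace is unaffected by the traffic between the two namespaces; running arp there should still show only the entry we saw at the beginning:
$ arp
Address                  HWtype  HWaddress           Flags Mask            Iface
ip-172-31-80-1.ec2.inte  ether   12:19:04:c5:6c:6d   C                     eth0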
Also, inspect the routing table and packet filtering rules within one of the network namespaces, say ns-1:
$ sudo ip netns exec ns-1 route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
10.0.0.0 0.0.0.0 255.255.0.0 U 0 0 0 veth-1
$ sudo ip netns exec ns-1 iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
The iptables -L command displays the filter table by default, which is empty at this point. Each chain has a default policy of ACCEPT.
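To confirm that these rules are scoped to a single namespace, here is a small sketch (assuming the veth setup above is still in place) that drops incoming ICMP inside ns-2; pings from ns-1 then fail, while the filter table of the global namespace remains untouched:
$ sudo ip netns exec ns-2 iptables -A INPUT -p icmp -j DROP
$ sudo ip netns exec ns-1 ping -c 1 -W 1 10.0.0.20            # now fails with 100% packet loss
$ sudo iptables -L                                            # the global filter table is unchanged
$ sudo ip netns exec ns-2 iptables -D INPUT -p icmp -j DROP   # clean up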
Delete our manually created network namespaces:
sudo ip netns delete ns-1
sudo ip netns delete ns-2
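Deleting a namespace destroys the veth endpoint that lived inside it, and destroying one end of a veth pair removes its peer as well, so nothing is left behind:
$ ip netns list    # no output
$ ip link show     # back to just lo and eth0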
Footnotes
1. See Jake Edge, “Namespaces in operation, part 7: Network namespaces,” LWN.net, January 22, 2014. The relevant Linux manual page can also serve as a good starting point.
2. See Jonathan Corbet, “Notes from a container,” LWN.net, October 29, 2007.