nsenter experiments
Last updated on Aug 06, 2025

During the process of building Kubernetes containers from scratch, I had some trouble understanding how nsenter works and how to use it effectively. I decided to experiment with nsenter and related tools to gain a better understanding of their behavior and usage patterns. This document summarizes my findings and provides a guide for others who may want to explore similar concepts. The nsenter version used is from alpine-minirootfs-3.20.3.

1. Set Up Minimal Root Filesystems

mkdir -p /root/tung/{pause,a0,a1}
cd /root/tung
# Change the architecture to match your machine, e.g. `x86_64` or `aarch64`
wget https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/aarch64/alpine-minirootfs-3.20.3-aarch64.tar.gz
tar -xzf alpine-minirootfs-3.20.3-aarch64.tar.gz -C pause
tar -xzf alpine-minirootfs-3.20.3-aarch64.tar.gz -C a0
tar -xzf alpine-minirootfs-3.20.3-aarch64.tar.gz -C a1
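
Rather than editing the architecture by hand, the download URL can be derived from `uname -m`. This is a minimal sketch: the function name `alpine_minirootfs_url` is hypothetical, and it assumes that `uname -m` reports the same architecture name that Alpine uses in its release paths (this holds for `x86_64` and `aarch64`, but not necessarily for other architectures).

```shell
# Hypothetical helper: build the minirootfs URL used above for one architecture.
# Assumption: the architecture string matches Alpine's release directory name.
alpine_minirootfs_url() {
    arch="${1:-$(uname -m)}"
    version="3.20.3"
    branch="v${version%.*}"   # 3.20.3 -> v3.20
    echo "https://dl-cdn.alpinelinux.org/alpine/${branch}/releases/${arch}/alpine-minirootfs-${version}-${arch}.tar.gz"
}
```

It could then be used as `wget "$(alpine_minirootfs_url)"`, with the tar commands adjusted to the downloaded file name.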

Create marker files to identify each root:

touch /a-host
touch /root/tung/pause/a-pause
touch /root/tung/a0/a-a0
touch /root/tung/a1/a-a1
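
Every case below checks which marker file is visible, so a tiny helper can make that check explicit. This is only a sketch: the function name `which_root` is hypothetical, and it relies on nothing but the `a-*` naming of the marker files created above.

```shell
# Hypothetical helper: print the marker file(s) found directly under a root.
# With no argument it inspects the current root "/".
which_root() {
    root=${1:-/}
    for f in "${root%/}"/a-*; do
        # The glob stays literal when nothing matches, hence the -e check.
        [ -e "$f" ] && echo "${f##*/}"
    done
    return 0
}
```

Pasted into any of the shells started later, `which_root` prints the marker of the root that shell actually sees, and `which_root /root/tung/a0` inspects a specific directory from the host.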

2. Launch a Container in New Namespaces

Start a shell in new namespaces using unshare:

unshare -Cunimpf chroot /root/tung/pause /bin/sh
# in the new shell
mount -t proc proc /proc
ps
# should see
PID   USER     TIME  COMMAND
    1 root      0:00 /bin/sh
    3 root      0:00 ps
  • unshare -Cunimpf creates new cgroup, UTS, network, IPC, mount, and PID namespaces; -f forks so that the chroot and sh commands run as a child inside the new PID namespace, which is why sh is PID 1
  • The command above does not use --mount-proc; procfs is mounted by hand inside the new shell instead. With --mount-proc[=mountpoint], unshare itself mounts a new, private procfs (at /proc by default) just before running the program
  • --mount-proc=/root/tung/pause/proc doesn't work on my Ubuntu 22.04 with kernel 5.15.0-141-generic, but it works on Alpine Linux with kernel 6.1.0-37-arm64

3. Find the Container PID

In another terminal:

# find the PAUSE process's PID, the process that is forked from the unshare command
ps aux | grep /bin/sh
PAUSE_PID=<pid>
# change current dir
cd /root/tung
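
Grepping for /bin/sh can match unrelated shells. As an alternative sketch, the PAUSE process can be located by its root directory, since its /proc/<pid>/root link points at /root/tung/pause. The function name `pids_with_root` and its optional second argument are hypothetical; reading the root link of other processes normally requires root privileges.

```shell
# Hypothetical helper: print the PIDs whose root directory is the given path.
# The second argument (proc root) exists only for testing; it defaults to /proc.
pids_with_root() {
    want=$1
    proc=${2:-/proc}
    for dir in "$proc"/[0-9]*; do
        [ "$(readlink "$dir/root" 2>/dev/null)" = "$want" ] && echo "${dir##*/}"
    done
    return 0
}
```

At this point `PAUSE_PID=$(pids_with_root /root/tung/pause)` should yield a single PID, as long as nothing else is running inside the pause shell. After cases 11 and 12 start more processes with the same root, the output has to be filtered by hand.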

4. Explore nsenter Usage Patterns

Case 1: Run nsenter without arguments

nsenter
  • If no program is given, ${SHELL} is run by default; its value is taken from the environment of the calling shell, not from the target process
  • ls shows files in /root/tung
  • pwd returns /root/tung
  • ls / shows host root, including /a-host
  • mount | grep /proc shows proc on /proc type proc
  • ps aux shows host's processes

Case 2: Change Root with --root

nsenter --root=/root/tung/a0 /bin/sh
  • The program /bin/sh must exist in the new root
  • --root changes the filesystem root like chroot, e.g. chroot /root/tung/a0
  • ls shows files in /root/tung, even though the root is /root/tung/a0
    • The --root flag only changes the root directory of the new process; it does not change the current working directory (CWD)
    • The new shell's root directory is indeed /root/tung/a0, but because nsenter leaves the CWD alone, the shell stays in the directory we were in when we ran nsenter
  • pwd returns empty
    • This is caused by the combination of a changed root and a CWD that lies outside the new root
    • pwd tries to print the absolute path of the current directory, but /root/tung is not under the new root /root/tung/a0, so there is no valid path to display
  • ls / shows files in /root/tung/a0, should see a-a0
  • mount shows mount: no /proc/mounts
  • ps aux shows nothing
  • mount -t proc proc /proc runs successfully
    • ps aux shows host's processes
    • Clean: umount /proc

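Case 3: Target PAUSE Without Namespace Flags

nsenter -t $PAUSE_PID
  • Same as Case 1. -t only selects the target process; without a namespace flag such as -a, -m, or -p, nsenter does not join any of PAUSE's namespaces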
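
To check which namespaces a shell has actually joined, the links under /proc/<pid>/ns can be compared between two processes: two processes share a namespace when the corresponding links read the same. The sketch below assumes it is run where /proc is the host's procfs; the function name `ns_compare` and its optional third argument are hypothetical.

```shell
# Hypothetical helper: report, per namespace type, whether two PIDs share it.
# The third argument (proc root) exists only for testing; it defaults to /proc.
ns_compare() {
    pid_a=$1
    pid_b=$2
    proc=${3:-/proc}
    for ns in cgroup ipc mnt net pid uts; do
        link_a=$(readlink "$proc/$pid_a/ns/$ns") || continue
        link_b=$(readlink "$proc/$pid_b/ns/$ns") || continue
        if [ "$link_a" = "$link_b" ]; then
            echo "$ns same"
        else
            echo "$ns different"
        fi
    done
}
```

For example, running `ns_compare "$$" "$PAUSE_PID"` inside a shell started by nsenter lists which of the six namespaces that shell shares with PAUSE.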

Case 4: Set Root to PAUSE's Root (Fails if Shell Missing)

nsenter -t $PAUSE_PID -r
  • No program is given, so ${SHELL} is run, which is /bin/bash on Debian by default
  • -r sets the root directory. If no directory is specified, the root directory of the target process (here, the PAUSE process) is used
  • However, this command fails with nsenter: failed to execute /bin/bash: No such file or directory, because /bin/bash doesn't exist in the root / of the PAUSE process

Case 5: Explicitly Run /bin/sh in PAUSE's Root

nsenter -t $PAUSE_PID -r /bin/sh
  • Use -r to set the root dir to the PAUSE process's root dir, which is /root/tung/pause
  • ls shows files in /root/tung, even though the root is /root/tung/pause
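    • Note: /proc/$PAUSE_PID/root reads as a symbolic link to /root/tung/pause. According to proc(5) it is not merely a symbolic link: it provides the same view of the filesystem, including mounts, as the target process itself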
  • pwd returns empty
  • ls / shows files in /root/tung/pause, should see a-pause
  • mount shows mount: no /proc/mounts
    • The mount command, when run without arguments, reads /proc/mounts to list the currently mounted filesystems. Without the --mount/-m and --pid/-p flags, nsenter enters neither the target's mount namespace nor its PID namespace
    • cat /proc/mounts shows cat: can't open '/proc/mounts': No such file or directory, even though ls /proc still shows PAUSE's processes, so a procfs is visible at /proc
    • A likely explanation: /proc/mounts is a symbolic link to /proc/self/mounts, and /proc/self only resolves for a process that lives in the PID namespace this procfs was mounted for. Our shell is not in PAUSE's PID namespace, so the link cannot be resolved
  • ps aux shows PAUSE's processes in the PAUSE's PID namespaces but not the current ps aux process
    • ps aux command (also top or htop) relies on the contents of the /proc filesystem, not /proc/mounts
  • mount -t proc proc /proc shows error mount: mounting proc on /proc failed: Invalid argument
    • This is because we didn't enter the new mount namespace first
    • When we run mount -t proc proc /proc without being in the correct mount namespace, the command tries to mount a procfs onto a directory that might already have a different filesystem mounted or isn't a valid mount point. In our case, the /root/tung/pause/proc is already mounted and we are trying to remount from the host's namespaces

Case 6: Change Root to a0 in PAUSE's Namespaces

nsenter -t $PAUSE_PID --root=/root/tung/a0 /bin/sh
  • ls shows files in /root/tung, even though the root is /root/tung/a0
  • pwd returns empty
  • ls / shows files in /root/tung/a0, should see a-a0
  • mount shows mount: no /proc/mounts
  • ps aux shows nothing
  • mount -t proc proc /proc runs successfully
    • ps aux shows host's processes
    • Clean: umount /proc

Case 7: Join All Namespaces of PAUSE and Change Root

nsenter -t $PAUSE_PID -a --root=/root/tung/a0 /bin/sh
  • ls shows files in host's root /, should see a-host, even though the root is /root/tung/a0
  • pwd returns empty
  • ls / shows files in /root/tung/a0, should see a-a0
  • mount shows mount: no /proc/mounts
    • This is because the mount command reads /proc/mounts, and /proc here is /root/tung/a0/proc, which is empty or does not exist
  • ps aux shows nothing
    • This is because ps aux looks for the processes in /proc, which is /root/tung/a0/proc, which is empty
  • mount -t proc proc /proc shows error mount: mounting proc on /proc failed: Invalid argument
    • The mount command tries to mount procfs at /proc, which is /root/tung/a0/proc
    • According to the nsenter man page, a directory given to --root is opened before nsenter switches to the requested namespaces, so /root/tung/a0 was resolved in the host's mount namespace while the shell now runs in PAUSE's mount namespace
    • A plausible explanation is that the kernel refuses to mount onto a directory that belongs to a different mount namespace than the caller's. Confirming this would require reading nsenter's implementation and the kernel's mount code

Case 8: Join All Namespaces of PAUSE

nsenter -t $PAUSE_PID -a /bin/sh
  • ls shows files in host's root /, should see a-host
  • pwd shows host's root /
  • ls / shows files in host's root /, should see a-host
  • mount shows proc on /proc type proc and proc on /root/tung/pause/proc type proc
  • ps aux shows host's processes

Case 9: Join All Namespaces and Set Root to PAUSE's Root

nsenter -t $PAUSE_PID -a -r /bin/sh
  • ls shows files in host's root /, should see a-host, even though the root is /root/tung/pause
  • pwd returns empty
  • ls / shows files in /root/tung/pause, should see a-pause
  • mount shows proc on /proc type proc
  • ps shows:
    • 1 root 0:00 /bin/sh belonging to the /bin/sh command from unshare command
    • 5 root 0:00 /bin/sh belonging to the /bin/sh command from current nsenter process
    • 7 root 0:00 ps belonging to the current ps command
  • This case is close to what we expected, except that ls and pwd still reflect a working directory outside the new root
    • To resolve this, we can run nsenter -t $PAUSE_PID -a -r /bin/sh -c "cd /; exec /bin/sh" instead, but this is rather cumbersome
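
The util-linux nsenter also documents a -w/--wd option that sets the working directory, defaulting to the working directory of the target process when no directory is given. Assuming the nsenter on the host supports it, the sketch below would avoid the `cd /; exec` wrapper. The function name `enter_pause` and the `NSENTER` override are hypothetical, and this variant was not part of the experiments above, so treat it as an untested suggestion.

```shell
# Hypothetical wrapper: join all of PAUSE's namespaces, take its root (-r)
# and its working directory (-w), then run a program (default: /bin/sh).
# NSENTER can be overridden (e.g. NSENTER=echo) to print the arguments
# instead of running nsenter.
enter_pause() {
    pid=$1
    shift
    [ "$#" -gt 0 ] || set -- /bin/sh
    ${NSENTER:-nsenter} -t "$pid" -a -r -w "$@"
}
```

A dry run such as `NSENTER=echo enter_pause "$PAUSE_PID"` shows the nsenter arguments that would be used; `enter_pause "$PAUSE_PID"` runs it for real.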

Case 10: Change Root to PAUSE's Root Before Entering Namespaces

nsenter -t $PAUSE_PID -a --root=/root/tung/pause /bin/sh
  • ls shows files in host's root /, should see a-host, even though the root is /root/tung/pause
  • pwd returns empty
  • ls / shows files in /root/tung/pause, should see a-pause
  • mount shows mount: no /proc/mounts
    • Why does mount show proc on /proc type proc with nsenter -t $PAUSE_PID -a -r /bin/sh, but mount: no /proc/mounts with nsenter -t $PAUSE_PID -a --root=/root/tung/pause /bin/sh?
    • According to the nsenter man page, a directory given explicitly to --root is opened before nsenter switches to the requested namespaces. /root/tung/pause is therefore resolved in the context of the host's filesystem, where no procfs is mounted on /root/tung/pause/proc
    • The key difference is that -r without a path takes the root from the actual state of the target process, so it sees the mounts of PAUSE's mount namespace. --root=... names a path in the host's view of the filesystem, which does not include the procfs mounted inside the new mount namespace
  • mount -t proc proc /proc shows error mount: mounting proc on /proc failed: Invalid argument
    • This is the same error as in Case 7, which also passes an explicit path to --root together with -a. A plausible cause is that the root was resolved in the host's mount namespace while the shell runs in PAUSE's mount namespace
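
The difference between the two ways of naming PAUSE's root can be observed from the host without entering any namespace, by counting the entries of the proc directory through each path. This is a sketch: the function name `count_entries` is hypothetical, and it assumes PAUSE_PID is set as in step 3.

```shell
# Hypothetical helper: count the entries of a directory
# (prints 0 if the directory is empty or cannot be read).
count_entries() {
    ls -A "$1" 2>/dev/null | wc -l | tr -d ' '
}

# Only runs when PAUSE_PID is set, as in step 3.
if [ -n "${PAUSE_PID:-}" ]; then
    # Plain path: resolved in the host's mount namespace.
    echo "host path:  $(count_entries /root/tung/pause/proc)"
    # /proc/<pid>/root: resolved in the target's view of the filesystem.
    echo "proc root:  $(count_entries "/proc/$PAUSE_PID/root/proc")"
fi
```

If the explanation above is right, the first count should be zero and the second should be non-zero, because only PAUSE's mount namespace has procfs mounted on that directory.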

Case 11: Create Container a0 using chroot in PAUSE's Namespaces

nsenter -t $PAUSE_PID -a chroot /root/tung/a0 /bin/sh
  • ls shows files in /root/tung/a0, should see a-a0
  • pwd returns /
  • ls / shows files in /root/tung/a0, should see a-a0
  • mount shows mount: no /proc/mounts
  • mount -t proc proc /proc runs successfully
  • ps shows:
    • 1 root 0:00 /bin/sh belonging to the /bin/sh command from unshare command
    • 5 root 0:00 /bin/sh belonging to the /bin/sh command from current nsenter process
    • 7 root 0:00 ps belonging to the current ps command
  • Clean up by running umount /proc; otherwise procfs stays mounted on /root/tung/a0/proc inside PAUSE's namespaces
    • Note: in the host's namespaces nothing is mounted on /root/tung/a0/proc, but in PAUSE's namespaces procfs is mounted there. We can verify this from the host's namespaces by running ls /root/tung/a0/proc, which shows an empty directory
    • If we don't want to clean up, we must use overlayfs, e.g. mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work merged

When nsenter is combined with chroot, we first enter all namespaces of the target process (including the mount namespace) and only then run chroot inside those namespaces. Our shell therefore runs in the same mount namespace as the PAUSE process, and we have the correct permissions and context to perform mounts such as mount -t proc proc /proc.

Command | When is the root path resolved? | Which /proc do you see?
--root=/root/tung/pause | Before entering namespaces | Host's (may be empty/missing)
chroot /root/tung/pause or chroot /proc/$PAUSE_PID/root | After entering namespaces | PAUSE's (mounted and populated)

This is exactly what we expected when we want to create container A0 in the same namespaces as the PAUSE container.

Case 12: Use PAUSE Container in Its Own Namespaces

nsenter -t $PAUSE_PID -a chroot /root/tung/pause /bin/sh
  • ls shows files in /root/tung/pause, should see a-pause
  • pwd returns /
  • ls / shows files in /root/tung/pause, should see a-pause
  • ps shows:
    • 1 root 0:00 /bin/sh belonging to the /bin/sh command from unshare command
    • 5 root 0:00 /bin/sh belonging to the /bin/sh command from current nsenter process
    • 7 root 0:00 ps belonging to the current ps command

This is exactly what we expected when we want to execute commands in a container.

5. Key Learnings

  • Use nsenter to join the namespaces of another process
  • Using chroot, run through nsenter, is the recommended way to change the filesystem root, e.g. cases 11, 12
  • Using --root/-r to change the filesystem root is not recommended: an explicit --root path is resolved before nsenter enters the target's namespaces, and the working directory is left unchanged, e.g. cases 2, 5, 7, 10
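
The recommended pattern from cases 11 and 12 can be wrapped in a small function. This is a minimal sketch: the function name `run_in_rootfs` and the `NSENTER` override are hypothetical, and the nsenter invocation is the one used in those two cases.

```shell
# Hypothetical helper following cases 11 and 12: join all namespaces of the
# target first, then chroot from inside them and run a program
# (default: /bin/sh).
# NSENTER can be overridden (e.g. NSENTER=echo) for a dry run.
run_in_rootfs() {
    pid=$1
    rootfs=$2
    shift 2
    [ "$#" -gt 0 ] || set -- /bin/sh
    ${NSENTER:-nsenter} -t "$pid" -a chroot "$rootfs" "$@"
}
```

For example, `run_in_rootfs "$PAUSE_PID" /root/tung/a0` corresponds to case 11 and `run_in_rootfs "$PAUSE_PID" /root/tung/pause` to case 12.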