The goal of this blog post is to understand and document the security implications of user namespaces. Its main purpose is to compare two processes running on a Linux system as unprivileged users: one in the root user namespace and one in an unprivileged user namespace (i.e. a user namespace where UID 0 corresponds to an unprivileged user on the host).
What is a user namespace?
- Allows users to run as root (UID 0) with no real privilege to affect the host system or neighboring containers.
- Prevents exposure of the actual root user to the container.
Since we are only talking about security benefits, we won’t go into detail about the other benefits yet, but there are more reasons than this to use user namespaces. For example, you can build your container images on the assumption that they will always run as a specific user, say UID 1000, and that user can then be remapped to something else at runtime. Filesystem support for shifting file ownership is currently limited (read: non-existent), so you will end up having to chown the files to shift their UID/GID based on the range mapping you choose. There are solutions like shiftfs that are currently being evaluated by the community, but I will write up another blog post on that topic.
OK, I am back on the security issue.
What security implications do user namespaces have?
Let’s dive deeper into the two points that I mentioned in the previous section. But before that, let’s go back a bit in history to understand where the need for user namespaces stems from.
A long, long time ago, there was one single ruler in the land of the Linux kernel: root. This ruler could do anything and everything, which wasn’t really a big concern, except for the fact that there was no way to give someone access to a limited set of powers without making them the all-powerful root. Hence the powers were later broken down into a bunch of capabilities (except for the one powerful CAP_SYS_ADMIN, which is still the ruler of Linux). These capabilities were meant both to strip powers from root and to provide smaller powers to usual citizens (i.e. unprivileged processes) for whatever valid use cases they might have (for example, apache or nginx doesn’t need to run as root to bind to a privileged port like 80 or 443; it should just need CAP_NET_BIND_SERVICE). This worked fine for most people for some time, until containers showed up. There was a need to perform privileged operations inside a container that would affect only the ecosystem of the container and nothing outside it. So, a new namespace was created that could allow unprivileged processes to pretend to be root inside the confines of their own namespace. This particular namespace was meant to create namespaced (read: fake) users and capabilities.
Namespaced user and capabilities
What does that even mean? It means that UID 99 in the namespace could actually be UID 5000000 on the host. But the biggest implication is that you can even pretend to have UID 0 inside that namespace while actually being UID 500000 on the host. This is where several interesting possibilities open up.
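To make the mapping concrete, here is a minimal Python sketch of how a namespaced UID translates to a host UID. It assumes the three-column format the kernel exposes in /proc/<pid>/uid_map (first ID inside the namespace, first ID outside, length of the range). The names IdRange, parse_id_map, to_host_id and the EXAMPLE_MAP range are my own illustrative choices, not part of any tool.

```python
from typing import List, NamedTuple, Optional


class IdRange(NamedTuple):
    inside: int   # first ID as seen inside the user namespace
    outside: int  # first ID as seen from the outer (e.g. host) namespace
    length: int   # number of consecutive IDs covered by this range


def parse_id_map(text: str) -> List[IdRange]:
    """Parse the contents of a uid_map or gid_map file."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            ranges.append(IdRange(*(int(f) for f in fields)))
    return ranges


def to_host_id(ranges: List[IdRange], inside_id: int) -> Optional[int]:
    """Translate an ID inside the namespace to the ID outside it.

    Returns None when the ID is not covered by any range (unmapped).
    """
    for r in ranges:
        if r.inside <= inside_id < r.inside + r.length:
            return r.outside + (inside_id - r.inside)
    return None


# Hypothetical mapping: 65536 IDs starting at host UID 500000.
EXAMPLE_MAP = "         0     500000      65536\n"
```

With this assumed mapping, to_host_id(parse_id_map(EXAMPLE_MAP), 0) gives 500000, so "root" inside the namespace is just UID 500000 on the host. On a real system you would feed it the contents of /proc/self/uid_map instead of EXAMPLE_MAP.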
Similarly, a user with UID 1000 on the host can have the CAP_SYS_ADMIN capability inside the user namespace and will be able to do things like run the mount command to mount certain types of filesystems that they otherwise wouldn’t have been able to mount. Note that it still doesn’t provide them with the full set of capabilities needed to affect the host filesystem.
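If you want to see which capabilities a process holds, the kernel prints them as hexadecimal bitmasks on the CapEff, CapPrm and CapBnd lines of /proc/<pid>/status. The sketch below decodes such a mask for the two capabilities mentioned in this post. The bit numbers come from linux/capability.h; the helper name has_capability and the example masks are mine.

```python
# Bit positions as defined in linux/capability.h.
CAP_BITS = {
    "CAP_NET_BIND_SERVICE": 10,
    "CAP_SYS_ADMIN": 21,
}


def has_capability(mask_hex: str, name: str) -> bool:
    """Return True if capability `name` is set in a hex mask such as CapEff."""
    mask = int(mask_hex, 16)
    return bool((mask >> CAP_BITS[name]) & 1)


# Example masks (illustrative values, not taken from a real process):
NO_CAPS = "0000000000000000"         # what an unprivileged process usually shows
ONLY_NET_BIND = "0000000000000400"   # just bit 10 set
```

Running this against the CapEff value of a shell on the host and then of a shell started inside a fresh user namespace is a quick way to see the "fake" capabilities appear; keep in mind that the mask only tells you what the process holds relative to its own user namespace.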
Consider an application running inside a container as an unprivileged user, with all capabilities dropped so that its bounding set doesn’t include anything either, and with noNewPrivileges enabled so that it cannot gain privileges through execve. Now, there are two possible situations for this container:
- one with user namespaces enabled, such that the root user inside the container is mapped to an unprivileged user on the host,
- and a second with user namespaces disabled, such that the root user inside the container is the root user on the host.
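As a rough way to check that a process really is in the locked-down state described above, you can look at a few fields of /proc/<pid>/status: the Uid line, the capability masks, and NoNewPrivs (which, as far as I know, only shows up on reasonably recent kernels). This is a sketch under those assumptions; parse_status, is_locked_down and SAMPLE_STATUS are hypothetical names and data of my own.

```python
def parse_status(text: str) -> dict:
    """Turn the 'Key:\tvalue' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_locked_down(status_text: str) -> bool:
    """True if the process is non-root, holds no capabilities (including
    the bounding set), and has no_new_privs set. Missing or malformed
    fields are treated as 'not locked down'."""
    fields = parse_status(status_text)
    try:
        caps_empty = all(
            int(fields[name], 16) == 0 for name in ("CapPrm", "CapEff", "CapBnd")
        )
        # Uid line holds: real, effective, saved, filesystem UID.
        effective_uid = int(fields["Uid"].split()[1])
    except (KeyError, IndexError, ValueError):
        return False
    return caps_empty and effective_uid != 0 and fields.get("NoNewPrivs") == "1"


# Made-up excerpt of a status file for illustration.
SAMPLE_STATUS = (
    "Name:\tapp\n"
    "Uid:\t1000\t1000\t1000\t1000\n"
    "CapPrm:\t0000000000000000\n"
    "CapEff:\t0000000000000000\n"
    "CapBnd:\t0000000000000000\n"
    "NoNewPrivs:\t1\n"
)
```

Note that this check looks the same in both situations: it says nothing about whether UID 0 in the container is the host's root. That part depends on the UID mapping, which is exactly the difference the rest of this post is about.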
The threat model that we are working with is an attacker who is able to compromise an application and become root. We assume that this is possible, because we don’t want to spend time bikeshedding all the different ways to escalate privileges through application-level vulnerabilities. Now, let’s compare the possible attack vectors.
Attack Vectors: No user-namespaces
Without user namespaces, when a user escalates to root, they are the real root, albeit with no capabilities. This user has access to the filesystem and hence can at least read files owned by the root user, but they are confined to a mount namespace. They could read kernel memory or the memory of processes running as root on the host, if only they weren’t in a PID namespace.
If you look at the pattern here, there is always a single line of defense preventing this user from gaining control of the system. If history has taught us anything, it is that Linux has bugs. Any vulnerability that allows this attacker to escape from any one of the 7 namespaces can potentially allow bad things to happen.
- Escaping mount namespace could allow them to mount rogue filesystems or read potentially root-owned files
- Escaping PID namespace would allow them to read the memory of other processes running with root privileges
- Escaping network namespace would allow them to control firewall rules and open potential security holes. They can also control the packets.
- Escaping UTS namespace: I can’t think of anything too bad on top of network namespace
- Escaping cgroup namespace would allow them to perform a DoS on the system
- Escaping IPC namespace could allow them to tamper with the shared memory, semaphores and message queues used by critical processes on the host
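To see which of these namespaces actually separate two processes, you can compare the symlinks under /proc/<pid>/ns, whose targets look like mnt:[4026531840]. Two processes are in the same namespace of a given type when the numbers match. Below is a small sketch of that comparison; the function names are mine, and reading another user's /proc/<pid>/ns entries may need extra privileges.

```python
import os
import re

# Link targets under /proc/<pid>/ns look like "mnt:[4026531840]".
NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target: str):
    """Split a namespace link target into (type, identifier)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError("unexpected namespace link: %r" % target)
    return match.group(1), int(match.group(2))


def read_namespaces(pid="self") -> dict:
    """Map each entry in /proc/<pid>/ns to its namespace identifier."""
    ns_dir = "/proc/%s/ns" % pid
    result = {}
    for entry in os.listdir(ns_dir):
        try:
            _, ident = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        except (OSError, ValueError):
            continue  # unreadable or unexpected entry: skip it
        result[entry] = ident
    return result


def shared_namespaces(a: dict, b: dict) -> list:
    """Names of the namespaces that two processes have in common."""
    return sorted(name for name in a if name in b and a[name] == b[name])
```

For a containerized process you would call shared_namespaces(read_namespaces(), read_namespaces(some_pid)) from the host. Every name that shows up in the result is a wall that does not exist between the two processes.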
Finally, it is also possible for an attacker to regain certain privileges that were dropped. Simply being able to create a user namespace adds the CAP_SYS_ADMIN capability (the first process in every user namespace has CAP_SYS_ADMIN), albeit with limited functionality enabled. Now we are in the very weird territory of working out precisely which privileges the attacker has. There have, however, been some serious attacks that could be mounted with this limited functionality too.
Attack Vectors: User-namespaces
With user namespaces configured so that the root user within the namespace is mapped to an unprivileged user on the host, an attacker who escalates to root is still essentially an unprivileged user. So, even if they escape the container through any of the attack vectors mentioned in the previous section, they still can’t do anything that is privileged. To mount an attack on the host, they still need to find another privilege escalation vulnerability to escalate to root on the host. Simply put, user namespaces add a strong second layer of security on top, which can be defeated only through bugs in the kernel and not in the application.
The stinky bit
We talked about the benefits of user namespaces, but now let’s talk a bit about what’s wrong with them. The way user namespaces are currently implemented in Linux (this is mostly anecdotal; I am not a kernel developer, nor have I read the source code related to user namespaces), users are granted extra capabilities inside a user namespace that allow them to do things that they aren’t allowed to do on the host.
This opens up a whole lot of code paths that were previously accessible only to privileged users. For example, previously, only the root user could create namespaces. However, an exception was added for user namespaces, so now any unprivileged user can create just a user namespace. This elevates their privilege inside that namespace, which allows them to create any other namespace. This is how a rootless container works. This 2-step process has been simplified into a 1-step process, so when CLONE_NEWUSER is included in an unshare(2) call, you can also specify CLONE_NEW* for any other namespace.
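Here is roughly what that 1-step call looks like from Python, going through libc with ctypes. The CLONE_* values are the ones from the Linux headers; rootless_flags and the unshare wrapper are helpers I made up for this sketch. Expect the call to fail (for example with EPERM) on systems that restrict unprivileged user namespaces, and with EINVAL if the calling process has more than one thread.

```python
import ctypes
import os

# Namespace flags as defined in the Linux headers (linux/sched.h).
CLONE_NEWNS = 0x00020000      # mount namespace
CLONE_NEWCGROUP = 0x02000000
CLONE_NEWUTS = 0x04000000
CLONE_NEWIPC = 0x08000000
CLONE_NEWUSER = 0x10000000
CLONE_NEWPID = 0x20000000
CLONE_NEWNET = 0x40000000


def rootless_flags(*extra: int) -> int:
    """Always include CLONE_NEWUSER, then OR in any other namespaces."""
    flags = CLONE_NEWUSER
    for flag in extra:
        flags |= flag
    return flags


def unshare(flags: int) -> None:
    """Call unshare(2) via libc and raise OSError on failure.

    This moves the *calling* process into new namespaces, so run it in
    a throwaway child process rather than in a long-lived program.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

A rootless-container-style call would then be something like unshare(rootless_flags(CLONE_NEWNS, CLONE_NEWNET)). After it returns, the process has no UID mapping yet, so it still needs its uid_map and gid_map written before it looks like root inside the namespace.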
The bugs are being ironed out as they are found, which happens rather quickly because of the high demand for container images across the industry. People would like them to be secure as well as functional. But whether user namespaces prevent more attacks than they enable is a difficult question to answer right now, because there are some very serious pros and cons.
However, not all vulnerabilities are created equal, and I believe that preventing attacks in which vulnerable software allows privilege escalation is worth more than the fear of a zero-day against user namespaces.