Best practice and common sense tell us to avoid running commands and services as root whenever possible. Sounds good, but how do we do that in Docker containers? Unfortunately, there isn't a single standard way to do it. Docker leaves it up to each image developer and each operator to manage how the container should be run. Much of the time, it involves passing UIDs and GIDs to a container via environment variables, but that's not always the case. And what are UIDs and GIDs anyway? Should you create a user on the host first? These are all great questions that aren't often covered. In this post, I hope to answer them and give a primer on user permissions in Docker so you're prepared to ask the right questions.
Note: This post applies to using Docker on Linux. Most of it should also apply on other OSes, where Docker runs inside a Linux VM, but the implementation details may vary slightly. Bind mounts in Docker Desktop for macOS, for example, pass through a file-sharing layer between the host and the VM rather than being native Linux mounts.
First, let's very briefly review what a container is. Without going into too much detail, I find it helpful to think of a Docker container as simply an isolated environment to run a command in. That command could be bash, or it could be a web server like lighttpd, but the key takeaway is that we run a single command in an isolated environment.
Isolation (as opposed to virtualization) in this context means that containers share the same kernel with the host. They just have different "views" of kernel resources partitioned using Linux namespaces.
UIDs and GIDs
That brings us to the important topic of UIDs and GIDs. Every user and group in Linux has an associated numeric identifier called a UID and GID, respectively. Whenever permissions are assigned or checked, it's always the numeric IDs that are used, not the user or group names.
The important thing to note when it comes to Docker containers is that UIDs and GIDs are used in the kernel, while user names and group names are handled outside of the kernel (usually in the /etc/passwd and /etc/group files). That means Docker containers, by default, share the same set of UIDs and GIDs with the host and other containers, but they don't have the same list of user and group names as the host. So UID 1001, which belongs to a user named brian on the host, for example, may very well map to a different user name, or to no name at all, inside a container.
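To make the split between numeric IDs and names concrete, here is a small sketch that can be run on any Linux host or inside a container. It assumes GNU or BusyBox stat (for the -c option) and that getent is installed; the scratch file name comes from mktemp and is not significant.

```shell
# Create a scratch file; it is owned by whoever runs this script.
demo_file=$(mktemp)

# Numeric owner and group, which is what the kernel stores and checks.
stat -c 'numeric: uid=%u gid=%g' "$demo_file"

# The same ownership resolved to names via the local user database.
stat -c 'names:   user=%U group=%G' "$demo_file"

# Name lookup is a separate step; it can fail while the UID still works.
getent passwd "$(id -u)" || echo "no passwd entry for UID $(id -u)"

file_uid=$(stat -c '%u' "$demo_file")
rm -f "$demo_file"
```

Running the same commands on the host and inside a container should show identical numbers for the same UID, while the names may differ or be missing.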
Sharing UIDs/GIDs with the host can be problematic, especially if the process inside the container runs as the default UID 0, or root. Even for non-root users though, care should be taken to ensure the process running inside the container isn't owned by a UID that has elevated privileges on the host. Otherwise, potential vulnerabilities in code running inside the container could allow access to sensitive information, denial of service attacks, or privilege escalation.
Running as Non-Root Users
Running as a non-root user is a good idea whenever possible. Unfortunately, it's sometimes easier said than done. That's because there are a few different methods for handling permissions in Docker containers, and the procedure is not universal across all container images.
The Dockerfile USER Instruction
First, there's the USER instruction, which can be used in a Dockerfile to change the current user while an image is being built. Any command executed during the build after the USER instruction runs as the UID provided. The last USER instruction in the Dockerfile also becomes the default user for the ENTRYPOINT and/or CMD, so setting it to a non-root UID means the container effectively starts as that user.
The USER instruction is, of course, only evaluated when the container image is built, so the default it sets is baked into the image. It can, however, be overridden at runtime, as we'll see next.
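As a rough illustration, the sketch below writes a minimal Dockerfile to a scratch directory and builds it. The alpine base image, the user-demo tag, the /data directory, and UID/GID 5000 are all example choices, not something a particular image requires. The build is skipped when the docker CLI isn't installed.

```shell
# Scratch build context; nothing outside this directory is touched.
workdir=$(mktemp -d)

# Hypothetical Dockerfile: base image, path, and IDs are example values.
cat > "$workdir/Dockerfile" <<'EOF'
FROM alpine
# Still root here, so we can prepare a directory for the unprivileged UID.
RUN mkdir /data && chown 5000:5000 /data
# Later build steps and the container's default command run as 5000:5000.
USER 5000:5000
CMD ["id"]
EOF

# Build and run only when the docker CLI is available.
if command -v docker >/dev/null 2>&1; then
    docker build -t user-demo "$workdir"
    docker run --rm user-demo   # prints the IDs the default command runs as
fi
```

Note that the privileged setup step comes before USER. Once the user has been switched, later RUN steps can no longer chown files they don't own.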
The --user Option
The --user option of the docker run command allows you to override the USER instruction in the image when running a container. You provide a UID (and optionally, a GID) as an argument and it executes the command inside the container as that UID. Note that the UID/GID provided does not actually need to have an associated user in /etc/passwd on the host or in the container.
It's important to note that this can break permissions on files inside the container. If a user with UID 1001 is provided, but certain necessary files inside the container can't be read by UID 1001, you may see failures. A prior understanding of how the container operates is often required.
Consider this example with the Nginx container image.
    $ docker run -it --rm --user 5000:5000 nginx ls -la /usr/sbin/nginx
    -rwxr-xr-x 1 root root 1330984 Mar 3 2020 /usr/sbin/nginx
    $
    $ docker run -it --rm --user 5000:5000 nginx nginx
    2021/01/03 20:05:50 [warn] 1#1: the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
    nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
    2021/01/03 20:05:50 [emerg] 1#1: mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
    nginx: [emerg] mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
In the example above, we started an Nginx container with the --user option, set the UID and GID to 5000, and checked the permissions on the nginx binary. Worth noting is that UID/GID 5000 does not exist in the host's or the container's /etc/passwd file. It doesn't have to, since only UIDs/GIDs are used when assigning or checking file permissions. In the output, we see that the nginx binary can be executed by "other", so our UID 5000 will still be able to execute it.
Next, we started another container with the same options to try it out, this time executing the nginx binary as the command instead. While we did have permission to run the executable, it still exited with an error: UID 5000 couldn't create /var/cache/nginx/client_temp, because we failed to follow the instructions for this particular image.
According to the docker-nginx README, "it is possible to run the image as a less privileged arbitrary UID/GID. This, however, requires modification of nginx configuration to use directories writeable by that specific UID/GID pair".
Utilizing an Entrypoint Script
A very common pattern in container image development is to run an "entrypoint" shell script as the container's command. The shell script ensures that certain setup actions or checks always occur when the container is started, before handing off to the actual command to be run, typically with exec so that the command replaces the shell process.
For example, many container images read environment variables to override default configuration options. It's common for container images to allow setting the UID and GID via an environment variable in this way. Some images even implement a check to ensure that the current UID isn't 0 (root). Once again, note that the UID/GID provided does not actually need to have an associated user in /etc/passwd on the host.
While it's a very commonly used and powerful tool, it's also not universal. How it's handled depends on how the developers chose to implement it. The quick-start guides in an image's README also rarely explain what's recommended or what the default is.
As with the --user option, providing an arbitrary UID can break things. A prior understanding of how the container operates is often required. Using both the --user option and environment variables together can also be problematic, as the entrypoint script itself will run as the UID provided in the --user option and may not have the privileges it needs to switch to the UID from the environment variable. Generally, you wouldn't do that, but again it's always specific to the image itself and what the entrypoint script does.
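To give a feel for the pattern, here is a minimal entrypoint sketch. The variable names PUID and PGID, the default of 1000, and the helper names validate_id and run_as are all hypothetical; every image picks its own. It also assumes setpriv from util-linux is present in the image; other images use tools such as gosu or su-exec for the same step.

```shell
#!/bin/sh
# Hypothetical entrypoint sketch. PUID/PGID are illustrative names,
# not a Docker standard.

# Succeed only if the argument is a non-negative integer other than 0.
validate_id() {
    case "$1" in
        ''|*[!0-9]*) echo "invalid id: '$1'" >&2; return 1 ;;
    esac
    if [ "$1" -eq 0 ]; then
        echo "refusing to run as UID/GID 0" >&2
        return 1
    fi
}

# Replace the shell with the given command, dropping privileges if needed.
run_as() {
    uid="${PUID:-1000}"
    gid="${PGID:-1000}"
    validate_id "$uid" && validate_id "$gid" || return 1
    if [ "$(id -u)" -eq 0 ]; then
        # Started as root: switch IDs, then exec the real command.
        exec setpriv --reuid="$uid" --regid="$gid" --clear-groups "$@"
    fi
    # Already non-root (for example via --user): PUID/PGID cannot be applied.
    exec "$@"
}

# Hand off only when a command was passed to the script.
if [ "$#" -gt 0 ]; then
    run_as "$@"
fi
```

The second branch of run_as is where --user and the environment variables collide: a script that is already running as a non-root UID has no way to switch to a different one.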
Linux User Namespaces
Lastly, we can also utilize Linux user namespaces, a kernel feature that isolates UIDs and GIDs by mapping the IDs seen inside a namespace onto a range of subordinate "real" IDs on the host. In this way, UID 0 (root) in a container can be mapped to an unprivileged UID on the host, as an example.
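The mapping itself is simple arithmetic over a contiguous range. The sketch below assumes a hypothetical subordinate range in /etc/subuid format (name:start:count); the dockremap name and the 100000:65536 values are only example values, and map_uid is a helper made up for this illustration.

```shell
# Hypothetical subordinate range in /etc/subuid format: name:start:count
SUBUID_ENTRY="dockremap:100000:65536"

# Print the host UID a container UID maps to, or fail if it is out of range.
map_uid() {
    container_uid="$1"
    start=$(echo "$SUBUID_ENTRY" | cut -d: -f2)
    count=$(echo "$SUBUID_ENTRY" | cut -d: -f3)
    if [ "$container_uid" -ge "$count" ]; then
        echo "UID $container_uid is outside the mapped range" >&2
        return 1
    fi
    echo $((start + container_uid))
}

map_uid 0      # container root  -> prints 100000
map_uid 5000   # container 5000  -> prints 105000
```

With a range like this, a process that is root inside the container shows up on the host as UID 100000, which has no special privileges there.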
However, using user namespaces has specific prerequisites and isn't enabled by default in Docker installations. When used, it's enabled on the Docker daemon and applies to all containers by default. For this reason, it's not as commonly used or talked about. For more information, I'd recommend reading through the Docker documentation.
Permissions Problems with Mounts
When dealing with permissions in Docker, problems most commonly show up with bind mounts and volumes. Permission problems aren't specific to files mounted into a container, of course, but that's likely where you'll encounter them most often. Since bind mounts and volumes are commonplace, it's worth spending some time on them specifically.
What you need to remember for bind mounts and volumes is that the UID/GID of the process running inside the container still requires permissions on the mapped directories in the host filesystem. Even if there's no associated user name in /etc/passwd on the host, the mounted files still need to permit the appropriate access to the UID/GID of the process running inside the container. Again, it's the UID/GID that is checked, not the user/group name. You don't need to create the user on the host; you just need to make sure the UID/GID used inside the container has the appropriate permissions on the files it uses, whether inside or outside the container.
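One way to reason about this before starting a container is to compare the numeric IDs against the directory's owner, group, and mode bits. The can_write helper below is a hypothetical sketch of that check. It assumes stat supports the -c option (GNU or BusyBox) and deliberately ignores ACLs, supplementary groups, capabilities, and read-only mounts.

```shell
# can_write UID GID DIR
# Succeed if the owner/group/other mode bits would let UID:GID create
# files in DIR. Simplified: no ACLs, supplementary groups, or capabilities.
can_write() {
    uid="$1"; gid="$2"; dir="$3"
    [ "$uid" -eq 0 ] && return 0          # root bypasses mode bits

    owner=$(stat -c '%u' "$dir")          # numeric owner
    group=$(stat -c '%g' "$dir")          # numeric group
    mode=$(stat -c '%a' "$dir")           # octal mode, e.g. 755
    perms=$((0$mode))                     # parse the octal digits

    # Only the first matching class (owner, group, other) is consulted.
    if [ "$uid" -eq "$owner" ]; then
        bits=$(( (perms >> 6) & 7 ))
    elif [ "$gid" -eq "$group" ]; then
        bits=$(( (perms >> 3) & 7 ))
    else
        bits=$(( perms & 7 ))
    fi

    # Creating a file needs both write (2) and search (1) on the directory.
    [ $(( bits & 3 )) -eq 3 ]
}

# Example: would UID/GID 5000 be able to create files in ~/test?
if [ -d "$HOME/test" ]; then
    can_write 5000 5000 "$HOME/test" && echo "writable" || echo "not writable"
fi
```

For a directory owned by root:root with mode 755, UID 5000 falls into the "other" class, which has read and search but not write, so the check fails.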
Consider this example using a bind mount.
    $ docker run -it --rm --user 5000:5000 -v ~/test:/test nginx touch /test/file
    touch: cannot touch '/test/file': Permission denied
    $
    $ ls -la ~/ | grep test
    drwxr-xr-x 2 root root 4096 Jan 3 10:13 test
    $
    $ docker run -it --rm --user 5000:5000 -v ~/test:/test nginx ls -la / | grep test
    drwxr-xr-x 2 root root 4096 Jan 3 10:31 test
In the example, we started an Nginx container using the --user option to set the UID and GID to 5000 and attached a bind mount. For the command, we attempted to create a file in the bind-mounted directory using the touch /test/file command. The command failed with a "Permission denied" error because the ~/test directory on the host, mounted at /test in the container, was owned by root and UID/GID 5000 didn't have write permissions.
To correct this problem, we can change the ownership of the test directory on the host as shown below. After doing so, we no longer encounter a permissions error and the file is created successfully.
    $ sudo chown 5000:5000 test
    $
    $ ls -la ~/ | grep test
    drwxr-xr-x 2 5000 5000 4096 Jan 3 10:13 test
    $
    $ docker run -it --rm --user 5000:5000 -v ~/test:/test nginx ls -la / | grep test
    drwxr-xr-x 2 5000 5000 4096 Jan 3 10:31 test
    $
    $ docker run -it --rm --user 5000:5000 -v ~/test:/test nginx touch /test/file
    $
Just to reiterate this point, UID/GID 5000 did not exist in the host's or the container's /etc/passwd file. It doesn't have to, since only UIDs/GIDs are used when assigning or checking file permissions. Additionally, the reason we were still able to run the touch command is that it can be executed by "other", as shown below.
    $ docker run -it --rm --user 5000:5000 -v ~/test:/test nginx ls -la /bin/touch
    -rwxr-xr-x 1 root root 97152 Feb 28 2019 /bin/touch
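Changing ownership on the host is one option. Another, sketched below, is to run the container with the numeric IDs of whoever already owns the directory, here the invoking host user. The scratch directory and the variable names user_spec and demo_dir are only for illustration, the run is skipped when the docker CLI isn't installed, and behavior may differ under rootless Docker or Docker Desktop.

```shell
# Use the same numeric IDs as the invoking host user.
user_spec="$(id -u):$(id -g)"

# Scratch directory owned by the current user (mktemp -d creates it mode 700).
demo_dir=$(mktemp -d)

# Run only when the docker CLI is available.
if command -v docker >/dev/null 2>&1; then
    docker run --rm --user "$user_spec" -v "$demo_dir:/test" nginx touch /test/file
    ls -ln "$demo_dir"   # numeric owner of the file created from the container
fi
```

Because the IDs match the directory's owner, no chown is needed on the host. As before, the image itself still has to tolerate running as that UID.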