Automating rootless Docker host updates with Ansible

V__ · 2025-11-22T13:37:34 1763818654

As much as I enjoy the advantages that docker compose, for example, brings, I feel it falls short when it comes to deployment, especially when running it rootless or with rootless images. I wish I could configure docker to just create a user for me based on the project name and make sure the permissions for the volumes are correct when I run compose up.

Nextgrid · 2025-11-22T21:43:27 1763847807

Rootless containers make no sense to me:

First scenario: the machine is single-purpose and protects a single asset (confidential data, access to a privileged network, etc). In this case, XKCD 1200 (https://xkcd.com/1200/) applies: attackers can already steal all the valuable goods using the application's user and do not need to escalate local privileges.

Second scenario: the machine is multi-purpose and spans multiple security domains. In this case, keep in mind the Linux kernel is a sieve when it comes to local privilege escalations and you need to use hypervisor-level isolation (separate VMs) anyway, and then you're back to single-purpose VMs where every individual workload can happily be root in its VM and do away with the cargo cult.

ramses0 · 2025-11-22T23:27:16 1763854036

There was some great LWN commentary a while back about Linux permissions being borked in the modern era... arguing that mount-level (instead of mixed file-level) permissions would be a better modern model.

Maybe something like OpenBSD's "pledge", where user-invoked processes don't get all capabilities automatically?

Linux has been too "high trust" for a while now, and I don't know what the appetite is for us all digging out of it...

Nextgrid · 2025-11-23T00:11:07 1763856667

There are two issues - one is that the permission model of Linux may not be suitable for modern workloads, but the second is that Linux is a huge, constantly-moving beast written in a memory-unsafe language and has regular privilege escalation exploits. Addressing the former still won’t address the latter.

Hypervisor-based security seems to be the least worst way to deal with this problem currently, and indeed appears to be a successful defense given cloud providers’ bottom-lines.

Helmut10001 · 2025-11-23T07:09:07 1763881747

(author of the blog post)

I fully agree with your argument: Hypervisor isolation is the best for multi-tenant security. In a single-purpose VM, the primary threat is often the application itself. There are two primary reasons for me to use docker in a rootless namespace:

1. It narrows the attack surface & simplifies operations: Running the Docker daemon itself as root presents a high-value target. A vulnerability in the daemon (like a flaw in the API, `containerd`, `runc`, etc.) becomes an instant "game over" for the entire host. The benefits of running the daemon in a user namespace are:

    - Security: A privilege escalation vulnerability within the Docker daemon itself no longer yields root on the host. The attacker breaks out into the context of an unprivileged user (mastodon, keycloak, etc.), with no sudo rights and limited access to the filesystem.
    - Isolation: As a practical benefit, each service gets its own independent Docker daemon. If I misconfigure or crash the Docker environment for Service A, it has zero impact on Service B. This is a big advantage over a single, monolithic rootful daemon managing all containers.
    - File Ownership: It solves the persistent file permission headache. Data volumes or mounted folders are owned by the rootless service user (mastodon:mastodon) on the host filesystem, not by root, which simplifies backups, migrations, and debugging. This is actually the biggest advantage to me. I discuss this a bit in my original Mastodon post. [1]
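To make the file-ownership point concrete, here is a minimal sketch of how container UIDs translate to host UIDs under a rootless daemon. It assumes the usual rootless mapping, where container root maps to the service user and container UIDs from 1 upward map into that user's `/etc/subuid` range. The helper names, the host UID 1001, and the range `100000:65536` are illustrative assumptions, not values taken from the blog post.

```python
from typing import NamedTuple


class SubIdRange(NamedTuple):
    """One /etc/subuid entry: first subordinate UID and how many follow it."""
    start: int
    count: int


def parse_subuid_line(line: str) -> tuple[str, SubIdRange]:
    """Parse a line such as 'mastodon:100000:65536' (hypothetical values)."""
    name, start, count = line.strip().split(":")
    return name, SubIdRange(int(start), int(count))


def host_uid(container_uid: int, user_uid: int, sub: SubIdRange) -> int:
    """Translate a UID seen inside the container to the UID seen on the host.

    Assumed mapping (typical for rootless setups):
      container 0      -> the unprivileged service user itself
      container 1..N   -> sub.start .. sub.start + N - 1
    """
    if container_uid == 0:
        return user_uid
    if 1 <= container_uid <= sub.count:
        return sub.start + container_uid - 1
    raise ValueError(f"container UID {container_uid} is outside the mapped range")


# Hypothetical service user 'mastodon' with host UID 1001.
name, sub_range = parse_subuid_line("mastodon:100000:65536")
root_on_host = host_uid(0, 1001, sub_range)      # files written by container root
app_on_host = host_uid(991, 1001, sub_range)     # files written by container UID 991
```

Under these assumptions, files written by container root show up on the host as owned by the service user, while files written by a non-root container user show up under a subordinate UID from that user's range. Which case applies depends on the user the image runs as.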

2. A great tradeoff for resource-constrained environments: Yes, a fleet of single-purpose VMs is ideal. But it's often not feasible from a resource or cost perspective, especially in a homelab or small business environment. My stack is a compromise that layers security:

    Proxmox (Hypervisor) -> Unprivileged LXC (OS-level isolation) -> Rootless Docker (User-space isolation)

This stack allows me to run ~30 distinct services across ~10 LXCs on a single machine with an average CPU utilization of just 1-2%. Achieving this level of service density with full VMs would be impossible on the same hardware due to memory and CPU overhead.

Rootless Docker is the final layer that provides meaningful separation within the cost-effective LXC containers.

You're right to point out that the kernel can be a sieve. No single layer is perfect. But the goal of defense in depth is to force an attacker to defeat multiple, distinct security mechanisms to achieve their goal.

One last point: This principle is so important that newer tools like Podman were designed from the ground up to be rootless by default, which I'd recommend for anyone starting fresh today.

[1]: https://du.nkel.dev/blog/2023-12-12_mastodon-docker-rootless...