Hello Linux Gurus,
I am seeking divine inspiration.
I don’t understand the apparent lack of hypervisor-based kernel protections in desktop Linux. It seems there is a significant opportunity for improvement beyond the basics of KASLR, stack canaries, and shadow stacks. However, I don’t see much work in this area on desktop Linux, and people who are much smarter than me develop for the kernel every day yet have not produced the specific advanced protections I describe below. Where is the gap in my understanding? Is this task so difficult or costly that the open source community cannot afford it?
Windows PCs, recent Macs, iPhones, and a few Android vendors such as Samsung run their kernels atop a hypervisor. This design permits introspection and enforcement of security invariants from outside or underneath the kernel. Common mitigations include protection of critical data structures such as page table entries, function pointers, or SELinux decisions to raise the bar on injecting kernel code. Hypervisor-enforced kernel integrity appears to be a widespread and at least somewhat effective mitigation on other OSs, yet it doesn’t appear to be common on desktop Linux.
Meanwhile, in the desktop Linux world, users are lucky if a distribution even implements secure boot and offers signed kernels. Popular software packages often require short-circuiting this mechanism so the user can build and install kernel modules, such as the NVIDIA and VirtualBox drivers. SELinux is uncommon, so on most installations root access is more or less equivalent to kernel privileges, including the ability to introduce arbitrary code into the kernel. TPM-based disk encryption is only officially supported experimentally by Ubuntu and is usually linked to secure boot, while users are largely on their own elsewhere. Taken together, this feels like a missed opportunity to implement additional defense-in-depth.
It’s easy to put code in the kernel. I can do it in a couple of minutes for a “hello world” module. It’s really cool that I can do this, but is it a good idea? Shouldn’t somebody try and stop me?
Please insert your unsigned modules into my brain-kernel. What have I failed to understand, or why is this the design of the kernel today? Is it an intentional omission? Is it somehow contrary to the desktop Linux ethos?
You absolutely can if you want to. Xen has been around for decades, and most people who do GPU passthrough are technically doing something similar with plain Linux. Xen is the closest to what Microsoft does: on Windows, Hyper-V runs underneath and Windows runs on top of it, which is similar to Xen and its special dom0.
But fundamentally the hard part is that the freedoms of Linux bring an almost infinite combination of possible distros, kernels, modules and software. Each module is compiled for the exact version of the kernel you run. The module must be signed with a key that kernel trusts, which in practice usually means the key of whoever built the kernel, and each distro has its own set of kernels and modules. Those keys need to be trusted by the bootloader. So when you try to download the new NVIDIA driver directly from their site, you run into problems. And somehow this entire mess needs to link back to one source of trust at the root of the chain.
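As a rough illustration of the module-signing part, here is a minimal Python sketch that checks whether a module file carries an appended signature at all. It relies on the marker string that the kernel's module signing appends to the end of a signed module. The function names are made up for this example, and the check says nothing about whether the signature is valid or whether your kernel trusts the signing key. Compressed modules (.ko.xz, .ko.zst) would need to be decompressed first.

```python
from pathlib import Path

# Marker appended to the very end of a signed kernel module file.
MODULE_SIG_MARKER = b"~Module signature appended~\n"


def has_appended_signature(data: bytes) -> bool:
    """Return True if the raw module bytes end with the signature marker.

    This only detects that *a* signature is present. It does not verify
    the signature or check which key produced it.
    """
    return data.endswith(MODULE_SIG_MARKER)


def module_file_is_signed(path: str) -> bool:
    """Hypothetical helper: apply the marker check to an uncompressed .ko file."""
    return has_appended_signature(Path(path).read_bytes())
```

You would call module_file_is_signed() with the path to an uncompressed .ko file on your own system. A True result only means the file was signed by someone, which is exactly the problem described above: the kernel still has to trust that someone.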
Microsoft on the other hand controls the entire OS experience, so who signs what is pretty straightforward. Windows drivers are also very portable: one driver can work from Windows Vista to 11, so it’s easy to evaluate one developer and sign their drivers. That’s just one signature. And the Microsoft root cert is preloaded on every motherboard, so it just works.
So Linux distros that do support secure boot properly will often have to prompt the user to install their own keys (which is a UX nightmare of its own), because FOSS likes to do things right by giving full control to the user. Ideally you manage your own keys, so even a developer from a distro can’t build a signed kernel/module to exploit you: you are the root of trust. That’s also a UX nightmare, because average users are good at losing keys and locking themselves out.
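If you want to see whether secure boot is actually on before worrying about keys, here is a small Python sketch. It assumes a UEFI system with efivarfs mounted at /sys/firmware/efi/efivars, where each variable file starts with 4 bytes of attributes followed by the data. The function names are invented for this example.

```python
from pathlib import Path
from typing import Optional

# The SecureBoot variable lives under the EFI global variable GUID.
SECURE_BOOT_VAR = Path(
    "/sys/firmware/efi/efivars/"
    "SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b2c"
)


def secure_boot_enabled(efivar_bytes: bytes) -> Optional[bool]:
    """Interpret the raw efivarfs contents of the SecureBoot variable.

    The first 4 bytes are variable attributes; the 5th byte is the value
    (1 = enabled, 0 = disabled). Returns None if the data is too short.
    """
    if len(efivar_bytes) < 5:
        return None
    return efivar_bytes[4] == 1


def read_secure_boot_state() -> Optional[bool]:
    """Return True/False, or None when the variable cannot be read
    (legacy BIOS boot, efivarfs not mounted, no permission, ...)."""
    try:
        return secure_boot_enabled(SECURE_BOOT_VAR.read_bytes())
    except OSError:
        return None


if __name__ == "__main__":
    print("secure boot:", read_secure_boot_state())
```

This only reports the firmware's state. It does not tell you whose keys are enrolled, which is the part that actually decides who your root of trust is.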
It’s kind of a huge mess in the end, to solve problems very few users have or care about. On Linux it’s not routine to install kernel mode malware like Vanguard or EAC. We use sandboxing a lot via Flatpak, Docker and the like. You often get your apps from your distro which you trust, or from Flathub which you also trust. The kernel is very rarely compromised, and it’s pretty easy to clean up afterwards too. It’s just not been a problem. Users running malware on Linux is already very rare, so there isn’t enough demand for protection against rogue kernel modules and the like for anyone to be interested in spending the time to implement it.
But as a user armed with a lot of patience, you can make it all work and you’ll be the only one in the world that can get in. Secure boot with systemd-cryptenroll using the TPM is a fairly common setup. If you’re a corporate IT person you can lock down Linux a lot with secure boot, module signing, SELinux policies and restricted executables. The tools are all there for you to do it as a user, and you get to custom tailor it specifically for your environment too! You can remove every single driver and feature you don’t need from the kernel, sign that, and have a massively reduced attack surface. Don’t need modules? Disable runtime module loading entirely. Mount /home noexec. If you really care about security you can make it way, way stronger than Windows with everything enabled and you don’t even need a hypervisor to do that.
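To check how much of that hardening is active on a given machine, something like the following Python sketch could be a starting point. It reads /proc/sys/kernel/modules_disabled, /sys/kernel/security/lockdown and /proc/mounts. The function names are made up for this example, and the whole thing is a quick audit aid under those assumptions, not a complete hardening checker.

```python
from pathlib import Path
from typing import Optional


def mount_options(mounts_text: str, mountpoint: str) -> Optional[set]:
    """Return the option set of the last /proc/mounts entry for mountpoint,
    or None if it is not a separate mount."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mountpoint:
            found = set(fields[3].split(","))
    return found


def active_lockdown_mode(lockdown_text: str) -> Optional[str]:
    """The lockdown file lists all modes and brackets the active one,
    e.g. 'none [integrity] confidentiality'."""
    for word in lockdown_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_text_or_none(path: str) -> Optional[str]:
    """Read a file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


def report() -> dict:
    """Collect the three settings; None means 'could not determine'."""
    modules = read_text_or_none("/proc/sys/kernel/modules_disabled")
    lockdown = read_text_or_none("/sys/kernel/security/lockdown")
    mounts = read_text_or_none("/proc/mounts")
    home = mount_options(mounts, "/home") if mounts is not None else None
    return {
        "modules_disabled": None if modules is None else modules.strip() == "1",
        "lockdown": None if lockdown is None else active_lockdown_mode(lockdown),
        "home_noexec": None if home is None else "noexec" in home,
    }


if __name__ == "__main__":
    for name, value in report().items():
        print(f"{name}: {value}")
```

A value of None for home_noexec usually just means /home is not its own mount, in which case the noexec option would have to go on whatever filesystem contains it.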