I see this as a welcome change, particularly as (opt-in) kernel lockdown support starts to tighten to the point where it is meaningful. MSR writes from userspace should definitely be disabled by default for locked-down kernels.
For flexibility, users should be able to opt out of this (as with the rest of kernel lockdown). Sometimes I just want a holster full of foot-guns. Blanket R/W access to MSRs is that.
There are a lot of fairly powerful MSRs. Some of them might already be restricted for all I know. Here are some examples:
IA32_SYSENTER_EIP -> the instruction pointer after calling SYSENTER. Similar MSRs exist for the syscall instruction.
IA32_DS_AREA/PEBS MSRs -> DS_AREA is a pointer that points to a buffer of pointers that are written to by the perf subsystem. Manipulating DS_AREA to point to a user page should let you write ~arbitrary data to the kernel.
IA32_SPEC_CTRL -> disable some of the speculative execution mitigations.
There are less obviously powerful MSRs as well, which probably let you get away with some sketchiness:
APIC_BASE -> you could try to get it mapped into userspace then control the APIC directly. Or you could hide particular kernel writes by moving the APIC over another page (there was an attack against SMM based on this a while back).
X2APIC MSRs -> no need to move the APIC base when you could just write to the APIC through MSRs more directly.
IA32_EFER -> Unexpected switch from long mode to protected. This can't do anything good.
MTRRs -> Changing memory types underneath the kernel (potentially restricting access to kernel memory) can't do anything good to the kernel. You could also change the caching behavior, which sounds messy.
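For anyone who wants to look these up, here is a minimal sketch of where some of the MSRs named above live. The addresses are the architectural ones from the Intel SDM; the helper names (`decode_efer`, `is_x2apic_msr`) are made up for illustration and are not part of any kernel or library API.

```python
# Architectural MSR addresses (Intel SDM) for some of the registers listed above.
MSR_ADDRESSES = {
    "IA32_APIC_BASE": 0x1B,
    "IA32_SPEC_CTRL": 0x48,
    "IA32_SYSENTER_EIP": 0x176,
    "IA32_MTRR_DEF_TYPE": 0x2FF,
    "IA32_DS_AREA": 0x600,
    "IA32_EFER": 0xC0000080,
    "IA32_LSTAR": 0xC0000082,  # SYSCALL entry point in 64-bit mode
}

# In x2APIC mode the local APIC registers are exposed as MSRs 0x800-0x8FF.
X2APIC_RANGE = range(0x800, 0x900)

# Named flag bits in IA32_EFER.
EFER_BITS = {0: "SCE", 8: "LME", 10: "LMA", 11: "NXE"}


def decode_efer(value):
    """Return the names of the IA32_EFER flag bits set in `value` (hypothetical helper)."""
    return [name for bit, name in sorted(EFER_BITS.items()) if (value >> bit) & 1]


def is_x2apic_msr(address):
    """True if `address` falls in the x2APIC MSR range (hypothetical helper)."""
    return address in X2APIC_RANGE
```

A running 64-bit kernel typically has LME and LMA set in EFER, which is why an unexpected write to that register is so dangerous.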
Edit: Formatting. This is probably still unreadable on mobile.
MSRs are not your ordinary registers, they are low-level registers for configuring major settings on the CPU.
> Think performance counter MSRs, MSRs with sticky or locked bits, MSRs making major system changes like loading microcode, MTRRs, PAT configuration, TSC counter, security mitigations MSRs, you name it.
I think the only legitimate uses of MSRs in userspace today are changing power-management-related settings and performance profiling, which should be managed by the kernel anyway.
I welcome this move. PaX/grsec had an option to block MSR accesses for years.
While I don't know for a fact if MSRs were used in this case, I was at a presentation a few months back where a security researcher (Kit Murdock) used undervolting to get data out of a secure enclave (SGX).
"Plundervolt is an attack on Intel SGX enclaves, which are trusted execution environments. We showed that we can get secrets out of SGX enclaves by lowering the CPU voltage while it is performing calculations. Plundervolt is a problem because SGX has an attacker model that says, “Even if you’re root, you should not be able to look inside my encrypted area.”"
Or AMD. One can argue about specific exploits but there are plenty of serious security folks who recommend turning off lots of performance features if you really want to be safe.
I do wish there was some sort of "1 process per physical core" rule you could apply at the scheduler level to get most of the benefit with fewer risks. Or maybe isolate the handling of untrusted code like browser javascript, pdfs, etc.
Step 3: Enjoy freedom from spectre-related problems!
Optional step 4: Move everything back to Intel/AMD when management realizes why they went for multiple processes per box in the first place. Money rules everything around us.
If I'm willing to give up long pipelines and out of order execution for security I'd at least go for a Project Denver chip or something. No need to give up all the single thread performance, or multiple cores sharing a memory pool.
Or more likely a chip with a ton of ARM A5* cores on it due to availability.
EDIT: Wait, it looks like I could actually get a Drive AGX for $700 on Newegg[1] and they seem to have about 40% the single thread performance of a Ryzen[2]. So actually not out of the question if I was willing to give up more for security than I actually am.
Yes, but that doesn't help if I want a virtual core to only be available to threads belonging to the same process as the thread on the other virtual core. I suppose syscalls make the idea not work anyways.
Yup, I just wrote a blog post about recompiling the msr module and signing it with a MOK, as a way to keep undervolting functional while using SecureBoot.
It's even better - you just need to change your boot parameters.
> This behavior right now can be toggled via the msr.allow_writes= kernel module parameter with on/off/default. Should legitimate use-cases come up where writes to MSRs from user-space are still desired, they may add the infrastructure to selectively grant/deny access to specific MSRs and ensure they are sanitized by the kernel.
Similar hardware restrictions already exist in the kernel. For example, by default the kernel restricts access to I/O memory since it's a dangerous, low-level zone, but if you really need it for some reason (e.g. to reflash your BIOS), you can boot with "iomem=relaxed" to turn the restriction off. Treating MSRs in the same way is very reasonable.
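To check what a given boot is using, a rough sketch like the following can pull the `msr.allow_writes=` value out of a kernel command line string such as the contents of `/proc/cmdline`. The function name is hypothetical, and it assumes that if the parameter appears more than once, the last occurrence is the one that counts.

```python
def msr_allow_writes(cmdline):
    """Return the msr.allow_writes= value from a kernel command line, or None.

    Assumption: when the parameter is repeated, the last occurrence wins.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "msr.allow_writes" and sep:
            value = val
    return value


# Example usage on a live system (not run here):
#   with open("/proc/cmdline") as f:
#       print(msr_allow_writes(f.read()))
```

A result of `None` only means the parameter was not given on the command line, so the kernel's default behavior applies.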
Is there any clarification on how this MSR whitelisting will be exposed to end users? Or is there just going to be the global "allow_writes" parameter? I'd assume too that even with that parameter set to on, programs accessing the MSRs will still need the CAP_SYS_RAWIO capability.
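On the capability point, a process can check whether it holds CAP_SYS_RAWIO by looking at the `CapEff` mask in `/proc/self/status`. A minimal sketch, with hypothetical helper names; the capability number 17 comes from `<linux/capability.h>`:

```python
CAP_SYS_RAWIO = 17  # capability number from <linux/capability.h>


def effective_caps_from_status(status_text):
    """Extract the hex CapEff mask from /proc/<pid>/status text, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return line.split()[1]
    return None


def has_capability(cap_eff_hex, cap):
    """True if bit `cap` is set in the hex capability mask."""
    return bool((int(cap_eff_hex, 16) >> cap) & 1)


# Example usage on a live system (not run here):
#   with open("/proc/self/status") as f:
#       mask = effective_caps_from_status(f.read())
#   print(has_capability(mask, CAP_SYS_RAWIO))
```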
Timestamp counters have legitimate uses; the combination of resolution and access cost is unmatched. I use them when I need timestamps for events which happen at 1MHz or more. On modern CPUs these counters even run at a stable frequency, unaffected by scaling.
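To put numbers on that: events at 1MHz are 1000ns apart, so the timestamp source needs to resolve well below that. A quick arithmetic sketch, assuming a hypothetical constant 3GHz counter frequency (the real rate depends on the CPU):

```python
def ticks_to_ns(ticks, tsc_hz):
    """Convert a tick delta to nanoseconds for a counter running at a constant tsc_hz."""
    return ticks * 1_000_000_000 // tsc_hz


def ticks_between_events(event_hz, tsc_hz):
    """Counter ticks that elapse between two events arriving at event_hz."""
    return tsc_hz // event_hz


ASSUMED_TSC_HZ = 3_000_000_000  # assumption for illustration only
```

With those assumed numbers there are 3000 ticks between consecutive 1MHz events, so each tick is a third of a nanosecond.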
This doesn’t have anything to do with the language the code was written in. Ultimately all software is assembly. This is about hardware-specific registers that control the underlying hardware and can be used for things like controlling OEM functionality (brightness, backlight, etc.), changing CPU frequency and voltage, etc.
> This is about hardware specific registers that control the underlying hardware and can be used for things like controlling oem functionality (brightness, backlight, etc)
That sort of functionality is typically handled by the platform controller by sending commands over LPC or I2C busses. CPU MSRs are generally restricted to controlling the behavior of the CPU itself.
> changing CPU frequency and voltage, etc.
These are often controlled by MSRs, but should be under the direct control of the kernel, not userspace software.
Which also doesn't mean you can't write software that does those things (within reason and as configured by the kernel). It just means that you will have to use the appropriate kernel interfaces, as you already should have been.
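As one example of an "appropriate kernel interface", CPU frequency policy is exposed through the cpufreq sysfs tree rather than raw MSRs. A minimal sketch of reading it; the helper names are hypothetical, and which attributes exist depends on the cpufreq driver in use:

```python
from pathlib import Path


def cpufreq_path(cpu, attribute):
    """Build the sysfs path of a cpufreq attribute for one CPU."""
    return Path(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/{attribute}")


def read_cpufreq(cpu, attribute):
    """Return the attribute's contents, or None if it is missing or unreadable."""
    try:
        return cpufreq_path(cpu, attribute).read_text().strip()
    except OSError:
        return None


# Example usage on a live system (not run here):
#   print(read_cpufreq(0, "scaling_governor"))
#   print(read_cpufreq(0, "scaling_cur_freq"))
```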
My guess is that the CPU will fault if you try to directly write them from a user mode process. You could do that via the MSR kernel driver but it'd be the kernel doing it in kernel mode.
It will. RDMSR and WRMSR are privileged instructions and will raise a general protection fault (#GP) if executed outside of ring 0. The kernel has always had a device-level interface to these instructions for the benefit of (properly authenticated) userspace processes.
But as a design point, this is being rolled back. There are just so many MSRs in the modern world and so many are trivially system-breaking that it's just not possible to sanely administer an interface like that.
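For reference, the device-level interface in question is the msr driver's `/dev/cpu/N/msr` character device: you read 8 bytes at a file offset equal to the MSR address. A read-only sketch, assuming the msr module is loaded and the caller has the needed privileges; `read_msr` is a hypothetical helper and returns None when the device is unavailable:

```python
import os
import struct


def msr_device_path(cpu):
    """Path of the msr character device for one CPU."""
    return f"/dev/cpu/{cpu}/msr"


def read_msr(cpu, address):
    """Read one 64-bit MSR via the msr driver; None if it cannot be read."""
    try:
        fd = os.open(msr_device_path(cpu), os.O_RDONLY)
    except OSError:
        return None  # module not loaded, no permission, or not Linux
    try:
        data = os.pread(fd, 8, address)  # the file offset selects the MSR
    except OSError:
        return None  # the driver reports a faulting RDMSR as an I/O error
    finally:
        os.close(fd)
    if len(data) != 8:
        return None
    return struct.unpack("<Q", data)[0]


# Example usage on a live system (not run here):
#   print(hex(read_msr(0, 0xC0000080)))  # IA32_EFER on CPU 0
```

Note that the actual RDMSR still executes in kernel mode on the caller's behalf, which matches the point made above.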
This is good. Intel won't fix the CPU in my old Thinkpad W701ds, I guess because it's too old now for a firmware upgrade. But the thing is not only more than fast enough for today's use, it also has 32GB RAM and I just got a couple of new Chinese batteries for it. There's no way I'm switching from it; you can't even get a real keyboard in a laptop any more.
If operating systems start to beef up security more aggressively then I won't have to worry so much about having old microcode. OpenBSD is ahead of Linux on this too afaicr, sadly though it doesn't like the Nvidia GPU in my computer.