You could refactor sshd so most network payload processing is delegated to sandboxed sub-processes. Then an RCE there has fewer capabilities to exploit directly. But I think you would have to assume an RCE can cause the sub-process to produce wrong answers. So if the answers are authorization decisions, you can transitively turn those wrong answers into RCE in the normal login or remote command execution context.
But, the normal login or remote command execution is at least audited. And it might have other enforcement of which accounts or programs are permitted. A configuration disallowing root could not be bypassed by the sub-process.
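For instance, a minimal sshd_config sketch (the directives are standard; the account names are made up):

    # enforced by the privileged monitor, not by any sandboxed worker
    PermitRootLogin no
    AllowUsers alice deploy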
You could also decide to run all user logins/commands under some more confined SELinux process context. Then the actual user sessions would be sandboxed compared to the real local root user. Of course, going too far with this may interfere with the desired use cases for SSH.
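On a distro shipping the stock SELinux confined user types, the mapping is a one-liner (user name illustrative):

    # pin logins for this account to the confined staff_u SELinux user
    semanage login -a -s staff_u alice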
That just raises the hurdle for the attacker. The attacker in this case has full control to replace any function within ssh with their own version, and the master process of sshd will always need the ability to fork children that are still root before dropping privileges. I don't see any way around that. They only needed to override one function this time, but if you raise the bar they would just override more functions and still succeed.
In highly safety-critical systems you have software (and hardware) diversity, where multiple pieces of software, developed independently, have to vote on the result. Maybe highly critical pieces of Linux like the login process should be designed the same way, so that two binaries without common dependencies would both need to accept the login for the user to get privileges.
Exactly how to do it (especially transparently for the user), I have no idea, though. Maybe send ssh login requests to two different sshd implementations, and if they don't do the same things (same system calls), they are both killed.
Or some kind of two-step login process where the first login only gives access to the sandbox of the second login process.
But in general I assume the Linux attack surface is too big to do software diversity for all of it.
Or better, just build an ssh without any dependencies: statically compile it, and get rid of libssl, libsystemd, even libpam and libc's nsswitch. (I actually do this for some of my systems.)
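For dropbear that's roughly the following, going from memory, so check the INSTALL notes of your version:

    # build a single static binary with no dynamic deps
    make STATIC=1 PROGRAMS="dropbear"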
> The attacker in this case has full control to replace any function within ssh with their own version
Not true. They have this ability only for binaries that are linked to liblzma. If sshd were to be decomposed into multiple processes, not all of them would (hopefully) depend on all the libraries that the original sshd depended on.
Well, sshd doesn't depend on liblzma in the first place, but Debian and Red Hat thought it would be a good idea to tie it into libsystemd for logging purposes, and patched in support. It's still pretty bad to have systemd compromised, even if sshd weren't, though. Maybe the army of pitchforks should be marching on the systemd camp. It's definitely not OpenBSD's choice of architecture, here.
It wouldn't matter in this case, since the exploit could simply rewrite the function that calls out to the unprivileged process. If you already have malicious code in your privileged parent process there's no way to recover from that.
Tell us all, please, how the starting vector of this attack would affect a statically compiled dropbear binary, even with systemd's libsystemd pwnage? I am very curious about your reasoning.
The fact that the whole reason this library is even being pulled into the sshd daemon process is some stupid stuff like readiness notification, which itself is utterly broken in systemd by design (and thus forever unfixable), makes this even more tragic.
Don't stick your head in the sand just because of the controversial nature of the topic. Systemd was VERY accommodating in this whole fiasco.
Saddest part of all this is that we know how to do better, at least since Bernstein; the OpenBSD and supervision-community (runit/s6) guys solved it. Yet somehow we see the same mistakes repeated again and again.
I.e. you fork and run a little helper to write, or directly write, a single byte(!) to notify the supervisor over a supervisor-provided fd. That even lets you priv-separate your notifier stuff or do all the cute SELinux magic you need.
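A minimal sketch of that style in C, assuming the fd number arrives in a NOTIFY_FD environment variable (that variable is made up for illustration; s6, for instance, takes the number from the service's notification-fd file instead):

    /* Tell the supervisor we're ready: write one newline to the fd
       it gave us, then close it. */
    #include <stdlib.h>
    #include <unistd.h>

    static void notify_supervisor_ready(void) {
        const char *s = getenv("NOTIFY_FD");   /* hypothetical convention */
        if (!s || !*s)
            return;                            /* not supervised; no-op */
        int fd = atoi(s);
        (void)write(fd, "\n", 1);              /* s6 readiness = a newline */
        close(fd);
    }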
But that would be too simple, I guess, so instead we link something like 10 completely unrelated libraries, liblzma among them, into sshd, one of the most crucial processes on the machine. To notify the supervisor that it's ready. Sounds about right, Linux distros (and very specific ones at that).
Sshd should be sacred: libc and some base crypto libs (I don't remember whether it even still needs any ssl) should be all it needs.
Another great spot to break sshd is PAM, which has no business being in there either. Unfortunately it's a hard dependency on most Linux distros.
Maybe sshd should adopt the kernel's taint approach: as soon as any weird libraries (i.e. anything that isn't libc or the crypto libs) are detected in the sshd process, it should consider itself tainted. Maybe even seppuku itself.
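A rough sketch of such a self-check on Linux, scanning /proc/self/maps against an allowlist (the list is illustrative, and obviously code that is already running maliciously in-process can just skip the check; it only catches accidental linkage):

    /* Abort if any mapped shared object isn't on the allowlist. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static const char *allowed[] = { "libc", "libcrypto", "ld-linux", NULL };

    static int is_allowed(const char *path) {
        const char *base = strrchr(path, '/');
        base = base ? base + 1 : path;
        for (int i = 0; allowed[i]; i++)
            if (strncmp(base, allowed[i], strlen(allowed[i])) == 0)
                return 1;
        return 0;
    }

    static void taint_check(void) {
        FILE *f = fopen("/proc/self/maps", "r");
        char line[4096];
        if (!f)
            return;
        while (fgets(line, sizeof line, f)) {
            char *path = strchr(line, '/');
            if (path && strstr(path, ".so") && !is_allowed(path))
                abort();               /* tainted: seppuku */
        }
        fclose(f);
    }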
The exploit would probably still be doable somehow without systemd. But it would be much, much harder.
Don't try to obscure that very fact in the discussion.
The sd-notify protocol is literally "Read socket address from environment variable, write a value to that socket". There's no need to link in libsystemd to achieve this. It's unreasonable to blame systemd for projects that choose to do so. And, in fact, upstream systemd has already changed the behaviour of libsystemd so it only dlopen()s dependencies if the consumer actually calls the relevant entry points - which would render this attack irrelevant.
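For reference, the whole protocol in self-contained C, using only what the documentation specifies (the NOTIFY_SOCKET env var and a unix datagram; a leading '@' means an abstract socket):

    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    /* Send READY=1 to the supervisor, if any. Returns 0 on success
       or when not running under systemd, -1 on error. */
    static int notify_ready(void) {
        const char *path = getenv("NOTIFY_SOCKET");
        if (!path || !*path)
            return 0;                          /* no supervisor */
        size_t len = strlen(path);
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        if (len >= sizeof(addr.sun_path))
            return -1;
        memcpy(addr.sun_path, path, len);
        if (addr.sun_path[0] == '@')
            addr.sun_path[0] = '\0';           /* abstract namespace */
        int fd = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
        if (fd < 0)
            return -1;
        ssize_t r = sendto(fd, "READY=1", 7, 0, (struct sockaddr *)&addr,
                           offsetof(struct sockaddr_un, sun_path) + len);
        close(fd);
        return r < 0 ? -1 : 0;
    }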
> Another great spot to break sshd is PAM, which has no business being in there either. Unfortunately it's a hard dependency on most Linux distros.
There are many things to hate about PAM (it should clearly be a system daemon with all of the modules running out of process), but there's literally no universe where you get to claim that sshd should have nothing to do with PAM - unless you want to plug every single possible authentication mechanism into sshd upstream, you're going to end up with something functionally identical.
That's an easy thing to say after the fact, indeed, but yes. In fact, after such a disastrous backdoor I wouldn't be surprised if OpenSSH moved all code calling external libraries into unprivileged processes to make sure such an attack can never again have such a dramatic effect (an auth bypass would likely still be possible, but that's still way better than a root RCE…).
At this point “All libraries could be malicious” is a threat model that must be considered for something as security critical as OpenSSH.
I don't think that's a threat model that OpenSSH should waste too much time on. Ultimately this is malicious code on the build machine compiling a critical system library. That's not reasonable to defend against.
Keep in mind that upstream didn't even link to liblzma. Debian patched it to do so. OpenSSH should defend against that too?
Any one of us, if we sat on the OpenSSH team, would flip the middle finger. What code is the project supposed to write when nothing in mainline dynamically loaded liblzma? It was brought in by a patch they have no realistic control over.
This is a Linux problem, and the problem is systemd, which is what brought the lib into memory and init'd it.
I think the criticisms of systemd are valid but also tangential. I think Poettering himself is on one of the HN threads saying they didn't need to link to his library to accomplish what they sought to do. Lzma is also linked into a bunch of other critical stuff, including but not limited to distro package managers and the kernel itself, so if they didn't have sshd to compromise, they could have chosen another target.
So no, as Poettering claimed, sshd would not be hit by this bug except for this systemd integration.
I really don't care about "Oh, someone could have written another compromise!". What allowed this compromise was systemd's direct inability to reliably do its job as an init system, necessitating a patch.
And Red Hat, Fedora, Debian, Ubuntu, and endless other distros took this route because something was required, and here we are. Something that would not be required if systemd could actually perform its job as an init system without endless workarounds.
Also see my other reply in this thread, re Redhat's patch.
I just went and read https://bugzilla.redhat.com/show_bug.cgi?id=1381997 and it actually seems to me that sshd's behavior is wrong here. I agree with the s6 school of thought, i.e. that PID files are an abomination and that there should always be a chain of supervision. systemd is capable of doing that just fine. The described sshd behavior (re-execing in the existing daemon and then forking) can only work on a dumb init system that doesn't track child processes. PID files are always a race condition and should never be part of any service detection.
That said, there are dozens of ways to fix this, and it really seems like Red Hat chose the worst one. They could have patched sshd in the other various ways listed in that ticket, or even just patched it to exit on SIGHUP and let systemd re-launch it.
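E.g. a unit along these lines needs no readiness protocol, and hence no libsystemd at all (standard directives; path illustrative):

    # sshd stays in the foreground; systemd supervises it directly
    [Service]
    Type=simple
    ExecStart=/usr/sbin/sshd -D
    Restart=on-failure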
I'm not the type to go out of my way to defend systemd and their design choices. I'm just saying the severity of this scenario of a tainted library transcends some of the legit design criticisms. If you can trojan liblzma you can probably do some serious damage without systemd or sshd.
Of course you can plant a trojan in other ways, but in this thread that can only be said in defense of systemd.
After all, what you're saying is and has always been the case! It's like saying "Well, Ford had a design flaw in the Pinto, and sure, 20 people died, but... like, cars have design flaws from time to time, so an accident like this would've happened eventually anyhow! Oh well!"
It doesn't jibe in this context.
Directly speaking to this point: patched sshd was chosen for a reason. It was the lowest-hanging fruit with the greatest reward. Your speculation about other targets isn't unwarranted, but at the same time it's entirely unvalidated.
Why avoid this? Well, it adds more systemd-specific bits and a new build dependency to something that had always worked well under other inits, without any problems, for years.
They chose the worst solution to a problem that had multiple better solutions, because a pre-existing patch was the easiest path forward. That's exactly what I'm talking about.
It is possible to prevent libraries from patching functions in other libraries; make those VM regions unwritable, don't let anyone make them writable, and adopt PAC or similar hardware protection so the kernel can't overwrite them either.
That's already done (full RELRO), but in this case the attack happened in a glibc ifunc, and those run before that write protection is enabled (since an ifunc resolver has to patch the GOT/PLT).
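To make the mechanism concrete, here's a minimal GCC/glibc ifunc (x86). The resolver runs inside the dynamic loader during relocation, i.e. before full RELRO remaps the GOT read-only, and that early-execution window is exactly what the liblzma payload hooked:

    #include <stdio.h>

    static int impl_generic(void) { return 1; }
    static int impl_avx2(void)    { return 2; }

    /* Executed by ld.so while relocating, not by main(). */
    static int (*resolve_myfunc(void))(void) {
        __builtin_cpu_init();          /* required before cpu_supports */
        return __builtin_cpu_supports("avx2") ? impl_avx2 : impl_generic;
    }

    int myfunc(void) __attribute__((ifunc("resolve_myfunc")));

    int main(void) {
        printf("%d\n", myfunc());      /* dispatches via resolved pointer */
        return 0;
    }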
Sounds like libraries should only get to patch themselves.
(Some difficulty with this one though. For instance you probably have to ban running arbitrary code at load time, but you should do this anyway because it will stop people from writing C++.)
If you're running in the binary you can call mprotect(2), and even if that is blocked you can cause all kinds of mischief. The original motivation for rings of protection on i286 was so that libraries could run in a different ring from the binary (usually library in ring 2 and program in ring 3), using a call gate (a very controlled type of call) to dispatch calls from the binary to the library, which stops the binary from modifying the library and IIRC libraries from touching each other. But x86-64 got rid of the middle rings.
> If you're running in the binary you can call mprotect(2)
Darwin doesn't let you make library regions writable after dyld is finished with them. (Especially iOS where codesigning also prevents almost all other ways to get around this.)
Something like OpenBSD's pledge() can also revoke access to mprotect(2) in general.
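Roughly like this on OpenBSD: once the "prot_exec" promise is dropped, later mmap/mprotect calls can no longer make memory executable (the promise set here is just an example):

    #include <err.h>
    #include <unistd.h>

    static void lock_down(void) {
        /* keep basic I/O and sockets; omitting "prot_exec" forbids
           mapping or re-protecting memory as executable from now on */
        if (pledge("stdio inet", NULL) == -1)
            err(1, "pledge");
    }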
> But x86-64 got rid of the middle rings.
x86 is a particularly insecure architecture but there's no need for things to be that way. That's why I mentioned PAC, which prevents other processes (including the kernel) from forging pointers even if they can write to another process's memory.
Because it's a general-purpose computer. Duh. The aim is to be able to perform arbitrary computations, and overwriting crypto functions in sshd is, by that measure, a valid computation.
I don't think you should connect your general purpose computer to the internet then. Or keep any valuable data on it. Otherwise other people are going to get to perform computations on it.