> *He clearly can separate the blunt stubbornness ("we do not break userland, pe...

achiang · on Sept 17, 2018

> "we do not break userland, period"

That quote is more accurately read as, "we change things all the time, including user-visible features, and very occasionally, even in breaking ways -- but only because it's impossible to know every single consumer of every single quirk in behavior, and as soon as we learn that one of our changes did in fact break userspace, then we'll change it back".

It's how the kernel community attempts to continue cleaning up decades of tech debt while maintaining the contract with userspace. Honestly, sometimes you just don't know until you try.

It's an inefficient process, but it does sound like the right outcome occurred in this case.

signed, a former kernel developer

gmueckl · on Sept 17, 2018

So the quote should rather read "don't break the actual userland out there" instead of "any conceivable userspace"?

lovich · on Sept 17, 2018

No offense, but this seems like blind fanaticism for a man who is, admittedly, right 99.99999% of the time but not perfect.

In this case, however, you are arguing that when he said "we don't break userland, period" he instead meant the opposite? There was no asterisk on his statement saying to read the fine print.

LukeShu · on Sept 17, 2018

What he said was (AFAICT, the exact "we don't break userland, period" wording only appears in this HN thread. Let's go with the rant that I believe the GP was referencing):

The "first rule of kernel maintenance":

    If a change results in user programs breaking, it's a bug in the
    kernel. We never EVER blame the user programs.

Then later, to drive home the point:

    WE DO NOT BREAK USERSPACE!

So, sure, "there was no asterisk on his statement saying to read the fine print"--he did better than an asterisk and fine print, he made it part of the main content.

wolf550e · on Sept 17, 2018

By "userland" Linus meant actual users of the Linux kernel, not all possible theoretical users of the Linux kernel. For example, if there is a program that just uses quirks of implementation to fingerprint the kernel version, it will get broken all the time and will need updating every kernel release, but as long as people's actual servers, phones, desktops and IoT devices keep working without changes, that's not a breaking change.

KenoFischer · on Sept 17, 2018

There's a bit of backstory there, so as a fun historical note, let me recount it here. In 2016, there was the Dirty CoW exploit (CVE-2016-5195), which got fixed by Linus in 19be0eaff [1]. Unfortunately, in that patch he forgot about transparent huge pages (they're easy to forget ;), which caused processes to lock up if you were using FOLL_FORCE on a memory range that happened to be backed by thp. That patch was widely backported to stable kernels, so I soon noticed my processes randomly freezing on various Linux machines. With a bit of debugging, I sent in a patch to fix that (https://lore.kernel.org/patchwork/patch/748110/), though I'm not sure that Linus ever saw it (or I guess he could have assumed we were using ptrace). Nevertheless, I assume that this being a common exploit vector for DirtyCoW made him want to remove it. As a result, that was of course the second time within six months that this stopped working (and Linus' fault both times), so that email was written from a bit of a "you've got to be kidding me" mindset.

As an addendum: in writing this comment, I became aware that my patch to fix FOLL_FORCE with thp actually re-introduced a version of the original DirtyCoW exploit, apparently known as HugeDirtyCoW (CVE-2017-1000405) [2] - or rather, my patch interacted badly with a pre-existing mistake in the thp code and caused this. Oops.

[1] https://github.com/torvalds/linux/commit/19be0eaff

[2] https://medium.com/bindecy/huge-dirty-cow-cve-2017-1000405-1...

LukeShu · on Sept 17, 2018

I understand "do not break userspace" differently than you do.

You seem to understand it as "do not make a change that could hypothetically break userspace." I understand it as "do not make a change that is known to break userspace."

There are hundreds, if not thousands, of little behaviors that a userspace program could hypothetically depend on. In many cases, it would be impossible to change things without breaking observed behavior.

A great many changes fall into the bucket "visibly changes behavior, but it is unlikely that anything depends on that specific behavior." When the assumption that no userspace program is broken turns out to be false, the correct response is to revert that change and avoid breaking userspace (hopefully this happens in an RC, before a stable release). What's incorrect is to insist that the userspace program shouldn't have relied on that behavior.

In the referenced rant ("Mauro, SHUT THE FUCK UP!"), the problem wasn't so much that Mauro had changed the errno used (though Linus did take issue with the new errno); the big issue was that when Mauro learned that the change broke pulseaudio, Mauro tried to argue that it was a bug in pulseaudio for relying on what the errno was.
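To make the errno point concrete, here is a toy sketch of how a userspace program can end up depending on which errno comes back. Everything in it is made up for illustration: the two "kernel" functions, the request name, and the specific errno values are assumptions, not the actual code or values from that thread.

```python
import errno


def old_kernel_ioctl(request):
    """Hypothetical old behavior: an unsupported request fails with EINVAL."""
    raise OSError(errno.EINVAL, "unsupported request")


def new_kernel_ioctl(request):
    """Hypothetical new behavior: the same failure now reports ENOENT."""
    raise OSError(errno.ENOENT, "unsupported request")


def probe_feature(ioctl):
    """Userspace probe that treats EINVAL as 'feature absent, fall back'.

    Any other errno is treated as a hard failure, which is exactly the
    kind of dependency a kernel-side errno change can break.
    """
    try:
        ioctl("QUERY_FEATURE")  # hypothetical request name
        return "supported"
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            return "fallback"
        return "fatal"


print(probe_feature(old_kernel_ioctl))
print(probe_feature(new_kernel_ioctl))
```

The probe works against the old behavior and falls over against the new one, even though both "kernels" are reporting the same underlying condition. Under the rule as I understand it, the fix is to restore the old errno, not to tell the caller it should have been more tolerant.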

yesbabyyes · on Sept 17, 2018

> There are hundreds, if not thousands, of little behaviors that a userspace program could hypothetically depend on. In many cases, it would be impossible to change things without breaking observed behavior.

https://xkcd.com/1172/

flubert · on Sept 17, 2018

Is there an idiomatic phrase for the opposite of "damning with faint praise"? Praising through faint damnation? Maybe the "exception that proves the rule"?

ec109685 · on Sept 17, 2018

I read it as he didn’t think anything relied on those semantics and it turned out that was incorrect.

marmot777 · on Sept 17, 2018

Stubbornness isn't likely to chase away good people. Infantile lack of self-control is.

tptacek · on Sept 17, 2018

¯\_(ツ)_/¯ I defer to you on this.