No-reboot patching comes to Linux 4.0

matteotom · on March 3, 2015

More info here: http://lwn.net/SubscriberLink/634649/d31644d65227ead6/

TL;DR: There are still issues to work out, and working live patching might not make it into 4.0.

> "So, while it is still possible that the kernel will have an essentially complete live-patching feature by the end of the year, it may happen rather closer to the end of the year than the developers involved might have hoped for."

justincormack · on March 3, 2015

Live patching has its serious critics, and it is also nowhere near finished: "I think they are fundamentally misguided in both implementation and in design, which turns them into an (unwilling) extended arm of the security theater." [1]

[1] http://lwn.net/SubscriberLink/634649/f6a2aa1f39c538b7/

azakai · on March 3, 2015

Thanks for the link! Reading that, it looks pretty pessimistic actually. There seems to be nowhere near a clear path forward for how to actually do this.

One concern is that stack traces are "best-effort" and not guaranteed to work. Isn't there a way to force them to work, by disabling some stack-related optimizations (like no-omit-frame-pointer)? I imagine there would be a little slowdown, but maybe worth it for the ability to do quick stacktrace-based live patching?

yungchin · on March 3, 2015

Bizarre to think that at the time nobody stopped Oracle from picking up Ksplice (they offered rebootless security patches for all the larger distros back in what, 2009?). I really loved their service because it meant I could keep my half million or so terminals in gnu screen open for months.

(Also, their blog was great - see eg this really scary proof-of-concept of what you can do when you know how to live-patch a kernel https://blogs.oracle.com/ksplice/entry/hosting_backdoors_in_... )

_wiv7 · on March 3, 2015

Dare I ask how having half a million terminals open in screen is useful in any way?

yungchin · on March 3, 2015

Ok, not really half a million. Maybe 20 or so :) That's still a lot of state to lose for a kernel-upgrade reboot.

I guess it's a style of working. If you need to ask, that probably means you're not the kind of person who cares about the advanced tab-grouping features in firefox either, and you probably also never have so many different paper files open on your desk that you can't find space for your laptop? I'm not sure I can explain, and certainly not defend it - you might call it "organised chaos" :)

In any case, it's all state, and it doesn't all fit in your head. And a reboot wipes it out!

kuschku · on March 3, 2015

Not necessarily. Firefox, for example, always restores the previous state. I tend to clean up everything after I finish a project, both the room IRL and the folders on my system, and then start with an empty space for the next project. Keeps the chaos manageable xD

nine_k · on March 4, 2015

Wow, you can say that a project is finished and requires no maintenance and no bug fixing?

I wish I was that perfect! :)

vidarh · on March 4, 2015

I usually have at least a dozen separate projects going at any time, and many of them have run for years.

I couldn't imagine being able to clean up stuff like that.

btbuildem · on March 3, 2015

Do you know of any decent way to save the layout / history of each term window? Or more generally, the state of X desktop, to be able to reload later / have several "workspaces" set up for different sets of tasks?

labianchin · on March 4, 2015

There is also tmux-resurrect, which you might want to take a look at: https://github.com/tmux-plugins/tmux-resurrect

guiambros · on March 4, 2015

+1 for Tmux Resurrect. Pretty convenient.

nitrogen · on March 4, 2015

Session management has been a part of X desktops for a long time, but a few years back it was hidden from the primary UI and some (many?) apps stopped supporting it. You can still find the options buried somewhere in the system settings for KDE/GNOME/etc. Unfortunately it doesn't save your ssh sessions, but you can use preset tab sets for konsole to help with that.

pyre · on March 3, 2015

If you use tmux, there's tmuxinator that uses a YAML file to describe the initial setup for a tmux session.

ianlevesque · on March 4, 2015

The Mac OS X terminal also saves and restores the window position, working directory, and scroll back of each terminal.

ianlevesque · on March 4, 2015

What approach did ksplice take?

PythonicAlpha · on March 3, 2015

But the kernel is not the only reason a system has to be restarted. What about changed dynamic libraries? It can be a mess to figure out which processes have to be restarted, so a reboot is normally the best solution for critical patches (e.g. in glibc).

hobarrera · on March 3, 2015

I agree. Especially stuff like init, or non-obvious, non-trivial system services (and, of course, in the case of these distros, systemd).

digi_owl · on March 3, 2015

Supposedly systemd has a "restart in place" option. But if it fails you are staring at a reboot anyways.

heinrich5991 · on March 3, 2015

Hasn't failed for me so far.

NeutronBoy · on March 4, 2015

Perhaps, but the fact that it could fail means that you have to be prepared for a server outage when you attempt it anyway. In which case... the utility of 'rebootless patching' goes down a lot, because why not just reboot it?

eru · on March 4, 2015

Because there's less downtime in the vast majority of cases?

ams6110 · on March 4, 2015

One reason to love Linux on your servers or in your data center was that you so seldom needed to reboot it. True, critical patches required a reboot, but you could go months without rebooting.

Doesn't seem so lately. I haven't actually plotted it but seems like kernel security updates drop every few weeks these days.

riffraff · on March 4, 2015

I wonder if it might be an issue with recent kernels (in the sense of not being out for a while).

I recall running some form of "stable" Red Hat without rebooting for a very long time years ago, while my Ubuntu LTS from last year has a "System restart required" MOTD every other week.

wumpus · on March 3, 2015

Is there any solution for processes using obsolete libraries? It's a pain to figure out how to restart all of them without rebooting.

lsof | grep lib | egrep '(DEL|deleted\))'
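
For anyone who would rather script this than eyeball lsof output, here is a rough Python sketch of the same idea. It assumes a Linux /proc where a mapping whose backing file was replaced shows up in /proc/<pid>/maps with a " (deleted)" suffix. The function names and the ".so" filter are my own choices for illustration, not part of any existing tool, and it only detects stale libraries. It does not restart anything.

```python
import os

DELETED_SUFFIX = " (deleted)"

def deleted_libs(maps_text):
    """Return shared-library paths marked '(deleted)' in one /proc/<pid>/maps dump."""
    found = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]; the path may contain spaces.
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue  # anonymous mapping, no pathname
        path = parts[5]
        # Assumption: anything with ".so" in its name counts as a library.
        if path.endswith(DELETED_SUFFIX) and ".so" in path:
            found.add(path[: -len(DELETED_SUFFIX)])
    return found

def scan_proc(proc_root="/proc"):
    """Map pid -> set of deleted libraries that the process still has mapped."""
    result = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return result  # no /proc here (not Linux)
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "maps")) as fh:
                libs = deleted_libs(fh.read())
        except OSError:
            continue  # process exited, or we lack permission to read it
        if libs:
            result[int(entry)] = libs
    return result
```

Run scan_proc() as root to see every process. As an unprivileged user it silently skips processes it cannot read.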

eudemo · on March 3, 2015

If you are on Debian, you could try checkrestart

wumpus · on March 3, 2015

Ah, I'm on CentOS (RHEL), which seems to lack an equivalent.

EDIT: Note that the key feature is assistance in restarting! Which the debian gizmo does. Just detecting can be done by the lsof pipe I posted.

benjarrell · on March 3, 2015

RHEL 7 has 'needs-restarting' FWIW

ciupicri · on March 3, 2015

Run needs-restarting on Fedora.

See also https://rwmj.wordpress.com/2014/07/10/which-services-need-re...

bradfordboyle · on March 3, 2015

You could check out `needrestart`---I've used this on Ubuntu and Debian, so I can't vouch for other distros.

https://github.com/liske/needrestart

drzaiusapelord · on March 3, 2015

What does this mean on the VPS front? Will EC2, Rackspace, and Linode be able to patch Xen or KVM without rebooting all their clients now?

dezgeg · on March 4, 2015

This support is for the Linux kernel only. So if the Xen hypervisor itself needs to be updated, it's reboot time.

rodgerd · on March 3, 2015

My understanding of all the kernel splicing technologies is that it will depend heavily on the nature of the patch.

q2 · on March 3, 2015

I also feel the same.

If the patch touches executables that are in use, then it has to warn admins about the usage and apply the patch later.

Another concern: the system may start patching while a user starts related applications, which could make the whole system unstable and unusable.

I guess users need to be blocked before patching starts, until it has completed successfully.

Thaxll · on March 3, 2015

Not really since the libraries remain affected.

twic · on March 7, 2015

This seems like a bit of a smell. The desire for no-reboot patching stems from the fact that reboots are painful. But since hardware and software failure is an inevitable fact of life, it seems like it would be better to make reboots less painful, rather than try to make them less frequent.

This is the idea underlying crash-only software:

http://lwn.net/Articles/191059/

bigbugbag · on March 4, 2015

Ironic that while the kernel gains the ability to patch without rebooting, systemd at the same time makes it mandatory to reboot after each update.

the_mitsuhiko · on March 4, 2015

Why does systemd make a reboot necessary?

nilved · on March 4, 2015

It doesn't, GP has forgotten about `systemctl daemon-reexec` and `kill 1`.

seccess · on March 3, 2015

This seems like it will be really useful for servers---I wonder how much it will matter for desktops though. Logging out and logging back in seems like a requirement for xorg, groups, etc. While it isn't a reboot, it still kills my windows and thus my workflow.

atonse · on March 3, 2015

I've always wondered what something like CoreOS uses to "switch" kernels in a live system - is that also something like ksplice or a different technology?

Titanous · on March 3, 2015

I believe that CoreOS can use kexec to update the kernel. kexec basically looks the same as a reboot except instead of stopping fully, and going through the BIOS/bootloader again (which can take a long time on server-grade hardware), the old kernel 'execs' the new kernel and boots userspace back up.

https://en.wikipedia.org/wiki/Kexec
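
To make the sequence concrete, here is a small Python sketch that only prints the two commands a kexec-based switch typically involves on a systemd machine. It says nothing about how CoreOS actually does it. The kernel and initrd paths are placeholders, and kexec_plan is a made-up helper name. Nothing is executed.

```python
import shlex

def kexec_plan(kernel, initrd, reuse_cmdline=True):
    """Return the commands for a kexec-based kernel switch as argv lists (not executed)."""
    load = ["kexec", "-l", kernel, "--initrd=" + initrd]
    if reuse_cmdline:
        # Boot the new kernel with the command line of the running one.
        load.append("--reuse-cmdline")
    # `systemctl kexec` shuts userspace down cleanly, then jumps into the loaded kernel.
    return [load, ["systemctl", "kexec"]]

# Placeholder paths, purely for illustration.
for argv in kexec_plan("/boot/vmlinuz-NEW", "/boot/initrd.img-NEW"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Userspace still restarts from scratch, so this saves the firmware and bootloader time but does not keep any process state.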

rckclmbr · on March 3, 2015

How do you mean? I thought CoreOS required reboot https://coreos.com/docs/cluster-management/setup/update-stra...

jonalmeida · on March 3, 2015

I remember talking to a rep at Morgan Stanley who explained to me that you can apply patches to kdb as well, so that you have zero downtime. I believe that was implemented a bit differently, though: you have a compiled program, and the patch is read by an interpreter. I didn't get around to learning how the interpreted patches would eventually become part of the compiled binary.

xroche · on March 4, 2015

For those interested, here's a very nice presentation on kGraft by Jiri Slaby (Suse): https://kernel-recipes.org/fr/2014/patcher-le-noyau-en-temps...

Note: the page is in French, but both the presentation and the slides are in English.

dantillberg · on March 3, 2015

This article's title really ought to include a hyphen: "No reboot patching" reads very differently from "No-reboot patching". The former suggests that this feature ("reboot patching") did not make it into Linux 4.0, while the latter accurately suggests that the feature called "no-reboot patching" is (edit: or may be) included in Linux 4.0.

sctb · on March 3, 2015

Thanks, we updated the title.