I would say that you are using them incorrectly if you treat them as recoverable. You should do everything you can so that they never happen.
However, since it is still possible to have them in a place where exiting the process is not okay, it was beneficial to add a way to recover from them. It does not mean that they are designed to be recoverable.
> this is where most of its overhead comes from
Overhead comes from the cleanup (unwinding) process. If you don't clean up properly, you might leak information or allocate more resources than you should.
They are still designed to be recoverable as a fundamental aspect of Rust, so that e.g. your web server doesn't fall over just because one request handler panics.
And it is still fundamentally required that any code is sound in the context of panic recovery. (`UnwindSafe` is misleadingly named and is actually not about safety: everything has to be sound across an unwind regardless; `UnwindSafe` just indicates that behavior is guaranteed to be sensible, not merely sound.)
People over-obsessing with the idea that panics should be unrecoverable is currently, IMHO, one of the biggest problems in Rust; it isn't in line with any of its original design goals, with how panics were in the end designed to work, or with how Rust is used in many places in production.
Yes, they are not "exceptions" in the sense that you aren't supposed to have fine-grained recoverability, but they are recoverable anyway.
Without recoverable panics you wouldn't be able to write (web and similar) servers in Rust in a reasonably robust way without implementing some kind of CGI-like pattern, which would be really dumb IMHO; and servers are one of the more widely used Rust use cases.
As long as you don't do anything unusual, you are basically introducing a (potentially huge) availability risk to your service for no reason except not liking panics.
It's now enough for a single subtle bug that causes a panic to become a potentially trivially exploitable and cheap DoS attack vector. Worse, this might even happen accidentally, turning a situation where some endpoints are unavailable due to a bug into one where your servers are constantly crashing.
Sure, you might gain some performance, but for many use cases that gain is too small to justify the decision.
Now, if you only have very short-lived calls, not too many parallel calls at any point in time, and you anyway scale across many very small nodes, it might not matter that you kill other in-flight requests; but once that isn't the case, it is most of the time a bad decision.
It also doesn't really add security benefits, except maybe if you have very complicated in-memory state or similar that isn't shared across multiple nodes through some form of DB (and if it is shared, you have state crossing panics anyway, or in your case, service restarts).
Well, no, it's not just not liking panics: it's that panics can leave memory in an inconsistent state. `std::sync::Mutex` does poisoning, but many other mutexes don't. And beyond that, you could also panic in the middle of operating on an `&mut T`, while its state is invalid.
Tearing down the entire process tends to be a pretty safe alternative.
Yes, and that's most of the time the better design decision.
> operating on an &mut T, while state is invalid.
But you normally don't, and for many use cases this is, in my experience, a non-issue.
Panic recovery isn't fine-grained, and passing `&mut T` data across recovery boundaries is bothersome anyway.
At the same time, most cross-request-handler in-memory state is stuff like caches and similar, which, as long as you don't hand-roll them, should work just fine with panic recovery.
And deciding not to recover some kinds of shared state, and to recreate them instead, isn't hard at all, at least assuming no spaghetti code with tons of `static` globals.
And sure, there are some use cases where you might prefer to tear down the process, but most web-server use cases aren't anywhere close to that in my experience. I can't remember having had a single bug because of panic recovery in the last ~8(?) years of using it in production (yes, a company I previously worked for started using Rust in production around its 1.0 release).
EDIT: Actually, I correct myself: 2 bugs. One due to Mutex poisoning, which wouldn't have been a problem if the Mutex hadn't had poisoning. And another due to a certain SQL library refusing to fix a long-standing bug: it insisted on reexporting an old version of a crate known to be broken (the bug had been fixed upstream, but they didn't like how it was fixed), while neither documenting that you can't use it with panic recovery nor keeping related issues open, closing them with "if you use panics you are dumb".