You are saving one register at the cost of having a thread-local variable that is visible to signal handlers, so none of its uses can be optimized away. That results in things like gcc having to decorate every math operation it would otherwise inline as a single instruction (sqrt, for example) with code to set errno on the off chance that someone somewhere might read it (no one ever does). The only way out is to build with -fno-math-errno.