sched_yield() isn't being used for synchronization, it's to yield the calling th...

gpderetta · on Jan 1, 2020

If a spin lock ends up in a call to sched_yield, it means that it is already outside its intended use case (uncontended or nearly uncontended), so it is a bit pointless to optimize this path or expect any sane scheduling behaviour. If contention is probable, it is better to fall back on a futex after a few[1] busy spins.

This is what pthread_mutex on glibc does by default (as does pretty much every decent OS-provided mutex).

[1] How many spins count as 'a few' is, of course, often application-dependent.

devit · on Jan 1, 2020

You must not use sched_yield() and also must not spin indefinitely.

Instead, one should optionally spin for up to a constant number of iterations and then call sys_futex.
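A minimal sketch of that spin-then-futex pattern, assuming Linux and C11 atomics. The three-state lock word, the `sfl_*` names, `SPIN_LIMIT`, and the `sfl_demo` helper are all illustrative assumptions, not anything taken from glibc or from the article, and the spin count would need to be measured per application.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Lock word: 0 = free, 1 = locked, 2 = locked and may have sleeping waiters. */
typedef struct { _Atomic uint32_t state; } spin_futex_lock;

enum { SPIN_LIMIT = 100 }; /* hypothetical value; tune per application */

/* Thin wrapper over the raw futex system call (no timeout, no second address). */
static long futex(_Atomic uint32_t *addr, int op, uint32_t val) {
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void sfl_lock(spin_futex_lock *l) {
    /* Phase 1: bounded busy spin, cheap when the lock is (nearly) uncontended. */
    for (int i = 0; i < SPIN_LIMIT; i++) {
        uint32_t expected = 0;
        if (atomic_compare_exchange_weak_explicit(&l->state, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return;
    }
    /* Phase 2: mark the lock contended and sleep until an unlock wakes us.
       FUTEX_WAIT returns immediately if the word is no longer 2. */
    while (atomic_exchange_explicit(&l->state, 2, memory_order_acquire) != 0)
        futex(&l->state, FUTEX_WAIT_PRIVATE, 2);
}

static void sfl_unlock(spin_futex_lock *l) {
    /* Checking for waiters is the extra cost compared with a bare store. */
    if (atomic_exchange_explicit(&l->state, 0, memory_order_release) == 2)
        futex(&l->state, FUTEX_WAKE_PRIVATE, 1);
}

/* Demo scaffolding (hypothetical): threads increment a shared counter. */
static spin_futex_lock demo_lock;
static long demo_counter;

static void *demo_worker(void *arg) {
    int iters = *(int *)arg;
    for (int i = 0; i < iters; i++) {
        sfl_lock(&demo_lock);
        demo_counter++;
        sfl_unlock(&demo_lock);
    }
    return NULL;
}

static long sfl_demo(int nthreads, int iters) {
    pthread_t t[16];
    assert(nthreads > 0 && nthreads <= 16);
    demo_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&t[i], NULL, demo_worker, &iters);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(t[i], NULL);
    return demo_counter;
}
```

A thread that had to sleep takes the lock with the word set to 2, so its unlock may issue a wake-up that nobody needs. That is the conservative side of this scheme: it trades an occasional spurious system call for never losing a wake-up.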

newnewpdro · on Jan 2, 2020

That's how you implement a userspace mutex in Linux, not a spinlock.

gpderetta · on Jan 2, 2020

Sure, but sched_yield is strictly worse than using a futex (except that you'll need to check for waiters on unlock, which makes it slightly more expensive).

newnewpdro · on Jan 2, 2020

Ok, so you're arguing that userspace should use mutexes and not spinlocks.

Which I agree with, most of the time userspace spinlocks don't fit well.

But TFA is clearly comparing the two, and observing a variety of spinlock implementations with sched_yield() demonstrating an interesting positive effect on the spinlocks as tested.

gpderetta · on Jan 2, 2020

Actually no, I wouldn't make that claim; while a futex-based adaptive mutex is a very good default, spinlocks can still be appropriate for some applications.

What I'm saying is that if your use case is such that you expect enough contention to consider using TATAS (which is actually a pessimization in the uncontended case) and look into optimizing sched_yield, a spinlock is probably not appropriate in the first place.

Edit: hence a spinlock shouldn't bother with yield and should just do a tight xchg spin. (I haven't measured it in a while, but I've heard rumors that pause can severely harm acquire latency on very recent CPUs, as it will quickly put them in a deeper power-saving mode than in the past.)
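For contrast, a rough sketch of the two spin strategies mentioned here, assuming C11 atomics. `xchg_lock` spins directly on the exchange, while `tatas_lock` is one common formulation of test-and-test-and-set, where the extra load before the exchange is presumably the uncontended-case cost being referred to. All names, including the `spin_demo` helper, are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { atomic_bool locked; } spinlock;
typedef void (*lock_fn)(spinlock *);

/* Plain test-and-set: spin on the exchange itself, with no pause and no
   sched_yield. Uncontended acquire costs a single atomic read-modify-write. */
static void xchg_lock(spinlock *l) {
    while (atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
        ; /* tight spin */
}

/* TATAS: read the flag first and attempt the exchange only when it looks free.
   Waiters spin on a plain load, but every acquire pays for that load. */
static void tatas_lock(spinlock *l) {
    for (;;) {
        if (!atomic_load_explicit(&l->locked, memory_order_relaxed) &&
            !atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
            return;
    }
}

static void spin_unlock(spinlock *l) {
    atomic_store_explicit(&l->locked, false, memory_order_release);
}

/* Demo scaffolding (hypothetical): threads increment a shared counter. */
struct spin_job { lock_fn lock; int iters; };
static spinlock job_lock;
static long job_counter;

static void *spin_worker(void *arg) {
    struct spin_job *job = arg;
    for (int i = 0; i < job->iters; i++) {
        job->lock(&job_lock);
        job_counter++;
        spin_unlock(&job_lock);
    }
    return NULL;
}

static long spin_demo(lock_fn lock, int nthreads, int iters) {
    pthread_t t[16];
    struct spin_job job = { lock, iters };
    assert(nthreads > 0 && nthreads <= 16);
    job_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&t[i], NULL, spin_worker, &job);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(t[i], NULL);
    return job_counter;
}
```

Both variants give the same mutual exclusion; they differ only in the memory traffic they generate, so any comparison between them has to come from measurement on the target hardware.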