
In my multithreaded ring buffer and barrier, I lowered latency from tens or hundreds of thousands of nanoseconds to under 100 ns by aligning the shared data to 128 bytes to stop false sharing, and by pinning threads to an even-numbered core. I think hyperthreading interferes with things.
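To make the alignment part concrete, here is a minimal sketch of what padding ring buffer indices to 128 bytes could look like. The `RingIndices` struct and the `kPad` constant are hypothetical names for illustration, not the commenter's actual code, and 128 is simply the figure quoted above; the right value depends on the hardware.

```cpp
#include <atomic>
#include <cstddef>

// Assumption: 128 bytes, the figure from the comment above. The real
// cache line size (and any adjacent-line prefetching) is hardware-dependent.
constexpr std::size_t kPad = 128;

// Hypothetical index layout for a single-producer/single-consumer ring buffer.
// Each index gets its own 128-byte block, so a write to `head` by the producer
// does not invalidate the line the consumer uses for `tail`.
struct RingIndices {
    alignas(kPad) std::atomic<std::size_t> head{0};  // written by the producer
    alignas(kPad) std::atomic<std::size_t> tail{0};  // written by the consumer
};

// Compile-time checks that the layout is what we intended.
static_assert(alignof(RingIndices) == kPad, "struct starts on a 128-byte boundary");
static_assert(sizeof(RingIndices) == 2 * kPad, "head and tail occupy separate 128-byte blocks");
```

Without the `alignas`, both indices would typically land in the same cache line, and every update by one thread would force the other thread to re-fetch it.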

There is a core-to-core latency visualiser here:

https://github.com/andportnoy/core-to-core-latency




"Hyperthreading" (Intel's name for simultaneous multithreading; AMD just calls it SMT), to my understanding, works by having multiple instruction streams share a pipeline in a superscalar processor. So if you have 2 processes running on the same core that depend on each other, you stall the pipeline more, because more of the instructions in flight are waiting on each other.

That being said, because hyperthreaded workloads share a pipeline and a cache, memory-constrained applications might benefit from pinning pairs of processes to the 2 logical cores on the same physical core, provided the workload is highly queue-like and each instruction stream processes the data in a similar number of instructions.
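Which logical cores share a physical core varies by machine, so "even-numbered" is not a safe rule everywhere. As a rough, Linux-only sketch: the kernel exposes the sibling list in sysfs, and `sched_setaffinity` with a pid of 0 pins the calling thread. The helper names `parse_cpu_list`, `smt_siblings` and `pin_current_thread` are made up for this example.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <sched.h>  // sched_setaffinity, cpu_set_t (Linux/glibc)

// Parse a Linux CPU list such as "0,4" or "0-1" into individual CPU numbers.
std::vector<int> parse_cpu_list(const std::string& text) {
    std::vector<int> cpus;
    std::stringstream ss(text);
    std::string item;
    while (std::getline(ss, item, ',')) {
        if (item.empty()) continue;
        const auto dash = item.find('-');
        const int first = std::stoi(item.substr(0, dash));
        const int last = (dash == std::string::npos)
                             ? first
                             : std::stoi(item.substr(dash + 1));
        for (int c = first; c <= last; ++c) cpus.push_back(c);
    }
    return cpus;
}

// Logical CPUs that share a physical core with `cpu`, read from sysfs.
// Returns an empty vector if the file is missing or unreadable.
std::vector<int> smt_siblings(int cpu) {
    std::ifstream f("/sys/devices/system/cpu/cpu" + std::to_string(cpu) +
                    "/topology/thread_siblings_list");
    std::string line;
    std::getline(f, line);
    return parse_cpu_list(line);
}

// Pin the calling thread to one logical CPU. Returns true on success.
bool pin_current_thread(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;  // pid 0 = calling thread
}
```

To try the pairing idea, a producer thread could call `pin_current_thread` with the first entry of `smt_siblings(0)` and the consumer with the second; to avoid sharing a core, pick CPUs from different sibling lists instead.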


Why? Just disable HT in the BIOS or boot Linux with the nosmt kernel parameter.



