
Perhaps an editor could mark this (2016) despite this being the opposite of the usual custom? I've learned not to get excited when someone posts a Dan Luu article, since it's usually something old that I've already seen. But despite the lack of a date at the top, and despite starting with references to 2010 and 2008 papers, this one is actually new!

On further thought, maybe it would be better to change the title to claim it's from (2010), wait for enough people to complain, then "something", then use that momentum to convince Dan to finally put dates on his articles. Just need to figure out what that "something" should be...

--

I looked into Thread Sanitizer (libtsan) recently, and was happy to see that it's supported on recent GCC as well. Documentation is a little strange, as it's split between a Google Wiki on Github and Clang, while the source is in LLVM:

https://github.com/google/sanitizers/wiki/ThreadSanitizerCpp...

http://clang.llvm.org/docs/ThreadSanitizer.html

https://llvm.org/svn/llvm-project/llvm/trunk/lib/Transforms/...

I was spooked by this FAQ on the Google Wiki page, though:

  Q: My code with C++ exceptions does not work with tsan. 
  A: Tsan does not support C++ exceptions.
Does this mean that it does not work at all on code that is written with exceptions, or that it might have false-positives or false-negatives when exceptions actually happen at runtime?

--

For other tools, Intel offers its "Parallel Inspector": https://software.intel.com/en-us/intel-inspector-xe. I haven't tried it, but it sounds like it would be useful for these issues: https://software.intel.com/en-us/get-started-with-inspector. Does anyone know how it compares with TSan?

--

  An example of an atomicity violation is this bug from MySQL:
  Thread 1:
    if (thd->proc_info)
      fputs(thd->proc_info, ...)
  Thread 2:
    thd->proc_info = NULL;
While definitely a concurrency bug, I'm surprised that this would happen frequently enough to create numerous bug reports unless there is also an undesired compiler optimization that's removing the "guard" in Thread 1. That is, the window of opportunity seems very small if the code is being executed as written. I didn't look at the details of the linked bug reports, but I suspect the compiler is able to reason based on something earlier that thd->proc_info must be non-null at this point, and thus has omitted the check.
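To make the check-then-use window concrete, here is a minimal C++ sketch of one way to close it: read the pointer once into a local and use only that copy. The `Thd` struct, `describe`, `set_proc_info`, and `clear_proc_info` are hypothetical names for illustration, not MySQL's actual code, and the sketch assumes the strings stored in `proc_info` have static lifetime (for example string literals), so only the pointer itself is racing.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical stand-in for the thread descriptor in the quoted bug.
// Assumption: proc_info only ever points at strings with static lifetime.
struct Thd {
    std::atomic<const char*> proc_info{nullptr};
};

// The racy version reads thd->proc_info twice (once for the check, once
// for the use), so another thread can null it in between. Here the pointer
// is loaded exactly once and only the local copy is checked and used.
std::string describe(const Thd& thd) {
    const char* info = thd.proc_info.load(std::memory_order_acquire);
    if (info != nullptr) {
        return std::string(info);
    }
    return std::string("(none)");
}

// Writer side: publish a new status string.
void set_proc_info(Thd& thd, const char* info) {
    assert(info != nullptr);
    thd.proc_info.store(info, std::memory_order_release);
}

// Writer side: the equivalent of "thd->proc_info = NULL;" in Thread 2.
void clear_proc_info(Thd& thd) {
    thd.proc_info.store(nullptr, std::memory_order_release);
}
```

Making the field atomic also stops the compiler from assuming the value cannot change between the check and the use, which is the optimization concern raised above. If the pointed-to strings could be freed by the writer, this would not be enough and a lock or other lifetime scheme would be needed.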

If this is the case, it's possible that "Stack" would have caught this bug as well, or at least highlighted it as a place where the generated code was different than the programmer's intent. Stack is painful to install, and seems abandoned, but does flag some bugs that other tools miss: https://github.com/xiw/stack/

--

Does anyone know of other tools in this space? I'm still hoping there's a "silver bullet" I haven't found yet.




> While definitely a concurrency bug, I'm surprised that this would happen frequently enough to create numerous bug reports unless there is also an undesired compiler optimization that's removing the "guard" in Thread 1.

Sometimes there's some external factor that causes two threads executing on two different CPU cores to be in lockstep. I've seen a case that had over 50% chance of happening, even though the chances should have been less than 1 per million.

For example logging can cause unintended synchronization. Also kernel device drivers can cause surprising synchronization. And probably a lot of other non-obvious things.


The only effective tool I have found is Rust, a correctness checker for race conditions.


The first, practical one was Concurrent Pascal (1975) used in a number of OS's:

http://brinch-hansen.net/papers/

Later, Eiffel's SCOOP model in the '90s was immune to races for a long time, with researchers doing mods for better speed, deadlock detection, livelock detection, etc. It was ported to Java at one point. The research page in the link below shows they're probably still the top players in this, given the steady stream of results.

https://en.wikipedia.org/wiki/SCOOP_(software)

It works in combination with Eiffel's Design-by-Contract, which can knock out the semantic errors he mentions:

https://www.eiffel.com/values/design-by-contract/introductio...

Ada's Ravenscar also did safe concurrency. Ada 2012 and SPARK have Design-by-Contract with SPARK also proving absence of common errors in code automatically. Cyclone was a C variant that used region-based memory management and analysis to show absence of dangling pointers, etc. Rust improved on that with a better language, dynamic safety, and race-free concurrency.

So, there's been stuff resistant to concurrency problems for quite a while among people using safer languages. Rust is just the latest and most open.


Much of Cyclone was an inspiration for Rust. Digging through the Hansen papers. Thank you.


Note that Rust protects you from data races, but you can still run into problems like deadlocks.
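As an illustration of the distinction, here is a small sketch in C++ (to match the snippet quoted earlier in the thread) of code where every shared access is under a lock, so there is no data race, yet a deadlock would still be possible if each thread took the two locks one at a time in opposite order. The same lock-ordering hazard applies to mutexes in Rust. The `Account` type and function names are hypothetical; the sketch uses C++17's `std::scoped_lock`, which acquires several mutexes with a deadlock-avoidance algorithm.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical example type: a balance guarded by its own mutex.
struct Account {
    std::mutex m;
    long balance = 0;
};

// Locking from.m and then to.m separately would be race-free but could
// deadlock when two threads transfer in opposite directions. Taking both
// mutexes through one std::scoped_lock avoids that ordering problem.
void transfer(Account& from, Account& to, long amount) {
    assert(&from != &to);  // locking the same mutex twice is not allowed
    std::scoped_lock lock(from.m, to.m);
    from.balance -= amount;
    to.balance += amount;
}

// Runs two threads that transfer in opposite directions and returns the
// combined balance, which the transfers should leave unchanged.
long run_opposing_transfers(int rounds) {
    Account a;
    Account b;
    a.balance = 1000;
    b.balance = 1000;
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance;
}
```

Building with threads enabled (for example `-pthread` on GCC or Clang) is assumed.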


I've used Thread Sanitizer on a code base with exceptions (a while ago). It worked.



