Threads should always be an absolute last resort. Programmers working on concurrency, locking, etc. are often solving self-inflicted problems. I don't know why they do it.
edit:
I've mentioned it in the past, and I do hate harping on about mibbit ;) but I had a similar experience there - as you can imagine, a large number of DNS lookups (both forward and reverse) are required. The first version used a single separate thread to do them, and needed lots of synchronization in the callback. The second version had a thread pool, which was even more complex. Still slow as hell.
The final version, the one in use now, uses async NIO in the main network thread, and is as fast as fast can be. The code is far simpler and faster than the threaded version, and handles thousands of lookups with a minimum of fuss. There's just an initial investment in writing the DNS packet + UDP code (not rocket science).
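For anyone curious what that pattern looks like, here's a rough sketch in Python asyncio rather than Java NIO (the 8.8.8.8 resolver address, the 5-second timeout, and skipping answer parsing are all my simplifications - real code would parse the answer section, handle compression, retries, etc.):

    import asyncio
    import struct

    def build_query(hostname, query_id):
        # Header: ID, flags (recursion desired), 1 question, 0 other records
        header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
        # Question: length-prefixed labels, QTYPE=A (1), QCLASS=IN (1)
        qname = b"".join(bytes([len(p)]) + p.encode() for p in hostname.split("."))
        return header + qname + b"\x00" + struct.pack("!HH", 1, 1)

    class DnsClient(asyncio.DatagramProtocol):
        def __init__(self):
            self.pending = {}        # query id -> future awaiting the raw reply
            self.transport = None

        def connection_made(self, transport):
            self.transport = transport

        def datagram_received(self, data, addr):
            # Match the reply to its query by ID; no locks, no cross-thread
            # callbacks - everything happens on the one event loop
            query_id = struct.unpack("!H", data[:2])[0]
            fut = self.pending.pop(query_id, None)
            if fut and not fut.done():
                fut.set_result(data)

        async def lookup(self, hostname, query_id):
            fut = asyncio.get_running_loop().create_future()
            self.pending[query_id] = fut
            self.transport.sendto(build_query(hostname, query_id))
            return await asyncio.wait_for(fut, timeout=5.0)

    async def main():
        loop = asyncio.get_running_loop()
        _, client = await loop.create_datagram_endpoint(
            DnsClient, remote_addr=("8.8.8.8", 53))
        # Thousands of lookups can be in flight at once on this single loop
        replies = await asyncio.gather(
            client.lookup("example.com", 1), client.lookup("example.org", 2))
        for reply in replies:
            print(len(reply), "byte reply")

    asyncio.run(main())

All the 'concurrency' lives in one dict of pending futures, owned by one thread.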
At first glance, threads often seem like the 'easy' option. Often they're not. In my case, it was far easier to knuckle down and write a simple async DNS client than to constantly mess around with concurrency issues while trying to get it to go fast.
The code in this article looks horrible to work with, and reminds me of how my first version looked :/
The problem, in my mind, is that many languages provide 'transparent' shared state between the threads - which can easily introduce very subtle bugs.
The simple solution is to require any sharing of state to be made explicit (e.g., message passing). This obviously introduces some overhead, but I think it could eventually become like assembler vs. higher-level languages. A handful of very talented hackers will always be able to beat any threading abstraction by manually dealing with everything - but the abstractions will get more and more optimizations (e.g., copy-on-write variables, more static analysis) until you rarely bother looking underneath them.
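To make 'explicit' concrete, a minimal Python sketch - note that in Python the queue still passes references, so the sender keeping its hands off a sent message is a convention here, not something the runtime enforces:

    import threading
    import queue

    # All sharing is explicit: the only channel between the two threads is
    # this queue. Messages are the sole point of contact, so nothing to lock.
    inbox = queue.Queue()

    def worker():
        while True:
            msg = inbox.get()
            if msg is None:          # sentinel: shut down
                break
            print("worker got:", msg)

    t = threading.Thread(target=worker)
    t.start()
    for i in range(3):
        inbox.put({"job": i})        # by convention, ownership moves with the message
    inbox.put(None)
    t.join()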
Either that, or only use co-operative (not pre-emptive) multitasking via coroutines. Specifying the points at which control is allowed to switch from one to another is far easier than trying to reliably specify where it cannot and using locks.
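A small illustration of why that's easier, using Python's asyncio as a stand-in for any coroutine system:

    import asyncio

    counter = 0

    async def bump(n):
        global counter
        for _ in range(n):
            # No lock needed: control can only switch at an explicit await,
            # so this read-modify-write can't interleave with another task
            counter += 1
            await asyncio.sleep(0)   # the one point where a switch may happen

    async def main():
        await asyncio.gather(bump(1000), bump(1000))
        print(counter)               # always 2000

    asyncio.run(main())

With pre-emptive threads, the same counter would need a lock, because the switch could land between the read and the write.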
Message-passing works quite well, but coroutines are also a viable option. The problems come from using pre-emptive threads AND shared state, because reasoning about who is modifying what (and at what stage of completion) becomes difficult. Co-routines avoid the former, message passing the latter. Which is simpler overall depends on the problem.
You can do either in Lua pretty easily, too. Doing both doesn't make much sense, though - schlepping around large data structures in message-passing can be a performance problem, but unlike coroutines, it can very easily be switched to run over multiple machines.
> The problem, in my mind, is that many languages provide 'transparent' shared state between the threads - which can easily introduce very subtle bugs.
That is the reason languages like Limbo ( http://limbo.cat-v.org ) and Erlang were invented (and CSP, obviously).
As I said "Programmers working on concurrency, locking, etc are often solving self inflicted problems."
They're moot points IMHO. Trying to make languages that deal well with threads/concurrency is a pointless exercise. What's the advantage to doing this, over just not using threads?
May I ask why the frack you are using threads if you don't want shared state? You do know the difference between a thread and a process, right? Maybe you should learn to use your tools before bubble-wrapping them for being dangerous. Look at how Google Chrome is made: instead of using threads, they use processes for tabs.
Because if the language uses threads instead of processes the compiler can perform all sorts of neat optimizations on message passing.
As a trivial example (one I included in the post above yours), consider 'copy-on-write' optimizations. Thread A can pass a message containing Object O (which is 50MB) to Thread B. Naively, this would be pretty expensive - however, the compiler can set things up so that a copy of O isn't made until either B or A tries to write to O (in which case it can even try to copy just the part that was mutated). This provides a healthy balance between the raw performance of fork-style threads and the safety of multiple processes. This sort of optimization wouldn't be possible using processes alone since, obviously, Process B couldn't (or at least shouldn't ;-)) directly look into Process A's memory space.
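Spelled out by hand, the bookkeeping looks something like this Python sketch (CowBox and its methods are made up for illustration; a real runtime would do all of this transparently and make the share count thread-safe):

    import copy

    class CowBox:
        # Copy-on-write hand-off: all holders share one buffer, and the
        # first writer pays for a private copy.
        def __init__(self, data, shared=None):
            self._data = data
            self._shared = shared if shared is not None else [1]  # share count

        def send(self):
            # "Sending" just bumps the share count - O(1), nothing is copied
            self._shared[0] += 1
            return CowBox(self._data, self._shared)

        def read(self):
            return self._data        # cheap: everyone reads the same buffer

        def write(self, key, value):
            if self._shared[0] > 1:  # someone else can still see this buffer
                self._shared[0] -= 1
                self._data = copy.deepcopy(self._data)  # private copy, finally
                self._shared = [1]
            self._data[key] = value

    big = CowBox({"blob": "x" * 50_000_000})   # the 50MB Object O
    received = big.send()                      # A passes O to B: no copy yet
    received.write("blob", "y")                # B's first write triggers the copy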
I never said don't share state - I just said don't transparently share state. Make it explicit and/or make it safe. As I said, it's never going to be quite as fast as raw fork-style threading but, with a sufficiently smart compiler, it can come close - and be a lot easier.
Actually, if you fork() a process, the memory pages of the new process are shared with the old, using copy-on-write access rules. Obviously, that doesn't enable any kind of communication, but can be damn handy if you need a read-only chunk of memory that is initialised once but needs to be shared with all processes involved.
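You can see this from Python on Unix, with the caveat that CPython's reference counting writes to object headers, so even 'reads' dirty pages and the effect is much weaker than in C:

    import os

    # fork() marks the parent's pages copy-on-write: the child sees the same
    # memory "for free" until one side writes to a page
    table = [i * i for i in range(1_000_000)]   # built once, before the fork

    pid = os.fork()
    if pid == 0:
        # Child: this read is served from the parent's pages
        print("child sees:", table[123])
        os._exit(0)
    else:
        os.waitpid(pid, 0)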
Because in many situations not sharing in-process memory causes either horrible performance or horrible productivity. The scenario I hit pretty often is in-memory analysis of custom data structures, with multiple clients accessing a server in a read-mostly pattern.
I think your last resort suggestion is very much based on the notion that most of what we ever do is offloading all data access onto a SQL database. I like SQL databases, however unfashionable they may be at the moment, but they're not the solution to everything.
As memory gets cheaper it becomes more feasible to just keep everything in server memory and access it from multiple clients. You can't do that without dealing with concurrency issues.
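One common shape for that read-mostly case, sketched in Python (the names are mine; the trick is that readers never take a lock, and writers publish a fresh snapshot with a single atomic reference swap):

    import threading

    snapshot = {}                    # the published state; treated as immutable
    write_lock = threading.Lock()    # serialises the (rare) writers only

    def lookup(key):
        # Readers grab the current reference and use it: no lock, no contention
        return snapshot.get(key)

    def update(key, value):
        global snapshot
        with write_lock:
            fresh = dict(snapshot)   # copy the old snapshot...
            fresh[key] = value       # ...apply the change...
            snapshot = fresh         # ...publish; in CPython the reference
                                     # assignment itself is atomic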
The quality of the argument against multithreading is undermined quite a bit by choosing one of the most poorly written networking library functions around. If you want to do this right, use libevent or Boost.Asio asynchronous resolvers.
I do have to admit that I didn't bother to dig through the final example looking for the errors, so perhaps the author did have a better argument in there but didn't bother to pursue it more aggressively.