A "web server" in Facebook's cluster is just Apache+PHP. It doesn't do any disk work: nothing is read from local disk, since that gets farmed out to memcached or MySQL. If you've got thousands of boxes in this arrangement, clearly reducing the CPU usage through whatever means will get you more efficiency.
Even on a diskless machine, the limit still is not necessarily "CPU usage". The machine might be limited by I/O bandwidth -- both to the clients and to those memcached and MySQL servers. It might be limited by the amount of memory that must be dedicated to each connection or process, even while that process is sleeping on I/O (which is most of the time).
Most likely, the overhead of running a server is so large that the power consumption problem is all about saving servers, rather than saving CPU cycles. In which case these other variables are just as likely to be a big concern as the speed of your HTML template's WHILE loops.
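To put rough numbers on the "saving servers" point, here is a back-of-the-envelope sketch. Every figure in it (the load, the per-box capacities, the resource names) is made up for illustration and is not Facebook's; the only idea it shows is that the server count is set by whichever resource runs out first.

```python
import math

def servers_needed(load, per_server_capacity):
    """Return (server count, limiting resource) for a given total load.

    Each resource on its own would require ceil(load / capacity) boxes;
    the fleet has to be sized for the worst of them.
    """
    counts = {r: math.ceil(load[r] / per_server_capacity[r]) for r in load}
    bottleneck = max(counts, key=counts.get)
    return counts[bottleneck], bottleneck

# Hypothetical cluster-wide demand and per-box capacity (invented numbers).
load = {"cpu_cores": 8000, "memory_gb": 48000, "network_gbps": 900}
capacity = {"cpu_cores": 8, "memory_gb": 32, "network_gbps": 1}

print(servers_needed(load, capacity))

# Halve the CPU demand (e.g. faster code) and size the fleet again.
halved = dict(load, cpu_cores=load["cpu_cores"] / 2)
print(servers_needed(halved, capacity))
```

With these invented numbers the fleet is memory-bound, so halving CPU demand leaves the server count unchanged. Pick different numbers and CPU could just as easily be the limit, which is the whole argument.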
I totally agree, which is why a 10x speedup is ludicrous. The C++ translation would likely just speed the code up to the point where it was network and/or memory bound. I think a 2-3x improvement is a plausible expectation given all of the things I know about the Facebook architecture.
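The 10x versus 2-3x gap is basically Amdahl's law. A minimal sketch, assuming some fraction of each request's time is CPU work in PHP and the rest is waiting on network or memory; the fractions below are guesses for illustration, not measurements of Facebook's stack.

```python
def overall_speedup(cpu_fraction, cpu_speedup):
    """Amdahl's law: only the CPU-bound fraction of the time gets faster."""
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)

# Assume the compiled code runs the CPU-bound part 10x faster (hypothetical).
for cpu_fraction in (0.5, 0.7, 0.9):
    speedup = overall_speedup(cpu_fraction, 10)
    print(f"{cpu_fraction:.0%} CPU-bound -> {speedup:.2f}x overall")
```

If half the request time is CPU, a 10x faster interpreter gives about 1.8x overall; at 70% it is about 2.7x. You only get near 10x if almost nothing else is in the way.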
The web servers only do computations on data they get from other services (memcache, etc.). They never fetch the data locally, so there is no local bottleneck except the local bandwidth and the efficiency of the code, and thus the CPU.
Then the only logical reasoning behind this is cheap machines for the web servers and very powerful ones for the "data" servers. Otherwise, network bandwidth is only a fraction of the CPU-to-storage bandwidth.