And fewer and fewer people buy new PCs with their products. Some phones have Intel ...

AnonNo15 · on March 23, 2016

You forgot about server market which is Intel-dominated and growing well.

avs733 · on March 29, 2016

The way I have begun to term it to people not really familiar with technology is this:

Every time you buy a cellphone with an ARM chip, Amazon buys an Intel chip to process all the data, services, apps, and websites you access from your phone and all the data those services, apps, and websites collect from you.

beeboop · on March 23, 2016

I found this hard to believe but from a quick Google search it appears you're right - Intel desktop CPU sales have been on a downward trend since 2011.

creshal · on March 23, 2016

That's partially related to Intel running against a wall with CPU speeds around that time, and partially to the mobile boom.

A 2011 i7-2600K desktop CPU is still competitive with current generation i7 CPUs for general-purpose computing – most improvements were in specialized instruction sets like AVX or TSX (which infamously was broken on the first generation shipping it). Lower-end i3 and i5 are even worse off, because they don't get most acceleration instruction sets in the first place.

For consumers, it makes absolutely no sense to upgrade their desktop PCs unless they break.

pcwalton · on March 23, 2016

Well, don't forget the GPU. Intel Iris is much better than Intel HD.

The way I see it, this is our fault as software developers. A 2016 PC would be much faster than a 2011 PC if we as software developers made good use of SIMD and GPUs. But we don't.

r1ch · on March 23, 2016

I think it's more developers getting lazy. Why bother with SIMD and GPUs when you can write in a high-level language like JavaScript, design with HTML / CSS, deploy your app with embedded libchromium and have a faster time to market? SSDs and fast CPUs have made efficient software somewhat of a rarity these days.

kristianp · on March 24, 2016

It's not about being lazy, it's more about: how can I get cross-platform GUI support that looks good without having to use C++? The answer seems to be HTML these days, unfortunately.

creshal · on March 25, 2016

And even Qt is shifting to using Chromium web views to supplant their native widgets.

krylon · on March 23, 2016

On most workloads typical desktop users run (there are many exceptions, of course, but in terms of numbers of people, those are in the minority), the computational speed of the CPU is not a limiting factor any more. I/O, amount of RAM and probably memory bandwidth are far more important; on a typical mid-range desktop machine running Windows, Office and some line-of-business application, I/O completely dominates, at least from what I have observed working as a sysadmin / helpdesk monkey.

TheOtherHobbes · on March 24, 2016

MS Office: yes. Content creation: no.

3D rendering, video editing, and music production all need as many cycles as you can afford, and then some.

VR and all those AI/ML technologies waiting around the corner are going to be even more greedy.

krylon · on March 24, 2016

That is true.

At work, our CAD people use Autodesk Inventor heavily, and that thing will happily gobble up all the CPU cycles one can throw at it. (It is the one example I have first-hand experience with.)

What I meant was that for most users of desktop PCs in an office environment, a faster CPU is not going to make much of a difference in overall system performance. (I might be a little sore because at work, users will sometimes complain their computer is too slow and then demand a new one with an i7, and then I have to explain to them why that is not going to help, while a RAM upgrade and an SSD are going to make a big difference.)

But you are right, there are plenty of examples where there is no such thing as "fast enough". ;-)

vanderZwan · on March 23, 2016

I blame the latter on language expressiveness more than anything else. Here are two pieces of C++ code, one "clean", one fast, taken from [0]:

    void blur(const Image &in, Image &blurred) {
        Image tmp(in.width(), in.height());
        for (int y = 0; y < in.height(); y++){
            for (int x = 0; x < in.width(); x++){
                tmp(x, y) = (in(x-1, y) + in(x, y) + in(x+1, y))/3;
            }
        }
        for (int y = 0; y < in.height(); y++){
            for (int x = 0; x < in.width(); x++){
                blurred(x, y) = (tmp(x, y-1) + tmp(x, y) + tmp(x, y+1))/3;
            }
        }
    }

The optimised-for-speed version (order of magnitude difference):

    void fast_blur(const Image &in, Image &blurred) {
        __m128i one_third = _mm_set1_epi16(21846);
        #pragma omp parallel for
        for (int yTile = 0; yTile < in.height(); yTile += 32) {
            __m128i a, b, c, sum, avg;
            __m128i tmp[(256/8)*(32+2)];
            for (int xTile = 0; xTile < in.width(); xTile += 256) {
                __m128i *tmpPtr = tmp;
                for (int y = -1; y < 32+1; y++) {
                    const uint16_t *inPtr = &(in(xTile, yTile+y));
                    for (int x = 0; x < 256; x += 8) {
                        a = _mm_loadu_si128((__m128i*)(inPtr-1));
                        b = _mm_loadu_si128((__m128i*)(inPtr+1));
                        c = _mm_load_si128((__m128i*)(inPtr));
                        sum = _mm_add_epi16(_mm_add_epi16(a, b), c);
                        avg = _mm_mulhi_epi16(sum, one_third);
                        _mm_store_si128(tmpPtr++, avg);
                        inPtr += 8;
                    }
                }
                tmpPtr = tmp;
                for (int y = 0; y < 32; y++) {
                    __m128i *outPtr = (__m128i *)(&(blurred(xTile, yTile+y)));
                    for (int x = 0; x < 256; x += 8) {
                        a = _mm_load_si128(tmpPtr+(2*256)/8);
                        b = _mm_load_si128(tmpPtr+256/8);
                        c = _mm_load_si128(tmpPtr++);
                        sum = _mm_add_epi16(_mm_add_epi16(a, b), c);
                        avg = _mm_mulhi_epi16(sum, one_third);
                        _mm_store_si128(outPtr++, avg);
                    }
                }
            }
        }
    }

I don't know about you, but that looks like an error prone maintenance disaster waiting to happen.
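Case in point: the magic constant. `21846` is `ceil(65536 / 3)`, and `_mm_mulhi_epi16` keeps the high 16 bits of each signed 16x16-bit product, so the pair works as a fixed-point divide by 3. Below is a rough scalar sketch of what a single 16-bit lane computes. The names `mulhi_i16`, `div3_fixed` and `kOneThird` are mine, not from the paper, and the sketch assumes the three-pixel sum fits in the non-negative signed 16-bit range.

```cpp
#include <cstdint>

// Scalar model of one 16-bit lane of _mm_mulhi_epi16(a, b):
// widen to 32 bits, multiply, keep the high 16 bits of the product.
// (Only exercised here with non-negative inputs.)
constexpr std::int16_t mulhi_i16(std::int16_t a, std::int16_t b) {
    return static_cast<std::int16_t>((static_cast<std::int32_t>(a) * b) >> 16);
}

// 21846 == ceil(65536 / 3), i.e. roughly 1/3 in 0.16 fixed point.
constexpr std::int16_t kOneThird = 21846;

// Illustrative helper: divide a non-negative 16-bit sum by 3 without a
// division instruction, the way the SIMD loop does for eight lanes at once.
constexpr std::int16_t div3_fixed(std::int16_t sum) {
    return mulhi_i16(sum, kOneThird);
}
```

For sums between 0 and 32767 this agrees with plain integer division, e.g. `div3_fixed(300)` gives `100`. None of that is visible from the call site, which is rather the point.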

And, just for comparison, Halide code that produces results as fast as the second code:

    Func halide_blur(Func in) {
        Func tmp, blurred;
        Var x, y, xi, yi;

        // The algorithm
        tmp(x, y) = (in(x-1, y) + in(x, y) + in(x+1, y))/3;
        blurred(x, y) = (tmp(x, y-1) + tmp(x, y) + tmp(x, y+1))/3;

        // The schedule
        blurred.tile(x, y, xi, yi, 256, 32).vectorize(xi, 8).parallel(y);
        tmp.chunk(x).vectorize(x, 8);

        return blurred;
    }

(this is kind of a weird coincidence; last time I replied to you I mentioned Halide[1] as well)

[0] http://people.csail.mit.edu/jrk/halide12/halide12.pdf

[1] http://halide-lang.org/

pcwalton · on March 23, 2016

Definitely, we've failed in programming language design as well. The biggest problem is that we keep sticking with C++ :)

AKrumbach · on March 23, 2016

    (defun language-choice (developer)
      (if (> (developer-hipness developer) (developer-experience developer))
          (lang-du-jour)
          (if (developer-scared-of developer 'parenthesis)
              (c-family-language)
              (lisp-family-language))))

vanderZwan · on March 23, 2016

You work in Rust, right? How would you express the above in that language?

pcwalton · on March 23, 2016

SIMD is a work in progress, but we have the foundations laid for a much more ergonomic approach: http://huonw.github.io/blog/2015/08/simd-in-rust/

Zardoz84 · on March 23, 2016

Switch to DLang

creshal · on March 23, 2016

When graphics are a bottleneck it's usually easier and cheaper to pop in an entry-level graphics card than to throw out or replace the whole computer (unless it's a laptop). 3-4 year old low-end graphics cards still beat Iris Pro.

seanp2k2 · on March 23, 2016

Yeah, even though Iris is "good enough for light gaming", integrated still really lags behind dedicated GPUs in how smooth even a desktop experience is.

beeboop · on March 23, 2016

I am actually part of that camp. I have an i5-3570K and recently did an entire system upgrade including changing from full tower to mini ITX. The only real problem was that the selection of mini ITX motherboards for my older socket type is pretty poor. But I'm now Oculus ready with only a minor CPU overclock despite my CPU being pretty old.

dmoy · on March 23, 2016

Yea I still keep waiting to upgrade my 2600k, but at this point I think all the other parts will physically break before I need to. It's like 10-20% slower than a chip 5 years newer, big woop.

seanp2k2 · on March 23, 2016

Same here. It's hard to want to upgrade a 2500k stable at 4.6 GHz on air with no overvolting. Got it right when it came out. OTOH, starting with a Radeon 6870 then later adding a second for crossfire and Bitcoin mining when that was profitable still barely gives me the power to play the newest Rainbow 6 on low everything @1920x1200 while keeping >30FPS min on Win10. I'm personally waiting for the new GPUs to come later this year before upgrading, then seeing if Oculus or Vive has better game support at that time.

kevinnk · on March 23, 2016

The 2600k cost about $350 new in 2011. For the same amount of money you could get a 5820K which is about 1.5x faster than the 2600k (or to put it in your terms, the 2600k is 52% slower). Still probably not worth it, but I looked up the numbers and might as well post them.

oofabz · on March 23, 2016

If the 5820k is 1.5x faster, doesn't that mean the 2600k is 33% slower, not 52%?

kevinnk · on March 23, 2016

Oh, yes good catch. The 2600k is (12991 - 8520) / 12991 = 34% slower. That's what I get for not rereading before posting...
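For anyone who wants to redo the arithmetic, here is a small sketch using the two scores quoted above (the thread does not say which benchmark they come from). The function and constant names are made up for illustration. It shows where both the 52% and the 34% come from: the first is relative to the slower chip, the second relative to the faster one.

```cpp
// Scores as quoted in the comment above; the benchmark is not named there.
constexpr double kScore5820K = 12991.0;
constexpr double kScore2600K = 8520.0;

// How many times faster `fast` is than `slow`.
constexpr double speedup(double fast, double slow) {
    return fast / slow;
}

// Gain of `fast` measured against `slow` as the baseline.
constexpr double fraction_faster(double fast, double slow) {
    return (fast - slow) / slow;
}

// Deficit of `slow` measured against `fast` as the baseline.
// Equivalent to 1 - 1 / speedup(fast, slow).
constexpr double fraction_slower(double fast, double slow) {
    return (fast - slow) / fast;
}
```

With these numbers `speedup` is about 1.52, `fraction_faster` about 0.52 and `fraction_slower` about 0.34. For an exact 1.5x speedup the deficit is 1 - 1/1.5, about 33%.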

creshal · on March 23, 2016

Overclock by 20% aaaand done.