
Profilers also lie. Learning the ways in which they are wrong is its own skill set, and the current generation of profilers is missing bits of critical information. Which means that it's not only users who misunderstand the value of profiling data; so do the profiler writers.

If the writers can't get it right, good fucking luck to someone with 3 years of programming experience.




I've done a lot of profiling over the years, and you're right.

In my experience, though, it is correct _enough_ in the vast majority of cases to gain significant speedups. You're not going to be eking cache misses out of your hot paths without understanding what you're looking at, but you will find things like "We spend more time converting to strings and back than we do actually compressing" or "we're spending 100ms of every request loading a config file from S3 that could be cached". If we eliminated the low-hanging fruit in some of our most used tools, the difference would be significant.
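
To make the second example concrete, here is roughly what that fix tends to look like. A minimal sketch, assuming a Node.js service; fetchConfigFromS3 is a hypothetical stand-in for the real S3 call:

    // Cache the config at module scope instead of fetching it per request.
    let cachedConfig = null;
    let cachedAt = 0;
    const CONFIG_TTL_MS = 60_000; // refresh at most once a minute

    async function getConfig() {
      const now = Date.now();
      if (cachedConfig === null || now - cachedAt > CONFIG_TTL_MS) {
        cachedConfig = await fetchConfigFromS3(); // the ~100ms call the profiler flagged
        cachedAt = now;
      }
      return cachedConfig;
    }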


This may be controversial, but I now believe that flame charts are hurting more than helping. These charts are meant to display problems with sequential code, but they end up obfuscating problems with asynchronous code. The evolution toward async-await semantics in JavaScript and other languages is 'breaking' current-generation profilers.
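
A sketch of what I mean, with hypothetical fetch calls standing in for the real I/O:

    async function handleRequest(req) {
      const user = await fetchUser(req.id);     // ~300ms of waiting
      const perms = await fetchPerms(user.id);  // ~200ms of waiting
      return render(user, perms);               // ~5ms of actual CPU work
    }

    // A CPU flame chart shows handleRequest as a few tiny slices around each
    // await. The ~500ms of wall-clock latency lands in event-loop and promise
    // machinery frames, so the real problem (two sequential round trips that
    // could overlap) never shows up as one tall frame.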

And this is controversial but also true: 'low-hanging fruit' is death by a thousand cuts. It's a hill-climbing algorithm, and as we've all known for generations, greedy algorithms get stuck in local maxima. And if you know anything about farming, only amateurs pick the low-hanging fruit. Real growers harvest an entire tree at a time; otherwise you waste a ton of fruit.

Going module by module instead of chopping off the tall tent poles lets you achieve much better results. One of the complaints of the Premature Optimization crowd is the potential for regressions when changing code that already 'works'. Refactoring reduces that possibility quite a bit, and Module- or Concern-Oriented optimization mops up a lot of the rest. You get better QA fidelity making 10 changes in one area of functionality than you do making 4 spread across the code base.

Why I think more people don't use it: 1) I seem to be the sole proponent, and 2) it cuts both ways regarding Instant Gratification. You will know about large problems that you aren't fixing until later, and later answers are often better answers. On the other edge, it also gets to a bunch of short tent poles that would never make it out of the backlog later in the project. Nobody is going to give you permission to go around making 0.5% performance improvements, and it's a lot of wear and tear to do such work off the books. This is the death-by-1000-cuts failure mode. I think the last time I saw someone bragging about a < 1% improvement was the compressed pointers discussion on the V8 blog, and I can't even remember the previous example.

You can deliver a 16% improvement in one concern instead of the 13% you get from the biggest wins, at very little additional cost or risk. You can keep doing that quarter after quarter, for years, and at the end you've gotten 25% farther than you would have by going the 'easy' route.

25% doesn't matter until it does. Inflection points tear up your project roadmap and disrupt plans. They force (risky) architectural changes farther up the backlog, and you face them without the benefit of all the little changes you'd have made along the way, because the best performance improvements also improve code quality, making other changes easier, not harder.


> Profilers also lie.

I've had the Chrome profiler tell me I was spending significant time in a function that was never called at all. Not that the idea of that function being called was unusual; it was certainly a possibility, and one that would have indicated a bug in our code. But the code was correct and the function never ran. A breakpoint or log statement in it never executed. Yet it was the top culprit in the profile. To this day I still don't quite know what happened there.

Such things also undermine the trust one has in certain tools.


Sometime after I became aware of “Everything I Need to Know I Learned in Kindergarten”, I wrote an article with a cheeky title about counting things.

Basically, one of the questions you should always ask is whether the results match your expectations. You may find that the slow function is being called from methods you didn’t expect, and refactoring the code to pass the answer along to your business logic may save a lot more computation than making low-level changes inside the function. Plus you just learn more about the architecture of the system by going through this mental exercise. If ten requests are made, I expect this routine to be called 51 times, so why is it being called 213?
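
A throwaway way to do that counting, in JavaScript for illustration; pricing.lookup is a hypothetical suspect routine:

    // Wrap the routine, replay a known workload, compare against expectations.
    let lookupCalls = 0;
    const originalLookup = pricing.lookup;
    pricing.lookup = function (...args) {
      lookupCalls++;
      return originalLookup.apply(this, args);
    };

    // ...replay ten requests, then:
    console.log(`expected ~51 lookups, saw ${lookupCalls}`);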

Also, when I have a smoking gun, I like to pull the whole assembly out for the benchmarking phase. If you can’t reproduce the slowdown outside of the live environment, it may mean the problem is shared with some other place in the code: cache poisoning, high GC overhead, lock contention, etc.
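
The benchmarking phase itself doesn't need much machinery. A rough sketch, again in JavaScript; parseOrder and sampleOrders are hypothetical, with the inputs captured from the live system:

    const iterations = 10_000;
    const start = performance.now();
    for (let i = 0; i < iterations; i++) {
      parseOrder(sampleOrders[i % sampleOrders.length]);
    }
    const perCallMs = (performance.now() - start) / iterations;
    console.log(`${perCallMs.toFixed(4)} ms per call in isolation`);
    // If this comes out fast but the live system is slow, the cost is coming
    // from somewhere else: GC pressure, lock contention, a polluted cache.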

Each time you successfully pull a piece out, you get the opportunity to fix other problems in the vicinity as well, amortizing the cost of that effort. It’s also a canary for when your coworkers who are overfond of tight coupling get their mitts on the code.


> Profilers also lie

Yep, it becomes obvious when the Chrome profiler tells you that the CPU has spent a sizable amount of time on a JavaScript comment, which it sometimes actually does.



