Nifty! A step up over just using time.
Unfortunately, timing things in general isn't going to be a very effective benchmark.
Without understanding what a program is doing, you don't understand what is impacting your results, and you have no real knowledge of how things are going to differ when you use them in the "real world". Is one process faster when single-threaded or at a low core count, while another is massively parallel and loses out until it's scaled higher? Are your commands testing the thing you think they're testing? What is your limiting factor? If you don't know why the results are what they are, and why they aren't better, you don't have a good benchmark.
Yes, having a "cold" or a "warm" disk cache makes a massive difference for I/O-heavy programs. For one of my other programs, I differentiate between "cold-cache" and "warm-cache" benchmarks: https://github.com/sharkdp/fd-benchmarks
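If you want to reproduce the cold-cache case yourself, hyperfine's --prepare option can flush the page cache before each timing run. A minimal sketch, assuming Linux (the fd invocation is just a placeholder for whatever you're measuring):

  # warm-cache benchmark: just run the command repeatedly
  hyperfine 'fd -e jpg'

  # cold-cache benchmark: flush the page cache before every timing run
  # (Linux-only; the tee into drop_caches needs root)
  hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' 'fd -e jpg'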
The page cache exists to speed up access and it's transparent to processes.
If another user or process tries to access your SSH files directly, it'll go through the traditional file-permission checks to determine whether it has access. If the disk block is in the page cache AND access to that inode is allowed, then the kernel will retrieve the page from the cache and give it to the process.
To read the whole page cache, you'd need code sitting in kernel space. If something manages to load itself in the kernel space (e.g. kernel module), you have bigger problems to worry about.
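An easy way to see this in action (illustrative only; the path is just an example of a root-owned file):

  # as root: read the file once so its blocks land in the page cache
  sudo cat /root/.ssh/id_rsa > /dev/null

  # as an unprivileged user: the inode permission check still applies,
  # even though the data is already sitting in the shared page cache
  cat /root/.ssh/id_rsa    # -> "Permission denied"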
When given multiple commands, can it interleave executions instead of benchmarking them one after the other?
This would be useful when comparing two similar commands, as interleaving them makes it less likely that e.g. a load spike will unfavorably affect only one of them, or that thermal throttling will negatively affect only the last command.
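For comparison, the kind of interleaving I mean can be done by hand with a shell loop (rough sketch; cmd_a and cmd_b are placeholders, and GNU time is assumed for the -f flag):

  # run A and B back to back in each iteration, so system-wide slowdowns
  # (load spikes, thermal throttling) hit both commands roughly equally
  for i in $(seq 1 10); do
    /usr/bin/time -f 'A: %e s' cmd_a
    /usr/bin/time -f 'B: %e s' cmd_b
  done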
Whenever you run 'time <command>', you could consider running 'hyperfine <command>' to get an answer that has been averaged over multiple runs.
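For example (du and the path are only stand-ins for whatever you are actually timing):

  # one noisy measurement
  time du -sh ~/Downloads

  # many runs, reported as mean +/- standard deviation
  hyperfine 'du -sh ~/Downloads'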
I personally use command-line benchmarking to compare different tools. You might want to compare grep, ack, ag and ripgrep. I currently use it to profile my find-alternative fd and to compare it with find itself (https://github.com/sharkdp/fd-benchmarks).
You could also use it to find an optimal parameter setting for a command-line tool (make -j2 vs. make -j8).
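Concrete sketches of both use cases (the patterns, paths and build targets are made up):

  # comparing similar tools on the same task
  hyperfine "grep -r 'PATTERN' ." "ag 'PATTERN'" "rg 'PATTERN'"
  hyperfine 'find . -iname "*.jpg"' 'fd -e jpg'

  # finding a good parameter value; --prepare resets state before each timing run
  hyperfine --prepare 'make clean' 'make -j2' 'make -j8'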
The regex equivalent for "anything" is usually ".*", where the dot stands for "any character" and the asterisk for "any number of times (including zero)". fd does not necessarily pattern-match at the beginning of the file name, so there is no need for a ".*" at the beginning or at the end of the pattern. Your example would be:
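As a general illustration with made-up patterns (not necessarily your exact case):

  # shell glob      equivalent regex     fd pattern (matched anywhere in the name)
  #   *report*        .*report.*           report
  fd report        # any file name containing "report"
  fd '\.jpg$'      # file names ending in ".jpg"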
http://www.brendangregg.com/activebenchmarking.html / http://www.brendangregg.com/ActiveBenchmarking/bonnie++.html