I don't believe their claims. Many benchmarks (including those done by ScyllaDB) are done badly. They'll take a database built to operate on larger-than-memory data (e.g. 10x) and run it on a dataset that fits entirely in memory. So whoever optimized for the in-memory case wins. But run on an appropriately sized dataset, or reduce system memory, and you see little difference.
This might seem like a good thing (ScyllaDB gives you extra performance when you have the memory for it), but it does mean that if your dataset grows, performance falls off a cliff. Something to keep in mind.
"it does mean that if your dataset grows, performance falls off a cliff."
Are you saying you know ScyllaDB does not handle larger datasets and Cassandra is better in this respect? Or are you saying that their benchmarks are not yet conclusive?
I am saying that when you go from fully in memory (due to having a small dataset) to having to move things to and from disk, disk, rather than memory, increasingly becomes your bottleneck. And disk is much slower than memory.
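To make that concrete, here's a toy model of the cliff: expected per-read latency as the working set outgrows RAM, assuming uniform random access. The latency numbers (100 ns for RAM, 100 µs for an NVMe read) are illustrative assumptions, not measurements of any particular database.

```python
# Toy model: effective read latency once the dataset no longer fits in RAM.
# Assumed latencies (illustrative, not measured):
RAM_NS = 100          # ~in-memory read
DISK_NS = 100_000     # ~NVMe random read

def effective_latency_ns(dataset_gb, ram_gb):
    """Expected per-read latency under uniform random access:
    reads hit RAM with probability (ram/dataset), else go to disk."""
    hit_rate = min(1.0, ram_gb / dataset_gb)
    return hit_rate * RAM_NS + (1.0 - hit_rate) * DISK_NS

# 16 GB of RAM, growing dataset:
for dataset in (8, 16, 32, 64, 128):
    lat = effective_latency_ns(dataset, ram_gb=16)
    print(f"{dataset:>4} GB dataset -> {lat / 1000:.1f} us/read")
```

Once the dataset doubles past RAM, the expected latency jumps by roughly three orders of magnitude, which is the cliff being described: the in-memory optimizations stop being where the time goes.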
I thought a main point of Cassandra was to be distributed so the working dataset could stay in memory across the cluster. And the smaller memory footprint you typically get when you're not in the JVM means more of your working dataset can be cached in memory. So I would expect superlinear speedups compared to Java for exactly the reason you describe (depending on the request distribution).
But yeah, I'm always up for poring over more benchmarks. :)