I don't believe their claims. Many benchmarks (including those done by ScyllaDB) are done badly. They'll take a database built to operate on larger-than-memory data (e.g. 10x) and run it on a dataset that fits entirely in memory. So whoever optimized for the in-memory case wins. But run on an appropriately sized dataset, or reduce system memory, and you see little difference.
This might seem like a good thing (ScyllaDB gives you extra performance when you have the memory for it), but it does mean that if your dataset grows, performance falls off a cliff. Something to keep in mind.
"it does mean that if your dataset grows, performance falls off a cliff."
Are you saying you know ScyllaDB does not handle larger datasets and Cassandra is better in this respect? Or are you saying that their benchmarks are not yet conclusive?
I am saying that when you go from fully in memory (due to having a small dataset) to having to move things to and from disk, disk, rather than memory, increasingly becomes your bottleneck. And disk is much slower than memory.
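To make that concrete, here's a toy model of the cliff: expected per-read latency as the working set outgrows RAM, assuming uniform random access. The latency numbers (100 ns for RAM, 100 µs for an NVMe read) are illustrative assumptions, not measurements of any particular database.

```python
# Toy model: effective read latency once the dataset no longer fits in RAM.
# Assumed latencies (illustrative, not measured):
RAM_NS = 100          # ~in-memory read
DISK_NS = 100_000     # ~NVMe random read

def effective_latency_ns(dataset_gb, ram_gb):
    """Expected per-read latency under uniform random access:
    reads hit RAM with probability (ram/dataset), else go to disk."""
    hit_rate = min(1.0, ram_gb / dataset_gb)
    return hit_rate * RAM_NS + (1.0 - hit_rate) * DISK_NS

# 16 GB of RAM, growing dataset:
for dataset in (8, 16, 32, 64, 128):
    lat = effective_latency_ns(dataset, ram_gb=16)
    print(f"{dataset:>4} GB dataset -> {lat / 1000:.1f} us/read")
```

Once the dataset doubles past RAM, the expected latency jumps by roughly three orders of magnitude, which is the cliff being described: the in-memory optimizations stop being where the time goes.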
I thought a main point of Cassandra was to be distributed so the working dataset could stay in memory across the cluster. And the smaller memory footprint you typically get when you're not in the JVM means more of your working dataset can be cached in memory. So I would expect superlinear speedups compared to Java for exactly the reason you describe (depending on the request distribution).
But yeah, I'm always up for poring over more benchmarks. :)