
Yes - save to Parquet. From the OP:

"Why not just persist the data to disk in Arrow format, and thus have a single, cross-language data format that is the same on-disk and in-memory? One of the biggest reasons is that Parquet generally produces smaller data files, which is more desirable if you are IO-bound. This will especially be the case if you are loading data from cloud storage like such as AWS S3.

Julien LeDem explains this further in a blog post discussing the two formats:

>> The trade-offs for columnar data are different for in-memory. For data on disk, usually IO dominates latency, which can be addressed with aggressive compression, at the cost of CPU. In memory, access is much faster and we want to optimise for CPU throughput by paying attention to cache locality, pipelining, and SIMD instructions. https://www.kdnuggets.com/2017/02/apache-arrow-parquet-colum..."
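If you want to see which side of that trade-off your data falls on, it's easy to measure: write the same table in both formats and compare file sizes. A rough sketch with pyarrow (the toy dataset and file names are mine; low-cardinality integers like these play to Parquet's dictionary/RLE encodings, so your numbers will differ):

    import os
    import numpy as np
    import pyarrow as pa
    import pyarrow.feather as feather
    import pyarrow.parquet as pq

    # Toy column of low-cardinality integers -- the kind of data
    # Parquet's encodings compress well.
    table = pa.table({"x": np.random.randint(0, 100, 5_000_000)})

    # Feather written uncompressed vs. Parquet (snappy by default).
    feather.write_feather(table, "t.feather", compression="uncompressed")
    pq.write_table(table, "t.parquet")

    for path in ("t.feather", "t.parquet"):
        print(path, os.path.getsize(path), "bytes")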

I opted to store Feather for one particular reason: you can open it with mmap and randomly index the data without having to load it all into memory. Also, the data I have isn't very compressible to begin with, so Parquet's CPU cost vs. data savings trade-off doesn't pay off for me. This only makes sense in that narrow use case.
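Roughly what that looks like with pyarrow (file name and sizes are made up; the file has to be written uncompressed, since compressed Feather can't be read zero-copy):

    import numpy as np
    import pyarrow as pa
    import pyarrow.feather as feather
    import pyarrow.ipc

    # Write an uncompressed Feather (Arrow IPC) file; compression
    # would force decompression on read and defeat zero-copy mmap.
    table = pa.table({"x": np.random.rand(10_000_000)})
    feather.write_feather(table, "data.feather", compression="uncompressed")

    # Memory-map the file: open_file only reads metadata up front,
    # not the column data.
    source = pa.memory_map("data.feather", "r")
    mapped = pa.ipc.open_file(source).read_all()

    # Random access only touches the pages backing the rows you ask for.
    chunk = mapped.slice(5_000_000, 100)
    print(chunk.column("x")[0])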


I'm doing the same. It's also quite nice for de-duplication: a lot of operations on our data happen on a column basis, and we often need to assemble tables that are basically the same except for one or two computed columns. I store every column in a separate file and assemble tables on the fly, also memory-mapped. Quite happy with being able to do that; I'm not sure how easy that would be with Parquet.
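The assembly step can be as simple as memory-mapping each single-column file and zipping them back into a table. A sketch of that pattern (the per-column file layout and names are hypothetical, assuming each file is an uncompressed single-column Feather/Arrow IPC file of equal length):

    import pyarrow as pa
    import pyarrow.ipc

    def load_column(path):
        # Memory-map one single-column IPC file; no data is copied.
        source = pa.memory_map(path, "r")
        return pa.ipc.open_file(source).read_all().column(0)

    # Hypothetical layout: one file per column, shared between tables.
    names = ["id", "price", "adjusted_price"]
    columns = {name: load_column(f"columns/{name}.feather") for name in names}

    # Assemble the logical table on the fly; columns shared between
    # tables exist only once on disk.
    table = pa.table(columns)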


As someone new to Arrow/columnar DBs, do you mind sharing what kind of data makes sense to use Arrow for, but isn't very compressible?
