
It doesn't, really.

Applications can use much more domain knowledge to do the dedup, possibly with lower resource usage or better speed (memory, e.g. because an index to dedup against already exists; or CPU, e.g. because it can use existing constraints on the data and doesn't need to hash all of it). FS-level deduplication is... well. For classic FSes it does exist [1], but it's a pretty ugly kludge, and is pretty much inherently an offline process that can't happen while the data is first being written. For CoW FSes it's less of a problem, and e.g. ZFS and btrfs can do it, but it's often not exactly a great fit. (Although one could also see their snapshotting features as a form of deduplication.)

[1] Through extent sharing (reflinks). XFS does this, iirc it was experimental there for a while; as far as I know ext4 does not. duperemove uses this.
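
To make the "doesn't need to hash all data" point concrete, here's a rough sketch of application-level dedup. It assumes the application already knows each blob's size (standing in for whatever constraint or index it has anyway) and only hashes blobs whose size collides with another one. The function name, the dict-of-bytes input, and the choice of SHA-256 as the equality check are all made up for illustration.

```python
import hashlib
from collections import defaultdict


def find_duplicates(blobs):
    """Group names of identical blobs, hashing only where sizes collide.

    blobs: dict mapping a name to its bytes (hypothetical input shape).
    Returns (groups, hashed): the sorted duplicate groups and how many
    blobs actually had to be hashed.
    """
    # Cheap pre-filter: bucket by size, which the application is
    # assumed to know already without reading the data.
    by_size = defaultdict(list)
    for name, data in blobs.items():
        by_size[len(data)].append(name)

    groups = []
    hashed = 0
    for names in by_size.values():
        if len(names) < 2:
            continue  # unique size, cannot be a duplicate, never hashed
        by_digest = defaultdict(list)
        for name in names:
            digest = hashlib.sha256(blobs[name]).hexdigest()
            by_digest[digest].append(name)
            hashed += 1
        # Equal SHA-256 digests are treated as equal content here.
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return sorted(groups), hashed


if __name__ == "__main__":
    sample = {
        "a": b"hello",
        "b": b"hello",
        "c": b"world",
        "d": b"a much longer payload",
    }
    print(find_duplicates(sample))
```

In the sample, "d" is never hashed because nothing else has its size. A filesystem doing block-level dedup doesn't have that kind of shortcut and generally has to checksum everything it wants to compare.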



