I love my LTO-5 tape drive. 1.5TB on a tape, encryption, they have a write protect notch, and I add parity files to each tape to combat bitrot. The price per TB is pretty low. I keep a backup of important stuff off-site. As datacenters upgrade, I'll gladly take their hand-me-downs, when the price is right.
Why? Hard drives are as cheap as tape now: $10/TB. Tapes are sequential access media, wear out, and malfunction, whereas hard drives are random access media. I know all about StorageTek silos and 4U tape robots but still don't bother with them.
Tape is $5 per TB at retail and you can safely store data on it for at least 10 years, after which you may need to make a new copy not due to the risk of aging, but due to the risk of no longer finding new tape drives compatible with the tape format.
Your price of $10/TB might apply when HDDs are purchased in bulk, because at retail I see prices between $15/TB and $20/TB.
No HDD may be trusted to store data for more than 5 years and this duration is valid only for the more expensive models.
Assuming your price of $10/TB, one must buy at least twice as much HDD capacity as tape capacity over the same period, because of the shorter HDD lifetime, so tape is at least 4 times cheaper. At the prices that I see at retail the difference is even greater.
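The arithmetic behind the factor of 4 can be sketched as follows, using only the figures quoted in this thread ($5/TB tape trusted for 10 years, $10/TB HDDs trusted for 5 years). The function name and the 10-year horizon are illustrative assumptions, and the cost of the tape drive itself is deliberately left out.

```python
import math


def media_cost_per_tb(price_per_tb, service_years, retention_years):
    """Media cost per TB kept for retention_years, re-buying media as it ages out."""
    purchases = math.ceil(retention_years / service_years)
    return price_per_tb * purchases


# Figures quoted in the thread; the 10-year horizon is an assumption.
tape = media_cost_per_tb(5.0, service_years=10, retention_years=10)
hdd = media_cost_per_tb(10.0, service_years=5, retention_years=10)

print(f"tape: ${tape:.0f}/TB, HDD: ${hdd:.0f}/TB, ratio: {hdd / tape:.0f}x")
```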
The sequential transfer speed of tape is greater than that of HDDs, so archiving or retrieving many GB of data takes less time with tapes.
HDDs wear out and malfunction more frequently than tapes.
The only real disadvantage of tapes is the high cost of the tape drives, which makes tapes preferable only when more than 100 or 200 TB of data have to be stored.
> The only real disadvantage of tapes is the high cost of the tape drives
The real disadvantage of tapes is that the drive may die at any time, and if you don't have another drive then you can't recover. And a replacement drive wouldn't be cheap (new) or reliable (used).
The drives are relatively cheap. I picked up an LTO-5 drive for about $150. I don't care that it's used. At that price I can buy 3 of them so that I have spares. And the next gen tape drive will always read the previous gen tapes. So it's an easy upgrade path. I'll go to LTO-6 when those drives come down a little in price on the used market. Then LTO-7 if I'm even alive long enough to need that much storage. Realistically, I have about 30-40 years left in me and after that I don't really care what happens to my backups. They're encrypted, with parity files, so it's not unreasonable to think that these backups will last me the rest of my life.
While this is true, HDDs are much worse, because the probability that a HDD will die at any time and without warnings is many times higher than for any tape drive.
When a HDD dies, that is far worse than when a tape drive dies, because you not only lose some money, but you also lose data, which may be priceless, unless you have been careful to have backup copies.
While there are some companies that offer data recovery services for defective HDDs, for recent HDD models such services can be very expensive, comparable with the cost of a tape drive and much more expensive than the service of copying a good tape to a HDD, which can be done when it is not possible to buy a replacement drive immediately and the data is needed urgently.
> HDD will die at any time and without warnings is many times higher than for any tape drive
It's about the same most of the time, and an LTO drive dies if you have any amount of dust at all, while a HDD doesn't give a fuck
> but you also lose data, which may be priceless
No difference if you sat on the tape holding your only copy of your favourite porn.
And if you only have one copy of data then it doesn't really matter on which media it resides.
And oh, you CAN write the same data to two HDDs simultaneously with mirroring, or just by running two copy jobs to two separate hard drives, which would give you not only physical separation but logical separation as well. For you to do the same with tape, shell out another $3000.
Where do you get the idea that HDDs only retain data for 5 years? The physics I learned in college suggests they will retain magnetic domains for hundreds or even thousands of years.
Yes they can fail mechanically (is this what you mean?) but you don’t necessarily lose your data.
They will fail mostly mechanically, but also the very small magnetized bits of modern HDDs will flip spontaneously at normal temperature after a much shorter time than "hundreds or even thousands of years" (because the energy needed to flip a small bit is not large enough in comparison to the thermal fluctuations to make the flipping probability negligible).
Many of these bit flips, but not all, will be corrected when the sectors are read, due to the error-correcting codes that are used in HDDs.
This is not theory. I have stored data for several years on more than 60 HDDs of various capacities from both WD and Seagate, most of them being the more expensive models with extended warranty durations, but even so, only a few of the HDDs did not have any non-correctable error after several years. (Fortunately I was careful to use redundancy, so there was no data loss.)
Moreover, some of the biggest HDDs that are available now are no longer suitable for long term data storage, because in order to improve the performance they store metadata in a flash memory, which has a more limited data retention time.
After more than 5 years the complete loss of a HDD should be expected at any time, but even after 2 or 3 years a few non-correctable errors are probable.
When a HDD fails mechanically, one might pay a data recovery service, but that might have a price similar to a new HDD, so if you plan to not replace your HDDs often enough with the hope of using data recovery, it is pretty much certain that the cost will be much higher than replacing any HDD preemptively when its warranty expires.
I do archive work and have 20+ discs from the 2010 era. Mostly the first generation of PMR drives. I have never had any data degradation problems.
You can also find lots of YouTube videos of people spinning up drives from the 80s and 90s which still hold their data without problem.
More scientifically, the phenomenon you talk about is modeled by the Arrhenius equation (1), where the activation energy to flip a grain is given by KuV/KbT, where Ku is the anisotropy of the magnetic media, V is the volume of a grain, Kb is the Boltzmann constant, and T is temp in Kelvin.
HDD manufacturers engineer this ratio to be >60 (usually targeting 70-90 to be safe). Media manufacturing is imperfect, so there is a log normal distribution of grains on real-world media, but if we assume that 60 is the energy barrier for all grains, a KuV/KbT of 60 would mean a relaxation time of about 362 million years (roughly 250 million years for half the grains to flip), assuming an attempt frequency of 10^10 Hz.
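The estimate above can be reproduced with the Néel-Arrhenius relation, tau = exp(KuV/KbT) / f0. The sketch below uses the ratio of 60 and the attempt frequency of 10^10 Hz from the comment; the function and variable names are illustrative. With these inputs, 362 million years comes out as the relaxation time tau (the point at which about 63% of grains have flipped), and the time for half the grains to flip is tau * ln 2.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def relaxation_time_years(stability_ratio, attempt_frequency_hz=1e10):
    """Neel-Arrhenius relaxation time: tau = exp(KuV / KbT) / f0, in years."""
    tau_seconds = math.exp(stability_ratio) / attempt_frequency_hz
    return tau_seconds / SECONDS_PER_YEAR


tau = relaxation_time_years(60)
half_life = tau * math.log(2)  # time for half of the grains to flip

print(f"relaxation time: {tau:.3g} years")
print(f"half-life:       {half_life:.3g} years")
```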
Your math is probably right, but a modern HDD has more than 2^19 data bits.
Assuming that your computed time is right, that means that there is a 50% probability that one bit of a HDD will flip after less than a week.
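A rough sketch of this reasoning, under a strong simplification: every bit is treated as one independent unit with the same relaxation time tau, so the chance of at least one flip among N units by time t is 1 - exp(-N*t/tau). The 10 TB drive size (8×10^13 bits) is an assumption chosen for illustration, not a figure from the thread, and the names are hypothetical.

```python
import math

WEEK_SECONDS = 7 * 24 * 3600


def time_to_first_flip(tau_seconds, n_units, probability=0.5):
    """Time at which P(at least one of n_units has flipped) reaches probability."""
    return -tau_seconds * math.log(1.0 - probability) / n_units


tau_seconds = math.exp(60) / 1e10   # barrier ratio 60, attempt frequency 1e10 Hz
n_bits = 8e13                       # assumed: a 10 TB drive

t_half = time_to_first_flip(tau_seconds, n_bits)
# Number of independent units needed for the 50% point to fall within a week.
units_for_one_week = tau_seconds * math.log(2) / WEEK_SECONDS

print(f"50% chance of a first flip after {t_half:.0f} s")
print(f"units needed for one week: {units_for_one_week:.2e}")
```

Under this model the 50% point falls within a week once there are more than roughly 10^10 independent units, which any multi-terabyte drive exceeds.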
Most such bit errors will be corrected when a sector is read and the controller will rewrite a bad sector with a valid value, so the bit errors will not be cumulative in normal usage.
However, when the data is stored for years without powering up the HDD, the bit flips will accumulate and they may pass the threshold needed to cause a non-correctable error.
While I do not remember ever having seen non-correctable errors on the HDDs that I have been using daily, on identical HDDs that have been stored for years without being powered up I have frequently seen both cases when the drive reported non-correctable errors and cases when the drive reported no error but the file hashes used for error detection identified corrupted files.
The older HDDs with low data capacities had much longer lifetimes, but also the perception of those claiming that data has been stored OK on them may be wrong if they have not used any means to detect the corrupted files, because even if the HDD reports no errors, that is not good enough.
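Detecting corruption that the drive does not report requires hashes recorded at write time. The following is a minimal sketch of that idea, not the poster's actual tooling: SHA-256 and the function names are assumptions, and the manifest would need to be stored separately from the drive being checked.

```python
import hashlib
from pathlib import Path


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_corrupted(root, manifest):
    """Return the relative paths that are missing or whose digest changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or hash_file(p) != expected:
            bad.append(rel)
    return bad
```

The manifest is built once when the archive is written, and `find_corrupted` is run against it whenever the drive is powered up again.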
One point of clarification: one bit on a classic PMR drive contains hundreds of magnetic grains. It is the grains that flip, not the bit. It would take many grain flips to affect the bit. Errors of this sort do not manifest as flipped bits per se—they manifest as a degraded signal, which the drive may or may not be able to translate to the correct bit sequence correctly. Also, the nature of ECC is (usually) that you get the correct sequence or an error. It would be unusual to get an incorrect sequence unless that is happening somewhere off-drive.
If you have a stored drive that is reporting errors, my starting assumption would be that something else is causing problems besides the platter—maybe the heads have gotten a bit of corrosion from humidity.
Because the HDD manufacturers avoid providing the information that would be necessary to estimate with any degree of certainty the data retention time for HDDs, we cannot know for sure the causes of HDD errors during long term storage, so we can only speculate about them.
Nevertheless, the experimental facts, both from my experience during many years with many HDDs and from the reports that I have read are:
1. Immediately after the warranty of a HDD expires, the probability of mechanical failure increases a lot. I have seen several cases of HDD failures a few months after the warranty expiration, while I have never seen a failure before that (on drives that had passed the initial acceptance tests after purchase; some drives have failed the initial tests and have been replaced by the vendor).
Therefore one should never plan to store data on HDDs beyond their warranty expiration.
2. When data is stored on HDDs that are powered down for several years, one should expect a few errors (I have seen e.g. about one error per 2 to 8 TB of data), which cause either non-correctable errors or wrong corrections that corrupt the data.
The effect of such errors can be easily mitigated by storing 2 copies of each data file on 2 different HDDs.
An alternative is to introduce a controlled data redundancy, e.g. of 5% or 10%, with a program like "par2create".
That works fine against wrongly corrected sectors, but when a non-correctable error is reported, many file copy programs fail to copy any good sector following a bad sector, so one may need to write a custom script that will seek through the corrupt file and copy the good sectors, in order to get enough data from which the original file can be reconstructed.
Storing everything on 2 HDDs, preferably of different models, is the safest method, as it also guards against the case when one HDD is completely lost due to a mechanical defect.
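The custom script mentioned above, which seeks through a corrupt file and copies the good sectors, might look like the sketch below. It is only an illustration under stated assumptions: the function names and the 4096-byte block size are hypothetical, and unreadable blocks are zero-filled so that the remaining data stays at its original offsets for a later repair attempt with the parity data.

```python
import os


def salvage(src, dst, total_size, block_size=4096):
    """Copy every readable block from src to dst, zero-filling unreadable ones.

    src and dst are binary file objects. Returns the offsets of the blocks
    that could not be read in full.
    """
    bad_offsets = []
    offset = 0
    while offset < total_size:
        want = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(want)
        except OSError:
            data = b""  # read error: treat the whole block as lost
        if len(data) < want:
            bad_offsets.append(offset)
            data = data.ljust(want, b"\0")  # keep later blocks aligned
        dst.seek(offset)
        dst.write(data)
        offset += want
    return bad_offsets


def salvage_file(src_path, dst_path, block_size=4096):
    """Salvage src_path into dst_path; returns the list of bad block offsets."""
    size = os.path.getsize(src_path)
    # buffering=0 avoids read-ahead, so an error affects only the block being read
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        return salvage(src, dst, size, block_size)
```

Matching the block size to the drive's physical sector size keeps the amount of good data discarded around each bad sector small.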
Many years ago, when LTO-7 was new and state-of-the-art, I bought a new tabletop tape drive (Quantum) for $3000, which is less than an Apple HMD, but much more useful for me.
After writing several hundred TB, I have achieved decent savings in comparison with using HDDs.
On the other hand, when someone needs to store only 100 TB or less, there is no chance to recover the cost of the tape drive, so tapes are inappropriate for such a case.
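The break-even point can be estimated from the prices quoted in this thread: a $3000 new drive or a $150 used one, tape at $5/TB, and HDDs at $15 to $20/TB at retail. This is a sketch only; the function name is illustrative, and it ignores HDD replacement, spare drives, and duplicate copies.

```python
def break_even_tb(drive_cost, hdd_cost_per_tb, tape_cost_per_tb):
    """Capacity (TB) at which the tape drive's cost is repaid by cheaper media."""
    saving_per_tb = hdd_cost_per_tb - tape_cost_per_tb
    if saving_per_tb <= 0:
        raise ValueError("tape media must be cheaper per TB than HDDs")
    return drive_cost / saving_per_tb


for drive_cost in (3000.0, 150.0):       # new LTO-7 drive vs. used LTO-5 drive
    for hdd_price in (15.0, 20.0):       # retail HDD prices quoted in the thread
        tb = break_even_tb(drive_cost, hdd_price, tape_cost_per_tb=5.0)
        print(f"drive ${drive_cost:.0f}, HDD ${hdd_price:.0f}/TB: {tb:.0f} TB")
```

With a $3000 drive the break-even lands at a few hundred TB, while a $150 used drive pays for itself after 10 to 15 TB.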
LTO-5 worked great for my use case and I find new/old-stock LTO5 tapes for about $5/TB on ebay. I verify every tape after writing, and I include about 25% to 30% parity data on each tape to combat bit rot. I don't really need to write more than 1TB at a time of data, and the rest of the room on the tape is parity data (PAR files).
The drive was super cheap, $150 used. I don't expect it to last forever, I plan to buy another tape drive as a backup and eventually upgrade to an LTO-6 drive.
After making 2 copies of all my important data, about 30TB on LTO-5 tape, I don't have that much to back up, maybe 2TB a month, but it's easy to justify buying a few more tapes every now and then. Buying 2 hard drives for redundancy is just not anywhere near as cheap as buying 2 LTO tapes for the same amount of data, even for people with less than 100TB of data.
With scanned books and Blu-ray movies, the TBs add up very quickly when you do not want to degrade their quality with additional compression. That is my case: after digitizing or ripping them I do not keep the originals, because I do not have enough space for them, so I want the retained copies to have the highest possible quality, which leads to multi-GB file sizes.