I understand the rationale of having a relatively simple filesystem specification so that it can be implemented under the constraints of embedded systems, but using a filesystem that has no journal or checksums on the cheapest and flakiest of storage devices (hi, bottom-of-the-barrel SD cards and USB memory sticks) seems a recipe for disaster in terms of data resiliency.
I use ZFS on SD cards and USB memory sticks when I am able to, and checksum errors are ... not rare. Even with high-quality NVMe SSDs on (lower-quality) USB-NVMe bridge controllers, there's corruption everywhere.
The real issue that the article only hints at is Windows support. Windows only supports a small handful of filesystems (FAT*, ExFAT, NTFS, ReFS, ISO9660 & UDF - I think that's it!). So if you have to interoperate with Windows then you have to use one of those filesystems.
If you don't need to interoperate with Windows then you have many better choices.
> Windows then you have to use one of those filesystems.
I can appreciate that's true if the client systems are not under your control, but if you have influence over all systems that will touch the drives, I've had good experiences with OpenZFS, which has drivers for all 3 major OSes: Linux, macOS, and Windows.
One just needs to be cautious not to use bleeding-edge features, because the ZFS release cycles of the different ports differ, but otherwise it has done fine for me so far.
The system with Windows is more likely to be a company-owned computer, or a computer owned by some friend, acquaintance or colleague, on which you cannot install file system drivers, so exFAT is the best bet for compatibility.
If it is your own computer, then you either are a Windows user or you prefer other operating systems, in which case you would wipe Windows and install something else. In both cases you do not need to transfer data between operating systems, so on external drives you can use NTFS for Windows, or XFS, ZFS, etc. for Linux.
> If it is your own computer, then you either are a Windows user or you prefer other operating systems
That fits me, if it's inclusive or. Windows is a pretty OK desktop environment, and mostly gets the better tested mass market software. On the other hand, I prefer other operating systems for doing "real work". I don't usually need to transfer data via external storage though, but if I did, I might consider installing a filesystem driver/tool on my Windows machines to have a better time of it. OTOH, I might just use par2 or something on top of a readily available filesystem.
> Windows only supports a small handful of filesystems ( [...] If you don't need to interoperate with Windows then you have many better choices
Are the filesystem choices more varied if you need to interoperate with macOS?
Whenever my dad brings me one of his FAT32 USB drives with "issues", I find a bunch of hidden dot files and folders that were not there before and realize it's been plugged in a Mac.
I really would like to know what MacOS does to USB flash drives.
The dotfiles are for metadata that the non-Mac filesystems can't store. Coloured labels, icon position, longer filenames, creator info etc.
Windows actually does almost the same thing. Thumbs.db etc. And it also creates shadow files for those with metadata when saving to FAT32. You just don't see them on windows as they are hidden, just like dotfiles are on macOS.
The last version of windows that made thumbs.db was XP.
And what metadata do you mean? Permissions and alternate streams get purged, don't they? What else is there? You have to go out of your way to get a desktop.ini and those are just as obtrusive on NTFS.
Long file names are stored in a weird way but that's very standardized and doesn't make extra files.
On NTFS no, but Win 10 still makes shadow files on filesystems without metadata, like exFAT. Not sure about 11 actually as I don't use it much.
I forget the name but I believe it's the filename prepended or appended with an underscore.
PS: On Mac you can turn it off with a simple defaults write command. Of course it also means that metadata changes won't be persisted to such files. Not sure about Windows.
Are you sure? What metadata specifically? I've never seen anything like this from Windows.
And I think you misread my NTFS mention. That was only in the context of desktop.ini, and desktop.ini only happens if you go through a specific obscure menu.
But I can tell you that prepending filenames with ._ is a Mac thing.
There's still the "System Volume Information" folder. Whatever that is. (I use Linux, but see it on USB drives that were plugged into Windows machines.)
Actually, FAT filesystems with two FAT copies can be implemented in a transaction-safe way: 1) write the allocations to the first FAT, 2) write the file data, 3) write the second FAT. This is the same as a journal that stores metadata only (which is what most other journaled filesystems do).
This is probably what Microsoft calls "TexFAT", but doesn't describe in any way.
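In toy code, the ordering from above looks something like this. It's only a sketch: the offsets and the device handle are hypothetical rather than parsed from a real exFAT boot region, and a real driver would also need write barriers between the steps so the writes actually reach the medium in order.

    # Toy sketch of the two-FAT "transaction" described above.
    FAT1_OFFSET = 0x1000   # hypothetical location of the first FAT copy
    FAT2_OFFSET = 0x5000   # hypothetical location of the second FAT copy

    def append_cluster(dev, entry_offset: int, fat_entry: bytes,
                       data_offset: int, file_data: bytes) -> None:
        # 1) record the new allocation in the first FAT copy
        dev.seek(FAT1_OFFSET + entry_offset)
        dev.write(fat_entry)
        # 2) write the file data itself
        dev.seek(data_offset)
        dev.write(file_data)
        # 3) "commit" by mirroring the entry into the second FAT copy; a crash
        #    before this step leaves the second FAT describing the old,
        #    still-consistent state (metadata-only journal semantics)
        dev.seek(FAT2_OFFSET + entry_offset)
        dev.write(fat_entry)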
Data that was written correctly (not corrupted while being written) will last a long time, probably longer than on any magnetic media, so data resiliency is good.
All flash memory devices are wall-to-wall error correction. It's basically math on a stick. The raw media error rates are 1% to 10% which are crazy high. That flash works at all is a miracle of linear algebra.
Flash sticks are like $2 wholesale parts. The packaging probably costs more than the product in many cases.
Every expense has been spared - I wouldn’t trust these things at all unless you are using one of the more expensive SSD devices or something certified for Windows ToGo, etc.
I've had many exfat devices become corrupt. I've never had an NTFS volume become corrupt without the physical drive failing. Perhaps that has more to do with the reliability of flash drives vs flash memory cards, but there are several software projects that allow for both FAT32 and exFAT filesystems that warn users corruption is a risk if they choose exFAT, particularly when using functions that do a large amount of IOPS over an extended period of time.
I haven't experienced any failures with exFAT yet, but I had lots of failures with FAT16, FAT32 and NTFS. The FATs were fixable and most data recoverable. NTFS - not at all - everything lost.
I think that implementation bugs are the cause for most errors. This is not a popular filesystem used in datacenters, so testing would be minimal by comparison.
Is that only using Microsoft's tooling, or did you try ntfs-3g from a Linux boot stick? I am grateful that I've never tried to recover borked NTFS, but my mental model is that those utilities have explored more "off the beaten path" than Microsoft may want to bother with. And, in the worst(?) case, you can patch the utilities to do any kind of extra debugging you may need.
This is not my experience. I've had chkdsk repair a corrupted NTFS just fine many times. Even when I mistakenly overwrote part of the partition (ran a read/write badblocks scan), testdisk could recover the data.
I tried using Tuxera NTFS on Mac right after I got my M1 MacBook Air, but found it to be very slow. I need to share media between Windows and Mac...and now I feel nervous about ExFat. I think there's a third party Apple file system driver for Windows (Paragon?) so maybe I'll look at that as well.
The absurdity is that often it's simplest to share via network for Windows and Mac. Paragon works pretty well but you need to have control of the local system.
Reading the spec (and I read it a few months before reading this article), I think that based on the specification alone it's actually a good-enough FS (it's not great, but it's good enough). But in practice it's actually hard to write a reliable driver, especially if you don't have any experience with the FAT family. It also has a problem of code maturity: FAT32 code (and even NTFS code!) in different systems is battle-tested and not just spec-compatible but also bug-compatible, unlike exFAT, which realistically was only released in 2019.
P.S.: it still has that Y2100 hard limit on dates; I can't believe that MSFT hasn't solved that one, especially since it was created in 2001.
Historical curiosity: during (MS-)DOS times, having two FAT copies at the time could be very helpful. Malware that manipulated FATs was sometimes incorrectly implemented (for example, assuming that the FAT was either 12 or 16 bits), and in some cases, damage to the FAT could be recovered by restoring the second copy.
the point in TFA is that since the FATs aren't checksummed, determining which of the two is corrupted is a non-trivial task.
If you have a corrupt FAT, all you have is two sets of bytes that don't match... which one do you use?
The FAT data determines the chain of clusters that form a file. Given the number of clusters in the chain, the file size can be approximated to within one cluster. If that approximation matches the file size in the directory entry, rounded up to the next full cluster, then the FAT data is consistent.
Also, most FAT corruptions will cross-link together different chains of different files. That's a very good indication that the FAT data is wrong.
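Sketched out, the size check looks something like this (toy Python; the FAT is modeled as a plain list of cluster numbers and the end-of-chain marker is simplified, so it's not real driver code):

    END_OF_CHAIN = 0xFFFFFFFF   # simplified; real exFAT uses a reserved range

    def chain_clusters(fat: list[int], first_cluster: int) -> int | None:
        """Walk one FAT copy's chain; return its length, or None if it loops
        or runs off the table (a strong hint this copy is the corrupted one)."""
        seen = set()
        cluster = first_cluster
        while cluster != END_OF_CHAIN:
            if cluster in seen or not (0 <= cluster < len(fat)):
                return None
            seen.add(cluster)
            cluster = fat[cluster]
        return len(seen)

    def fat_copy_consistent(fat: list[int], first_cluster: int,
                            file_size: int, cluster_size: int) -> bool:
        """Chain length must match the directory entry's file size
        rounded up to whole clusters (the heuristic described above)."""
        n = chain_clusters(fat, first_cluster)
        expected = (file_size + cluster_size - 1) // cluster_size
        return n is not None and n == expected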
In practice, you (the program) knew which one was corrupted: usually the malware didn't replace the header with a syntactically valid FAT, it was simply unreadable.
I'm surprised that no file system has made it into mainstream which has error-correction capabilities (without RAID) such as Reed-Solomon or Fountain Codes.
Also, I'd like to see file systems with cryptographic hashes for protection instead of checksums.
Surely these would be extremely useful for long-term storage formats?
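To make the "built on top rather than in the filesystem" idea concrete, here is the smallest possible version of it: a single XOR parity block that can rebuild one lost block. This is emphatically not Reed-Solomon or a fountain code (real tools like par2 or zfec use proper erasure codes and survive multiple losses), just a sketch of application-level redundancy, and it assumes equal-sized blocks.

    def make_parity(blocks: list[bytes]) -> bytes:
        """XOR all data blocks together into one parity block."""
        parity = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                parity[i] ^= byte
        return bytes(parity)

    def recover_one(blocks: list[bytes | None], parity: bytes) -> bytes:
        """Rebuild the single missing block (marked None) from the others."""
        rebuilt = bytearray(parity)
        for block in blocks:
            if block is not None:
                for i, byte in enumerate(block):
                    rebuilt[i] ^= byte
        return bytes(rebuilt)

    blocks = [b"aaaa", b"bbbb", b"cccc"]
    parity = make_parity(blocks)
    assert recover_one([b"aaaa", None, b"cccc"], parity) == b"bbbb"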
My impression (not based on anything firm) is that basic reasonable-length CRCs are extremely unlikely to have collisions, unless someone is deliberately trying to create collisions. If I were looking for insurance I'd turn to one of the fast 128-bit non-cryptographic hashes.
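Both are a couple of lines from the Python standard library, e.g. (the filename is hypothetical, and blake2b is cryptographic-grade rather than one of the very fastest non-crypto 128-bit hashes like xxHash, but it's built in and fast enough for this purpose):

    import hashlib
    import zlib

    data = open("archive.tar", "rb").read()   # hypothetical file

    crc = zlib.crc32(data)                                      # 32-bit CRC
    digest = hashlib.blake2b(data, digest_size=16).hexdigest()  # 128-bit digest

    print(f"crc32: {crc:08x}")
    print(f"blake2b-128: {digest}")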
I used to download movies and stuff from Usenet and I noticed that there were a lot of errors in the files even though TCP/IP uses checksums. I assumed this came from people downloading and uploading the files repeatedly and errors creeping in.
That's why these days I always use BitTorrent, because it uses cryptographic hashes (albeit the outdated SHA-1) and I'm therefore guaranteed that every bit is identical to the source.
The problem isn't the checksum algorithm, it's that TCP/IP only protects packets in-flight. The checksums aren't kept on-disk, so errors can creep in after they've made it across the network.
Well, and also TCP/IP checksums are small enough that you'll relatively frequently see errors sneak through them. But that's not likely to be what you were noticing.
The TCP checksum is especially weak even for its tiny size. And many devices that are built to alter packets recalculate the checksum even when they're not altering anything -- so if packets get corrupted in the memory or bus of these devices, their checksums will get fixed. :(
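To illustrate just how weak: it's a 16-bit ones'-complement sum (RFC 1071 style), so it's order-insensitive; swap two 16-bit words anywhere in the data and the checksum doesn't change. A rough sketch:

    def internet_checksum(data: bytes) -> int:
        """16-bit ones'-complement sum, RFC 1071 style (the TCP/UDP/IP checksum)."""
        if len(data) % 2:
            data += b"\x00"
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]
            total = (total & 0xFFFF) + (total >> 16)   # fold carries back in
        return ~total & 0xFFFF

    # Swapping two 16-bit words changes the payload but not the checksum:
    pkt1 = b"\x12\x34\xab\xcd\x00\x01"
    pkt2 = b"\xab\xcd\x12\x34\x00\x01"
    assert pkt1 != pkt2
    assert internet_checksum(pkt1) == internet_checksum(pkt2)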
Back in the Windows 98 days, I was transferring my MP3 collection over my network to my other PC. For some reason on this day, my network hub started showing massive collisions (there were only 2 computers plugged into it). I figured that if there were network errors, it would retry and grab the bad parts of the file again. Nope. Corrupted every single MP3 that I had collected. Every song had transferred, but they all ended up with loud pops and clicks and buzzing. Still not sure why that happened to this day.
I think you were referring to this part with "errors creeping in", such as the Usenet storage servers probably using cheap/old drives & controllers with bit-rot & bit-flips before it got to the TCP/IP stack.
Probably also using cheap/old network equipment too that would bit-flip between when the TCP/IP checksum was recalculated. And like you said, combined with the re-publishing of corrupted files if they didn't bother to par2 correct them first (ie., human errors creeping in :)
I very much agree that we should do better, and it's baffling that we're not.
> Surely these would be extremely useful for long-term storage formats?
I suspect most people... well, most people just live with bit-rot, but the second best option is to use dumb storage and just stick a *.sha256 and maybe some par2 files next to the original data file on whatever storage you're using. Maybe even a detached GPG signature, if you care about that.
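Roughly, the dumb-but-works version of that in Python (the sidecar name and the two-space `sha256sum`-style line format are just conventions; par2 is still what you want if you need to actually repair anything):

    import hashlib
    import pathlib

    def write_sidecar(path: str) -> None:
        """Drop a <name>.sha256 file next to the original, sha256sum-style.
        (Reads the whole file into memory; chunk it for big files.)"""
        p = pathlib.Path(path)
        digest = hashlib.sha256(p.read_bytes()).hexdigest()
        pathlib.Path(str(p) + ".sha256").write_text(f"{digest}  {p.name}\n")

    def verify_sidecar(path: str) -> bool:
        """Recompute the hash and compare it against the stored sidecar."""
        p = pathlib.Path(path)
        recorded = pathlib.Path(str(p) + ".sha256").read_text().split()[0]
        return hashlib.sha256(p.read_bytes()).hexdigest() == recorded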
Another major flaw of exFat is that TRIM doesn’t seem to be supported across all OSes. I had a WD Passport SSD slow to a crawl because of that. Formatting it to NTFS and connecting it to Windows solved the issue.
I just experienced an interesting exFAT issue yesterday. Apparently it has a 2-second resolution on mtime. I got a bug report, so I formatted a new partition with exFAT in Windows 11, and it was unable to set the modified timestamp to an odd number of seconds. Or rather, I got no error when setting it, but it got rounded up if it was odd. The wiki entry for exFAT says it has 10 ms resolution, up from 2 s in FAT, but that didn't seem to be the case for me. I wondered whether it was really FAT with Windows just lying to me, but I couldn't confirm that was the case.
Looking at the specification[1], it seems ExFAT uses an old style 2 second granularity FAT timestamp, but with an extra separate single byte field to give 0-199 * 10ms extra for fine(r) grained time stamps. It's possible Windows just ignores the high resolution field? Bit odd for Microsoft's own implementation to ignore their own enhancements though.
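As I read it, the layout is the classic DOS-packed date/time plus that extra byte; decoding it looks roughly like this (a sketch of my reading of the spec, ignoring the separate UTC-offset byte):

    from datetime import datetime, timedelta

    def decode_exfat_timestamp(ts: int, increment_10ms: int = 0) -> datetime:
        """Decode the 32-bit Timestamp field plus its 10msIncrement byte (0-199)."""
        double_seconds = ts & 0x1F            # 0-29 -> 2-second granularity
        minute = (ts >> 5) & 0x3F
        hour = (ts >> 11) & 0x1F
        day = (ts >> 16) & 0x1F
        month = (ts >> 21) & 0x0F
        year = 1980 + ((ts >> 25) & 0x7F)
        base = datetime(year, month, day, hour, minute, double_seconds * 2)
        # If an implementation never fills in 10msIncrement, every timestamp
        # lands on an even second -- which matches the rounding described above.
        return base + timedelta(milliseconds=10 * increment_10ms)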
I do have to ask the idiot question that you are sure you formatted the drive as ExFAT and not FAT32?
Is it only the filesystem structure that gets corrupted, or the file contents too? How corrupted? Does a whole file get lost, is a whole sector missing, or are there bit flips?
> if the implementation follows the recommended write ordering of allocation bitmap → FAT → directory entry when creating a file and the reverse when deleting a file, then even interrupted writes do not pose an issue.
The problem is that with all the caching that occurs, writes do not happen in this straightforward way unless the cache is completely flushed at each point. Even getting caches flushed is not straightforward, because most drives cache writes and some ignore cache flushes to increase performance.
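Concretely, "flushed at each point" means running something like this between every step of that recommended ordering, and even then the last hop depends on the drive telling the truth (a sketch, assuming an ordinary writable file/device handle):

    import os

    def write_barrier(f) -> None:
        """Best-effort barrier between ordered metadata writes: flush the
        userspace buffer, then ask the kernel to push its page cache to the
        device. Some drives still ack the flush while holding the data in
        their own volatile cache, which is where the guarantees end."""
        f.flush()               # userspace buffer -> kernel page cache
        os.fsync(f.fileno())    # kernel page cache -> device (if honored)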
Anecdotally I have had zero issues with exFAT, and it is what I use for any storage device not intended to live on one machine permanently, because I can trust it to be readable on just about anything and FAT32's 4 GB file size limit is no longer a practical constraint.
Ideally there would be a better, universal file system but I believe it is the best we have at this point.
I haven’t had any issues with data corruption personally. Your mileage may vary.
I've had problems with exFAT on macOS where both files and disks become corrupt.
I can't nail the problem down exactly, but it appears to me that if you delete files [using the macOS FS drivers] then it may also overwrite other, random blocks on the disk.
I haven't used ExFAT on macOS regularly since the OS X days, but some old notes of mine say:
1. After each “write session” to an ExFAT volume, open Terminal and run “fsck_exfat -d disk<dnumber>s<snumber>”, where <dnumber> is the disk number and <snumber> is the partition number of your ExFAT volume.
2. Avoid deleting files on ExFAT volumes using a Mac!
If I am not mistaken, exFAT is intentionally dumb for ultra-easy implementation and interop. Namely, duplication, checksums, etc. have to be built by the application on top of the filesystem.