Why the Windows Registry sucks technically (2010) (rwmj.wordpress.com)
418 points by azalemeth on July 29, 2022 | 332 comments



I really appreciate when people use technical facts to criticize something, as here. It is a well-written takedown of the implementation of the Windows Registry, and it holds up as well today as it did in 2010.

I suspect that if the concept of the registry were invented today it would look more like a database (e.g. SQLite), although organizing it like a virtual filesystem does have a certain appeal, and unfortunately I don't know of a database engine that supports such a layout. The Registry's key-value pairs map 1:1 onto a database table's column-value pairs; it is the multiple tiers of organizing "folders" above them that are the difficult implementation detail.
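For what it's worth, the "folders above the key-value pairs" part is expressible in a plain relational schema. A toy sketch (not how any real registry hive is stored; all table and type names here are made up), modeling keys as a self-referencing table with typed values hanging off each key:

```python
import sqlite3

# Toy sketch only: one way a registry-like tree could sit in SQLite, with
# keys as a self-referencing (adjacency-list) table and typed values
# attached to each key. Names are invented for illustration.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE keys (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES keys(id),
    name      TEXT NOT NULL,
    UNIQUE (parent_id, name)
);
CREATE TABLE vals (
    key_id INTEGER NOT NULL REFERENCES keys(id),
    name   TEXT NOT NULL,
    type   TEXT NOT NULL,     -- e.g. 'REG_SZ', 'REG_DWORD'
    data   BLOB,
    UNIQUE (key_id, name)
);
""")

def open_key(path):
    """Walk a backslash-separated path, creating keys as needed."""
    parent = None
    for part in path.split("\\"):
        row = db.execute(
            "SELECT id FROM keys WHERE parent_id IS ? AND name = ?",
            (parent, part)).fetchone()
        parent = row[0] if row else db.execute(
            "INSERT INTO keys (parent_id, name) VALUES (?, ?)",
            (parent, part)).lastrowid
    return parent

k = open_key(r"HKEY_CURRENT_USER\Software\Example")
db.execute("INSERT INTO vals VALUES (?, 'Port', 'REG_DWORD', ?)", (k, 8080))
print(db.execute("SELECT data FROM vals WHERE key_id = ?", (k,)).fetchone()[0])
```

The adjacency list gives you the tree; everything else (transactions, typed columns, concurrent readers) comes with the database engine for free.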

The UNIX way is undeniably more flexible, since it isn't a virtual filesystem, it is just a filesystem. The problem is that everyone invented their own configuration format to store configuration data in /etc, and there's no format-agnostic API to access that information (you can open it, but can you understand it?).

Both UNIX and Windows suffer from the same orphan problem: information can be written, the application removed, and it is then unsafe ever to remove the leftovers, since you may not know the author and/or all consumers.


> The UNIX way is undeniably more flexible since it isn't a virtual filesystem, it is just a filesystem.

Unix simply doesn't have a "way" in that regard, other than a loose convention to put text files in "/etc". Every application comes up with its own format. Parsing that file is application-specific. Updating a value in a text file means rewriting the whole file. It is a technically inferior approach that has survived over time because text files are still text files.


> Unix simply doesn't have a "way" in that regard, other than a loose convention to put text files in "/etc".

So? The registry doesn't have much of a way either - the actual fields are simply a loose convention.

> Every application comes up with its own format.

Same with Windows applications - one application might store an IP address as a dotted-quad string, another might store it as a single 32-bit integer.
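To make the point concrete, here are the two conventions side by side in a hedged Python sketch (the apps are hypothetical, and the big-endian packing is itself just a convention - exactly the problem):

```python
import socket
import struct

# Hypothetical apps storing the same address two ways, as one Windows app
# might use REG_SZ and another REG_DWORD: a dotted-quad string vs. a
# single 32-bit integer (big-endian here).
ip_string = "192.168.0.1"
ip_int = struct.unpack(">I", socket.inet_aton(ip_string))[0]

print(ip_int)                                       # 3232235521
print(socket.inet_ntoa(struct.pack(">I", ip_int)))  # 192.168.0.1
```

Both encodings are self-consistent, but nothing in the registry's type system tells a third party which one a given value uses.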

> Parsing that file is application-specific.

Same for Windows applications; the subtree for any application is very specific to that application and almost always differs from the subtree for other applications.

> Updating a value in a text file means re-writing the whole file again.

Not a problem, when breaking a value in a text file breaks only that one application. Breaking the registry almost always breaks something else, if not the entire system.

I'm not saying files in /etc are without their problems; I'm saying that all of /etc's problems are already present in the registry, and the registry adds a few more of its own.


> the actual fields are simply a loose convention.

Never mind that some Windows programs end up just saving a blob in it and totally ignoring the typing.

Who cares about the field type if you can save everything in a single blob?


Even Microsoft likes to save blobs.

[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\BrowserEmulation\ClearableListData]\UserFilter


I can’t and shouldn’t be able to write to /etc as an ordinary or guest user.

Re-reading values usually isn't done by apps.

You can get corrupted files if two instances fight for the same file. If that’s your thing, please use mongo. Also, please provide support to all the users.

Lastly, if you mess up certain configuration or data files, your system won't boot either, on ANY system.

Sounds like you’re just one of many anti-windows people, exactly what was pointed out about people bashing the registry.


> I can’t and shouldn’t be able to write to /etc as an ordinary or guest user.

You're right, you shouldn't. If you are able to, your distro is very odd and I'd recommend seeking a new one.

> Sounds like you’re just one of many anti-windows people, exactly what was pointed out about people bashing the registry.

There's no need to attack people.


This is emphatically NOT true.

If you absolutely could not write to /etc, then you would never be able to change your password.

The passwd, chsh, chfn, and other utilities allow a non-privileged user to make controlled changes to privileged files via the setuid/setgid bits on those executables.

A user can trigger controlled writes to files in /etc.
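The mechanism is visible right in the file mode: passwd is classically installed mode 4755, and the setuid bit shows up as an 's' in the owner-execute slot. A small Python sketch decoding that classic mode (the exact path and mode on a given system may differ):

```python
import stat

# Mode 4755 on a regular file: the 's' in the owner-execute position is
# the setuid bit, which makes the kernel run the program with the file
# owner's effective uid (root, for passwd) at exec time. No special code
# in passwd is needed to "become" root.
mode = stat.S_IFREG | 0o4755
print(stat.filemode(mode))        # -rwsr-xr-x
print(bool(mode & stat.S_ISUID))  # True
```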


> If you absolutely could not write to /etc, then you would never be able to change your password.

Since we're talking about Linux, I would refrain from using absolute clauses like "never".

systemd-homed allows you to create "portable" user profiles which contain everything pertaining to a particular user, including his password. So user profiles (including your files, etc) can be moved between computers simply by rsync'ing that directory, or putting it on a network share.

`homectl passwd` (which changes a portable user's password) does not require writable /etc.

I think I've seen a few other solutions like that.

https://wiki.archlinux.org/title/Systemd-homed

https://systemd.io/HOME_DIRECTORY/


Users cannot write to /etc. Programs like passwd can, because they have the setuid bit set and run as root regardless of which user started the process.


It's more so that users can ask the administrator to write to `/etc/` for them.

The administrator in this case has an automated tool that handles this request, but the actual user the writing occurs under is the root user.


That's like saying you've broken RSA because you can cause controlled reads. All you have to do is send the encrypted message to the intended recipient and wait for them to decrypt it.


That's being a bit pedantic innit?


Maybe in OPs opinion, using `passwd` to write to `/etc/passwd` is not the same thing as using the registry API to write to the registry?

Personally, I feel it is the same thing.


> I can’t and shouldn’t be able to write to /etc as an ordinary or guest user

I don't believe you can. What distribution is this?


Point about /etc is that it’s just one part of “configuration”.

There’s no /etc for users. That’s why we get all these dot.dirs. At least some apps use ~/.config/


"$XDG_CONFIG_HOME defines the base directory relative to which user-specific configuration files should be stored. If $XDG_CONFIG_HOME is either not set or empty, a default equal to $HOME/.config should be used."

https://specifications.freedesktop.org/basedir-spec/basedir-...
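The fallback rule quoted above is a one-liner in practice. A minimal sketch (the spec's additional rule about ignoring relative paths is omitted here):

```python
import os

def xdg_config_home(environ=os.environ):
    """Resolve the config base dir per the quoted rule: use
    $XDG_CONFIG_HOME if set and non-empty, otherwise fall back to
    $HOME/.config."""
    value = environ.get("XDG_CONFIG_HOME", "")
    return value or os.path.join(os.path.expanduser("~"), ".config")

print(xdg_config_home({"XDG_CONFIG_HOME": "/tmp/conf"}))  # /tmp/conf
print(xdg_config_home({}))  # e.g. /home/you/.config
```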


.config/ and .local/ are the de facto /etc for users, but there is the set of XDG_ environment variables that are intended to fill that role.


Well, my git and ssh configs, caches and whatnot: almost everything sits directly in my home directory. It's even worse than "My Documents"


Git uses XDG_CONFIG_HOME https://git-scm.com/docs/git-config#Files - putting the config directly in your home is just a choice.


It doesn’t in macOS.

It’s not a choice. It’s the default. It’s like saying it’s a choice to run mongodb without authentication.

Lastly, why should I be using XDG desktop variables on something that doesn't have a desktop environment?


"My Documents" isn't the same as a home directory. The home directory in recent Windows versions is "C:\Users\CurrentUserName". Or better: %homedrive%%homepath%.


I'm pretty sure .local is the per-user /usr, and .config is /etc


You’re both correct; a fair number of applications put their configuration in ~/.local/share even though that wasn’t the intent.


> Updating a value in a text file means re-writing the whole file again. It is a technically-inferior approach that has survived in time because text files are still text files.

It always seems like registries are trying to solve three different problems: configuration, status and process shared state. Registries seem to work poorly compared to text files for configuration, and work somewhat well for some level of shared state. I think to compare functionality with Linux, you would have to compare /etc, .configs and the /proc filesystem.

As for technically inferior, the *ix approach seems to have made the right tradeoffs via the hand of Darwin rather than some brilliant engineering insight. The article did a good job of explaining what's wrong with Windows' implementation of the registry. What is right with the *ix approach is that configuration files are usually read on startup, and only written when configuration is changed (which should be infrequent). This works really well for server software, command-line utilities, and an awful lot of GUI software. For some GUI software, particularly where you have really complex feature sets, we need to save widget state (e.g. the default zoom level), and text files may be problematic for this (e.g. two instances of the app running); this is where the registry really shines.


The biggest issue with etc is that it's owned by root and therefore read only from the application's point of view. So, a great fit for sysadmin-managed configuration, not so great for applications that are GUI-configurable. You end up with a tiered approach of defaults in lib, and cascaded overrides in etc and var.


> The biggest issue with etc is that it's owned by root and therefore read only from the application's point of view. So, a great fit for sysadmin-managed configuration, not so great for applications that are GUI-configurable. You end up with a tiered approach of defaults in lib, and cascaded overrides in etc and var.

What do you mean? You are mixing multiple things; most of /var and /lib are also root-writable only (unless tied to a specific user). For GUI applications you generally end up with a system where /etc holds the system defaults and user configuration lives in $XDG_CONFIG_HOME, which defaults to $HOME/.config. This is the case for the vast majority (>90%) of GUI applications.

If we're talking state, that typically ends up in $HOME/.cache (if it should be temporary) or $HOME/.local/state (if it should be persistent). There are also XDG variables for these, but I forget them at the moment.

So can you elaborate what you mean with your perceived problem?


Most Linux GUI apps either litter your home folder with dotfiles and dotfolders (both disgusting), or they follow the XDG Base Directory specification and place configuration in ${XDG_CONFIG_HOME:-${HOME}/.config}


Most don't even know about XDG. The reason is that there's no central documentation or governing body for userland. By "Linux" we mean the kernel, and for the kernel developers that holds true. The anarchy of userland, meanwhile, is left to distributions and users to handle.


https://specifications.freedesktop.org/basedir-spec/basedir-...

freedesktop.org is that central documentation. Sure, there are applications that don't implement that spec, but it's not for lack of documentation; either the maintainers of those applications don't care, or the applications are older than the spec and moving configuration is not always trivial.


I don't mind dot files or dot folders, but I'm very open to better ideas. My app works on Mac, Linux, and Windows though. Any thoughts?


There are several libraries that handle directory for you in the appropriate OS-specific manner.

One Rust example being https://github.com/dirs-dev/dirs-rs

> The library provides the location of these directories by leveraging the mechanisms defined by

> the XDG base directory and the XDG user directory specifications on Linux and Redox

> the Known Folder API on Windows

> the Standard Directories guidelines on macOS
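A rough Python equivalent of what such libraries compute per platform (a simplified sketch; real libraries like dirs-rs handle more edge cases, and the Windows/macOS fallback paths here are my assumptions rather than any library's verbatim behavior):

```python
import os
import sys

def config_dir(app: str) -> str:
    """Per-OS user config directory, roughly in the spirit of the three
    conventions quoted above. Simplified for illustration."""
    if sys.platform == "win32":
        # Known Folder territory: Roaming AppData
        base = os.environ.get("APPDATA",
                              os.path.expanduser(r"~\AppData\Roaming"))
    elif sys.platform == "darwin":
        # macOS Standard Directories
        base = os.path.expanduser("~/Library/Application Support")
    else:
        # XDG base directory spec
        base = os.environ.get("XDG_CONFIG_HOME") or \
               os.path.expanduser("~/.config")
    return os.path.join(base, app)

print(config_dir("myapp"))
```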


It's a great thing about etc. If your application `foobar` wants to write something to /etc and doesn't have root privileges (which it probably shouldn't), create a subdirectory /etc/foobar with owner `foobar:foobar`, and write there all you want. Many applications do that and it works fine.


Strictly speaking /etc should be relatively static, and dynamic data should go into /var, so you (or the package installation) would create /var/lib/foobar with proper ownership.


> Strictly speaking /etc should be relatively static

I really wish that was true but nasty stuff like wpa_supplicant/, resolv.conf (now usually a systemd-resolved managed symlink), NetworkManager/ and X11/ lives there, very mutable and often constantly changing under the hood.


I have not seen any applications creating a user with the application's name. How can one avoid a naming conflict? Kate (a text editor) might be installed on a system used by Kate (a person). What I have noticed, however, is that there are daemons running as root that act as a "middle man" whenever an application running as a normal user needs some extra privilege.


Desktop applications no, because they must run as the user (let's leave containers and sandboxes alone.) Servers yes, for example PostgreSQL creates the postgres user and runs as such.


OpenWRT worked around this by creating UCI, its own configuration system that it uses to generate all package-specific configuration files: https://openwrt.org/docs/guide-user/base-system/uci

Configuration for all supported packages is stored in files under /etc/config in UCI's own text format, and OpenWRT's init scripts generate service-specific config files based on the UCI settings when the service is (re)started. For example, to configure Samba, users don't edit smb.conf directly, they set UCI settings like samba.workgroup, sambashare.path, etc.

There's a `uci` command-line utility to read and modify settings, and UCI forms the basis for the OpenWRT web interface.
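For a flavor of the format, here is a hypothetical /etc/config/samba in UCI syntax (the option names are illustrative approximations, not the exact schema; see the OpenWRT docs for the real one):

```
# /etc/config/samba -- illustrative only
config samba
	option workgroup 'WORKGROUP'

config sambashare
	option name 'media'
	option path '/mnt/media'
	option read_only 'yes'
```

On service restart, OpenWRT's init script would render these settings into a real smb.conf, so the vendor format never needs hand-editing.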

debconf does something similar for Debian packages, though I haven't seen any packages that use debconf to completely supplant vendor configuration formats the way UCI does. https://en.wikipedia.org/wiki/Debian_configuration_system


I hate this fanatical love of the UNIX way. I really hate that, as of today, Windows is the only non-UNIX OS. I truly believe this fact has set us back. No more exciting new OSes, only boring Unix.


Is that you, Dave Cutler?

But more seriously, what do you want from an OS? Cleaner design? Plan 9 comes to mind. QNX is very elegant, too. Some microkernel thing? Fuchsia, but some say it's inelegant and overdesigned right off the bat. Minix has some nice properties but somehow nobody but Intel seems to use it. Also, QNX again (but as the name suggests it's similar to Unix)... Distributed computing? I don't know, seems OK to run it mostly in user space.

My favorite alternatives are Plan 9, QNX and L4 (I know it's not an OS), of which QNX is the only one I have actually used. Shame about Fuchsia, I believe the negative opinions because I looked at some documentation and code before I read them and had a sort of "THAT is supposed to be Google's better OS?!" moment.


QNX got a lot of things right, mainly in the area of interprocess communication. Interprocess communication came late to Unix/Linux, and it shows. Microservices under QNX work much better than under Linux. Interprocess communication is fast and works like a synchronous function call. Hard real time actually works with QNX. But QNX had nothing new on the database side.


My concern isn't about Windows or Linux specifically, but that something is lost when the garden only grows two plants. In, say, 1980 there were several unique OS streams, each with its own quirks: UNIX, Pick, VMS, etc.

It seems strange that we are essentially trading on isotopes of OS thinking from about 2000 when Linux took the mantle of UNIX and Windows converged on the XP code.

Is this it? Is producing an OS now so incredibly expensive and hard that we are never going to try again, and just keep bolting things on to what we have? I feel a little sad at that, even if the reasoning is impeccable.


Maybe it converged, just like other industries? Say, cars or planes have not fundamentally changed in decades. But would there be other designs that would make them so much better that it would be worth changing? That's not clear to me...


There's no money to be made in making an OS. Back when I was a child people were literally waiting outside stores for a new Windows. Windows 11 is given away for free.


We'll always have TempleOS ! /ducks (RIP)


RIM (the BlackBerry people) bought QNX. Do you know if they have been good shepherds? Has that helped or disadvantaged QNX as a viable option in the project space it had?


I'm not well informed. AFAIK customers like Porsche and Ford (with the fairly successful Sync 3 infotainment system) are still using it. I do know that Dan Dodge, QNX founder, developer 1 and great guy (speaker of my all time favorite conference keynote), has left Blackberry. Rumor has it that he was unhappy with the new leadership and direction for QNX.


The people at NixOS and Guix are doing it differently. No longer is the configuration just state in files, scattered who knows where, that cannot be fully understood by anyone. Instead, everything is defined centrally and then neatly versioned and managed by the system. If you think the Unix way of throwing files in a directory sucks, check them out!


Unfortunately, Nix's documentation sucks. Plus, there's a steep learning curve to the Nix programming language. I don't understand why they couldn't just use an existing language instead of inventing one that is only usable within the Nix system.

Overall, really cool idea (dropped my jaw when I first saw it in action), but poorly implemented.


Wouldn't say there's a steep learning curve for the language itself, it's pretty easy to get a grasp around it imo. Here's a helpful page I used to quickly get familiar with the language: https://github.com/tazjin/nix-1p

What's rather messy about Nix is nixpkgs with its helper functions all over the place alongside pretty shallow / non-existent documentation (which is unrelated to the language). Thankfully they've started to work on that recently: https://discourse.nixos.org/t/documentation-team-flattening-...


I think it's all too easy to say they should have just used an existing language. The most obvious feature of the nix language is that it's lazy, because they want their enormous collection of configuration dictionaries computed on demand. There are no mainstream languages that are lazy.

Well, there's Haskell, and I expect any Haskell programmer to be able to pick up Nix very quickly.

Guix has gone with a strict language, Scheme, but I understand they have their own monadic DSL to cope with the peculiarities of following the Nix model. So again, not something that a mainstream language can do well, and even as a Schemer, you're going to have to learn their macro language.


This sounds incredible, assuming there's a clean way to pin versions/source of applications. It might make the fabled idea of a reproducible system possible.


They're already getting decent reproducibility of their curated package set, but some of us want to pin to a particular version of that set, especially when creating our own private packages. That's where I'm finding I really want pinning.

The latest release has included "flakes", promoted from an add-on to an opt-in feature and which, among other things, lets me pin my dependencies and the environment I need to build my private projects. I then don't have to worry about going out of sync with the main package set, and I can share private projects in a (hopefully) fully reproducible way, with people using different versions of nix.

https://xeiaso.net/blog/nix-flakes-1-2022-02-21


Guix makes a big deal out of being fully reproducible. It is possible to pin versions, you basically state "I want package X while the guix repository is at commit hash Y". Every dependency of the package will be built and made available at the appropriate version and everything will work out. I used this to roll back fprintd when it broke due to a broken dependency and it worked without problems.


I get where you're coming from, but I think the biggest pro for the "UNIX way" is that text on a filesystem is extremely accessible. You don't need a specialized tool to read and modify configuration; you just need a text editor. And while there's no standard for how the data is structured, it's usually pretty easy to figure out from context.

I think it's also really easy to underestimate all the tooling built around text. As soon as you try some other way, you lose out on version control, diffing, grepping, and a whole bunch of other general tools built for text.

I think the only way to get around this is to build generalized tools for working with binary data as a data structure. The problem with this approach is that now you need to maintain a database that describes every possible binary format, where text editors only really have to concern themselves with ASCII & unicode to be useful for most cases.


General Linux tools built for text, in a world where everything is plain text and the shell doesn't know how to navigate anything other than a file system. In PowerShell you can cd into HKEY_LOCAL_MACHINE as easily as any drive, and Select-String is perfectly capable of searching registry keys.

The baseline is only text and files on systems that don't have anything better to offer.


Isn't there FUSE for that? It can show the GConf registry as a file tree, allowing you to use tools like recursive diff, etc. https://metacpan.org/release/LSIM/GConf-FS-0.01

Something flexible like FUSE is sorely missing on Windows; the stuff Google Drive had to do, creating a new drive letter, is pretty gnarly.



Interesting, I hadn't seen that. Did Google Drive go with the virtual drive letter thing mainly for backwards compatibility with older Windows?


> The baseline is only text and files on systems that don't have anything better to offer.

Also pretty much all external programs on Windows. Anything that isn't itself implemented as a PowerShell cmdlet has this problem. And it's all shell-specific.


Not really, if it’s a .NET assembly Powershell can dynamically load, call and use types from it. Not to mention first-class support for standard data interchange formats like JSON, XML etc without having to convert back/forth from text or use tools to process it.


In a few situations, actually calling .NET code from PowerShell doesn't suck, like setting persistent environment variables.

But in tons of situations actually doing anything that way involves a ton of boilerplate, and it's very clunky. It's acceptable for automation, but it's not something that feels very good to do in an interactive shell.

Stuff like ConvertFrom-Json is nice, but it's not as nice as a whole ecosystem where the interchange format for external programs is sort of just 'understood'. It's also not streaming: it reads in a single JSON object as a string. So you need some extra boilerplate if the expectation is that you're piping output from a command that yields a stream of data.

PowerShell is great, but an environment that could stream objects in some other format between different programs, regardless of the languages they're in or the runtimes they use, would be better and closer to a real version of 'Unix pipelines but with objects'.


Windows and macOS dominate because 99.9% of users do not care about the difference between registries, file systems, text or binary tools, UNIX-like, POSIX, etc… none of these things matter to them! They just want to use a computer and get on with their life.

If you lament this as a programmer, build applications that only work on your OS of choice, make them so good or take a dependency on a feature not available with macOS or Windows so that it can’t be ported, and then you can show the world the light of the UNIX mentality or whatever.


> Windows and macOS dominate because 99.9% of users do not care about the difference between registries, file systems, text or binary tools, UNIX-like, POSIX, etc… none of these things matter to them! They just want to use a computer and get on with their life.

The flip side of this argument is that the experience of using computers for that 99.9% of users sucks.

As software engineers, it's almost our responsibility to argue in favor of better platforms, because every day not spent fighting the platform is a day we could be building better experiences for our users. (Not to mention, every day the platform doesn't inadvertently mess up the user experience is a day of smaller support costs.)


The registry does not impact those users' experience. Needing to learn how to use a shell, does.


I would disagree: it does impact users' experience, because when the registry gets corrupted you have to start by installing Windows from scratch. Nothing like a Windows 10 update disabling registry backups to save space on low-capacity storage, and then finding out after the fact when attempting to recover Windows 10.

Secondly, with device drivers being tied to the registry, there is no simple system upgrade of taking the hard drive out and placing it in a new computer.

Just not everyday occurrences for most but still exist.


Windows these days works extremely well with just taking out the hard drive and placing it into a different computer, even across different processor brands, completely different hardware, etc.

I personally do this all the time. At most you get an extra restart the first time the drive is in a new set of hardware and after that you're good to go.

My main Windows install is probably 8 years old at this point, it's gone between multiple motherboards.

The biggest annoyance is a few pieces of software that tie activation to the motherboard.

If you want to make it even more portable you can install Windows to a vmdk.

And if you want to get especially complicated, you can have that vmdk act as essentially a secondary variant of your main operating system, complete with symlinks for most of the files and application info, both to save space and so you have most of the same state across both OSes, while still being able to play around and easily roll back any changes.


The registry does not get corrupt. This is not a thing that happens without the system files themselves being corrupted. People blame any old problem on the registry the way they blame any old problem on /etc. Nor, in the case of an actual registry problem, do you have to reinstall Windows from scratch; restoring the registry from backup is the entire point of system restore points, which are created on every update and most program installs. As a bonus, yes, I did upgrade my computer that way. This stuff is pure superstition.


I've reinstalled Linux due to borked sound on update way more times than I reinstalled Windows due to corrupt registry.

And cumulatively I spent significantly less time on Linux than I did on Windows.


I'm not saying the UNIX philosophy is the "best" way, I'm just stating why it's valuable.

Of course most people don't care about those things, but as a developer, the main issue I see with Windows and macOS, is that they build specialized interfaces that lock you into certain ways of doing things, which may be convenient, but are difficult to migrate away from, and a pain to automate and reproduce.

I really value that the UNIX philosophy is geared more towards building simple tools that are designed to be combined with other simple tools to solve a more complex problem, and that it doesn't try to lock you into using any particular tool to solve a problem.

So no, I may not provide binaries for Windows & Mac for personal projects, but I'm not going to be openly hostile towards people who want to build things from source or contribute fixes for Windows & macOS.


What good does having a non-UNIX OS do if it insists on doing everything worse than UNIX?

The bet on "everything is an object" was an honest and competent one, but unfortunately it didn't work. "Everything is a file" works better in practice. Unfortunately, that was the last real attempt on Windows to improve things.

Yes, the lack of diversity in OSes is bad. But Windows doesn't fix it. (Android and iOS were the last large attempt of innovation there, and implemented some really good things, but it's also useless to have the OSes completely managed by large corporations that antagonize both their customers and society as a whole.)


> The bet on "everything is an object" was an honest and competent one, but unfortunately it didn't work. "Everything is a file" works better in practice.

With the right interface, the object approach can be more pleasant to use. Administration via PowerShell is way more consistent than in Bash.


In what way did it "not work"?

Unless you meant OS/2? My understanding is the original intent there was very much to make everything an object, and the UI fully composable based on that (in the sense of being able to link things together semi-arbitrarily, like pipes; only they never got that far).

But that's based on a half remembered article from years ago, I may have that completely wrong.

Agree with the other post that PowerShell's object view of the environment is pretty nice - not always a panacea of course, but mostly highly functional and productive


The problem is that UNIX, while it reeks, is good enough that most people don't care about it.

Plan 9 was supposed to be the better-designed UNIX all around, and it did not even dent it. It wasn't enough of an improvement over UNIX to displace it.

http://www.catb.org/esr/writings/taoup/html/plan9.html

> We know what Unix's future used to look like. It was designed by the research group at Bell Labs that built Unix and called ‘Plan 9 from Bell Labs’.[154] Plan 9 was an attempt to do Unix over again, better.

> The long view of history may tell a different story, but in 2003 it looks like Plan 9 failed simply because it fell short of being a compelling enough improvement on Unix to displace its ancestor. Compared to Plan 9, Unix creaks and clanks and has obvious rust spots, but it gets the job done well enough to hold its position. There is a lesson here for ambitious system architects: the most dangerous enemy of a better solution is an existing codebase that is just good enough.


I think also that BSD (and later Linux) made UNIX free, and I assume Plan 9 was not.

UNIX was also written to run on minimal hardware. Even if Multics had been open-sourced, UNIX would still have won.


UNIX has been free since V6; that is why it got adopted by everyone, and then AT&T sued Berkeley when they got the opportunity to actually be allowed to charge for it.


I agree on this. There may be better ways to do things, but people are so sure that UNIX perfected everything 50 years ago that the alternatives don't get explored. Personally I'm not convinced that plain text is naturally better than other options, because what even constitutes plain text? We have ways of encoding our languages into bits on a disk, but without a decoder it really doesn't have any meaning. So how is that different from any other way you can encode data? Isn't the important thing that we have tools that make it easy and consistent to work with the data?


Actually there are lots of exciting non-Unix OSes.

https://distrowatch.com/dwres.php?resource=links

It's just that they don't have market share or a raison d'être for widespread adoption.


Be and Morph are worth a look.


It's interesting though that "boring UNIX" has survived, while most non-UNIX operating systems have gone the way of the Dodo ;)


I'd complement it: the fact "that as of today Windows is the only non UNIX OS" AND "this fanatical love of the UNIX way" sets us back.


TempleOS is radically different from either Windows or Unixy OSes.

I would actually not even compare it to them, and just say it's flat out Radical.


Modern Linux DEs have an arguably worse version by not only having traditional `/etc` for some types of configurations but _also_ having a "registry" that's really similar to the Windows one for GUI apps... (re: gconf for Gnome)


"Most" here means the one DE that had a Windows fan rewrite its entire config engine with the only goal of being like Windows. An action that created quite a lot of problems soon after the change.


Is Gconf any different from OSX defaults?


To be fair, Microsoft doesn't rely on the registry for complex configurations either, even IIS uses .config files.


You can’t look at HKEY_CLASSES_ROOT and claim with a straight face it’s not complex, and it’s a central piece to how a lot of things work in Windows.


HKCR has a lot of entries, but the data underneath those entries isn't particularly complex.

Its only really big misstep (apart from the fundamental issues with the registry in general) is that file associations and COM registrations are all intermingled in the same namespace. But those can be and are interconnected (see how the Office file types handle their registrations, for example), so it's kind of understandable how they ended up where they did.


The semantics of those entries and their possible subkeys is exceedingly complex. It started out reasonably simple, but features upon features were added with each Windows release, in addition to the application-specific behaviors.


> even IIS uses .config files.

Except that is for serving from a single file share or a replicated site across a farm of IISes.

This has nothing to do with 'complex configurations'.


Is there any "way" at all that doesn't suck though? Not having "a way" in this sense is more like a feature, a single source of truth works well if it really is treated as such by all actors.

Configurations under unix are a mixture of stdin, config files, env variables and bespoke solutions.

Under Windows, you have all of the above and on top of that the registry.

Any attempt at a solution in this regard risks being a rerun of the notorious xkcd: 14 ways to specify configurations -> this should satisfy everyone -> 15 ways to specify configurations.


On a more philosophical level, the Unix convention is the "liberal" one, giving application writers more freedoms. Liberty enables more flexible applications and more chaos at the same time. A highly structured approach might be beneficial in the short run, but it might prove to be burdening beyond its due date.


Application writers are free to use which ever format they choose on Windows, as well. The Registry is just yet another option -- Microsoft does not force developers to use it. I don't see the Unix convention as the more "liberal" one, here. It typically has _one less_ option.


> The Registry is just yet another option

It's not just "another option". It's a system standard option that exists, and can be easily accessed programmatically. There's no equivalent in Unix, since there is no standard, and whatever is available will depend on the distribution.

"Here's a built in option, but do whatever" is very different than "do whatever!".


Do you know where Microsoft calls out the Registry as the current best practice in https://docs.microsoft.com/en-us/windows/apps/?


My point was that having something built into the system is very different than having a dependency-stricken free-for-all outside of the system. For this reason, especially for the majority of Windows' life, it was not just "another option", it was a "hey, this is built in" option.

As your link shows, Windows has homogeneity, with good config options built into the frameworks. Within those frameworks, those built-in config options can't be considered just "another option". There's no real homogeneity in *nix, so the only option is misc files in misc paths.


It's partly about where the burden is placed. Making it easy for developers sometimes makes it hard on users and admins, and sometimes it becomes a security issue because of the varied approaches. Not sure there is a right answer, especially in open source where it's hard to get devs.


It would actually be better if most of the data that applications store in the registry instead lived in simple text files in the AppData directory (which shouldn't be hidden by default). The registry should be restricted to Windows configuration data (like file types and their associated programs), and information that needs to be shared between installed applications.


That's been the recommendation since Windows 95. It's easy to suggest and hard to enforce.


> Unix simply doesn't have a "way" in that regard

Yes. Nothing is standardized in this area in UNIX/Linux-land. We do have the FHS, some XDG specifications, and the series of configuration options that were born with dconf, which resemble the Windows registry a bit.


> a loose convention to put text files in "/etc".

For user-specific configuration there's also ~, as with ~/.vimrc and ~/.bashrc. Windows has something similar in C:\Users\theuser\AppData.

edit And ~/.config/, as others have mentioned.


I'm a big fan of the macOS method, where there's an OS API to manipulate the user defaults (as they're called), but they are actually just stored as files in a standardized format (usually XML). You get the pros of standardization and the pros of user manipulation (like easily deleting the corrupted settings of some app by removing one or two files).
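For instance, Python's standard `plistlib` module can read and write that format directly. A small sketch (the app domain and keys here are made up):

```python
import plistlib

# A hypothetical app's user defaults, as macOS would store them in
# something like ~/Library/Preferences/com.example.MyApp.plist
prefs = {"ShowToolbar": True, "RecentFiles": ["a.txt", "b.txt"], "FontSize": 13}

# Serialize to the standard plist format (XML here; binary also exists)...
data = plistlib.dumps(prefs, fmt=plistlib.FMT_XML)

# ...and any plist-aware tool can read it back, no app-specific parser needed.
restored = plistlib.loads(data)
assert restored == prefs
```

Because the on-disk format is standardized, "delete the one settings file to reset the app" works without knowing anything about the app itself.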


Plasma/KDE has this with the kreadconfig5 and kwriteconfig5 commands, and it uses a simple INI-like format: https://userbase.kde.org/KDE_System_Administration/Configura...

GNOME has the gsettings command for this, which is the equivalent of the `defaults` command in macOS. Many guides still refer users to the `dconf` command, though, which is technically a lower-level tool but basically does the same thing for 99.999% of GNOME installations. (Users/distro-makers can actually choose the storage format on GNOME's case.)


At least that used to be the macOS method. Once Apple moved most of their apps to the cloud there are constantly open databases pushing settings updates from device to device. Things often start breaking these days if you try to fix things through direct file manipulation.

With luck you might be able to find the correct terminal commands to disable services and fingers crossed they don't automatically restart themselves before you are done editing.


> I suspect if the concept of the registry was created today it would look more like a database (e.g. Sqlite), although organizing it like a virtual FileSystem does have a certain appeal, and unfortunately I don't know of a database engine that supports such a layout.

(Open)LDAP uses a hierarchical structure and saves its data in a database:

* https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Databa...

* https://www.openldap.org/software/man.cgi?query=slapd-mdb


What UNIX way?

The religious text files scattered everywhere on the FOSS clones?

The adoption of registry ideas by GNOME on gconf?

The configuration databases used by HP-UX and Aix?

The plists used by NeXTSTEP and macOS?

The settings per app used on Android (technically UNIX-based via its Linux kernel)


Exactly this.

Unix is a bit of a mess, at least that's understandable because it's 'open'.

MS registry is a bit nutty, it should be refactored in some simple, clean way, which is apparently extremely difficult to do for big companies, especially those that consider 'backwards compatibility' to be 'everything'.


> especially those that consider 'backwards compatibility' to be 'everything'.

Thank god there are companies that still think about backwards compatibility and are not intent in breaking everything because there's a shiny new thing ... ahem ... "refactoring in some simple clean way"


Making things clear is not some kind of 'shiny new thing' that the kids are dancing to - it's rational.

'Backwards Focus Absolutism' means we live in a world of stupid cobwebs and terrible design. The Windows Registry is a hack, it was originally designed to do something mundane and simple. It just grew, like a virus.


> 'Backwards Focus Absolutism' means we live in a world of stupid cobwebs and terrible design

It also means that until ~Windows 7/10 you could run even win3.1-era apps. And you didn't run into situations like "any 32-bit apps no longer work" or "any apps not updated in more than a year will be removed" as we've seen with Apple.


Do you happen to know Linus' "first rule of kernel development"?

"If a change results in user programs breaking, it's a bug in the kernel. We never EVER blame the user programs. How hard can this be to understand?" - from his famous rant [1]

So Linux prioritizes backwards compat as well. It would be insane for any project relied on by half the computing world not to. MS just throwing out their registry and rewriting it would probably break a million programs and possibly break literal billions of people's computers in some way. Its a universal problem - the more dependants any piece of code has, the harder it is to change.

[1] https://lkml.org/lkml/2012/12/23/75


"It would be insane for any project relied on by half the computing world not to."

I think this is insane and narrow minded.

I suggest there should be 'eras' of version where things do change.

If they make OS changes, notify early, prepare basic materials, create 'auto-updaters' and allow a few years for change.

But it needs to be done.


Which is a reason no one has deep knowledge about all the ioctl variants out there on Linux, as new ones get introduced, sometimes to work around backwards compatibility guarantees.


And at least in unix you can quickly ripgrep through the usual home and /etc/ directories looking for something, whereas searching in regedit takes forever for some reason.



Ahh yes the big boy UNIXes that are on life support for the few remaining stragglers willing to pay massive support contracts on legacy systems.


You missed the "closed" UNIXes on my comment.


Doing that in a database is easy. You just write your keys in the "/root/path/subpath" format, and search for strings starting with some text. Most database engines have this kind of search heavily optimized. (And the ones that don't cost hundreds of thousands per core, so who cares?)

The one thing you lose by using a real database is that filesystems are only locally coherent, while databases try very hard to be globally coherent. If it has heavy access, the database way will lose performance much faster.
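A rough sketch of that approach with SQLite (table and key names invented for illustration), where enumerating a "folder" becomes a half-open range scan over the primary-key index:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE reg (path TEXT PRIMARY KEY, value)")
db.executemany("INSERT INTO reg VALUES (?, ?)", [
    ("/software/example/version", "1.2"),
    ("/software/example/install_dir", r"C:\Program Files\Example"),
    ("/software/other/version", "9.9"),
])

# Enumerating a "folder" is a prefix search: everything >= the prefix and
# < the prefix plus a maximal character, which the index serves as a range scan.
prefix = "/software/example/"
rows = db.execute(
    "SELECT path, value FROM reg WHERE path >= ? AND path < ?",
    (prefix, prefix + "\uffff"),
).fetchall()
assert len(rows) == 2
```

Displaying the result as a tree is then purely a UI concern, as the sibling comment says.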


Agreed, and displaying this like a tree is then a UI concern.

I think you'd get slightly more performance if you were to store "Folder" and "Key" separately -- either three fields or using a parent-child table. You'd be able to index "Folder", and it would make building the tree structure itself easier and more efficient, and looking up all the values in a specific tree simpler.


You could do the paths in reverse (root last) to speed up string comparisons, since the prefix almost always matches, which means more wasted comparison work the other way around.


The concept of a system-wide hierarchical key-value store is actually very useful in some environments. I had to develop one some time ago [1] for an embedded system, since it is super useful to have a single point of truth/configuration/state, if every application agrees on using it (which is the case in such systems, where every application is known in advance and developed in house)

[1] https://github.com/debevv/camellia


Xen (the hypervisor) had a thing called XenStore which was a system-wide key-value store with triggers: https://wiki.xenproject.org/wiki/XenStore


> and there's no format agnostic API to access that information (you can open it, but can you understand it?).

One of the critiques in this article is that, because you have to know/guess/assume what encoding is used for various strings, there isn’t one for the registry, either.

If so, both approaches suffer from that (but Unix a bit more because it tends to store multiple items under a file system ‘key’, while Windows programs rarely store multiple items under a single registry key)

I do wonder what RegEdit.exe does here. Does it infer encoding, have a long list of key-to-encoding mappings, or a combination of the two?


> I do wonder what RegEdit.exe does here. Does it infer encoding, have a long list of key-to-encoding mappings, or a combination of the two?

I did a bit of experimentation on this too and we think it has a heuristic to guess encodings of strings. (Which to be fair isn't a terrible idea - it's very easy and almost entirely reliable to determine if a string is ASCII/UTF-8 or UTF-16LE which are the major encodings found.)
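The guess is easy because mostly-Latin UTF-16LE text has a zero high byte at every other position, which valid UTF-8 never does for printable text. A toy version of that kind of heuristic (not what RegEdit actually does, just the general idea):

```python
def guess_encoding(data: bytes) -> str:
    """Crude guess between UTF-16LE and ASCII/UTF-8, registry-string style."""
    # Mostly-Latin UTF-16LE text has a NUL at nearly every odd offset.
    if len(data) >= 2 and data[1::2].count(0) > len(data) // 4:
        return "utf-16-le"
    try:
        data.decode("utf-8")   # plain ASCII also validates as UTF-8
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"

assert guess_encoding("hello".encode("utf-16-le")) == "utf-16-le"
assert guess_encoding(b"hello") == "utf-8"
```

It misfires on pathological inputs, but for the strings typically found in a hive it is almost entirely reliable, as said above.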


Based on the documentation[1], it seems clear that strings are stored in Unicode (which should be UTF-16LE).

> If the data has the REG_SZ, REG_MULTI_SZ or REG_EXPAND_SZ type, and the ANSI version of this function is used (either by explicitly calling RegGetValueA or by not defining UNICODE before including the Windows.h file), this function converts the stored Unicode string to an ANSI string before copying it to the buffer pointed to by pvData.

[1]: https://docs.microsoft.com/en-us/windows/win32/api/winreg/nf...


The first thing to know about Microsoft documentation is it's almost always wrong.


I've been coding against the Win32 API since Windows 95. While I won't argue it's perfect, it's certainly nowhere near "almost always wrong".

Anyway, where in the hivex code do you handle these non-Unicode encoded REG_SZ values?

Btw, the comment for hivex_value_multiple_strings is based on a mistaken interpretation of the documentation. There's nothing contradictory to what MoveFileEx does[1], it simply has a list with a single entry in the case of deletions, and a list with two entries in case of renames.

The REG_MULTI_SZ documentation[2] just points out, correctly, that you can't have a zero-length string within a list of other strings (ie with at least one non-empty string after it). This is of course obvious and hence redundant, but they highlight it for novice programmers.

[1]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/n...

[2]: https://docs.microsoft.com/en-us/windows/win32/sysinfo/regis...


This is categorically incorrect. The documentation for the function is entirely accurate. What is true is that the regedit program is more than a featureless wrapper around the API functions (unlike the first thing to know about Linux desktop software). The heuristic is performed for the user who probably doesn't know what they want; the straight exactly-as-you-asked-for-it conversion is performed for the programmer who does. (And it's only a heuristic for values; the encoding for keys is assumed from the file like TFA says, but which one it is is stored in a flag bit.)


To be more precise, like most of Windows, it just stores 16-bit codepoints as such, meaning that it's not necessarily always valid UTF-16LE (in the presence of invalid surrogate pairs).
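A quick illustration of that point (nothing registry-specific, just the encoding behavior): a lone surrogate code unit fits in two bytes but is not valid UTF-16.

```python
# U+0041 'A' followed by an unpaired high surrogate D800, as 16-bit LE units.
raw = b"\x41\x00\x00\xd8"

# Strict UTF-16LE decoding rejects the lone surrogate...
try:
    raw.decode("utf-16-le")
    ok = True
except UnicodeDecodeError:
    ok = False
assert not ok

# ...so tools that must round-trip arbitrary registry strings have to treat
# them as opaque 16-bit code units (Python exposes this as 'surrogatepass').
text = raw.decode("utf-16-le", "surrogatepass")
assert raw == text.encode("utf-16-le", "surrogatepass")
```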


Maybe there isn't a database engine that explicitly supports file system data structures, but you could implement a filesystem in the application layer using SQLite as a storage mechanism.

Here's an example of someone doing that very thing.

https://github.com/guardianproject/libsqlfs


For those looking for a more up to date alternative, with a non-hardcoded DB file path, try https://github.com/greenbender/sqlfs (not affiliated, just had a look in this area a few weeks ago)


> The UNIX way is undeniably more flexible since it isn't a virtual filesystem, it is just a filesystem. The problem is that everyone invented their own configuration format to store configuration data in /etc and there's no format agnostic API to access that information (you can open it, but can you understand it?).

It's an interesting point, and it would really be nice if there were some kind of standard, especially a standardized configuration-file API within the standard C library, but in practice almost all of these files are also easily understood `key=value` pairs that only differ in how they denote this.

Still, it would be very nice to be able to inspect and modify them with standardized tools, but in practice almost any configuration file you open is obvious.
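For the common `key=value` case there is at least `configparser` in Python's standard library, which handles the INI-ish dialect many of these files approximate. A small sketch with an invented config:

```python
import configparser

# Parse an INI-style config from a string; reading a file under /etc is the same.
cfg = configparser.ConfigParser()
cfg.read_string("""
[network]
hostname = example
port = 8080
""")

assert cfg["network"]["hostname"] == "example"
assert cfg.getint("network", "port") == 8080

# Updating a value still means re-serializing the whole file, as noted upthread.
cfg["network"]["port"] = "9090"
```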


There is nothing wrong with a kv hierarchy for config. Used in more modern tools as well like Consul. The problem here is the poor implementation and naive security considerations.

Today, there is no broadly used file system that can replace any of this. The difference in performance is just too big.


The problem of removing applications together with their configurations is trivially solved in the various Linux-based systems where any software package is installed only inside a private directory.

Any files that would be expected to be in shared directories like /usr/bin, /usr/lib or /etc, are replaced by symbolic links. Except for symbolic links, the installation of a package must not change anything outside its private directory.

In my opinion this is the only sane way of managing software packages, because otherwise, also on Linux, but especially on Windows, I have wasted far too much time during the years with debugging various problems caused by the installation/uninstallation programs.


You can do something similar on Windows. However, the registry was turned into an attempt to solve a slightly different issue: roaming users and centrally managed settings, split by machine and user, using what became Active Directory. It started off with OLE and its central store of name-value pairs in a tree. That basically made the registry do at least four different things, only one of which it does 'ok' (COM/OLE lookups). In practice it became a huge mess, because thousands of applications now keep their settings in there, with poor cleanup practices like you see. Something like 99% of the use cases out there would be perfectly well suited to just using the forever-deprecated Win32 INI API to manage settings, with MS saying 'put your config files in these places for different effects'. Instead they said 'put it all in the registry'. Looking back at it, it is now 100% clear it was a bad design decision.


It should be noted that putting configuration into files rather than registry has been the standing recommendation on Windows for a very long time now. For example, .NET 2.0 (2005) added a standard facility for application settings, and it was implemented on top of XML .config files.


Good to see MS practising what they preach on newly written applications:

https://docs.microsoft.com/en-us/deployedge/configure-micros...

> You can also use REGEDIT.exe on a target computer to view the registry settings that store group policy settings. These policy settings are located at this registry path: HKLM\SOFTWARE\Policies\Microsoft\Edge.


This is a group policy, not a setting. That makes a big difference in the right context - policies are meant for things that are typically centrally managed in enterprise environments, so the ability for the admin to remotely control every setting individually is important, and requires some kind of central system registry for OS tooling to work with.


Yep, but it still sucks if you are a user trying to figure out how a setting is applied: you first delve into some seemingly proprietary config DB for Edge, and then realise it's actually configured via a registry GPO.


>the various Linux-based systems where any software package is installed only inside a private directory

Which are these? I assume Qubes OS sidesteps the problem entirely... NixOS?


NixOS, mostly, but not for application data (/var usually), which is closer to the registry.


> although organizing it like a virtual FileSystem does have a certain appeal, and unfortunately I don't know of a database engine that supports such a layout

Apache Jackrabbit Oak [1] implements a tree-format data model called JCR [2]. It's used as database engine for the Adobe Experience Manager (enterprise-level CMS).

[1] https://jackrabbit.apache.org/oak/docs/ [2] https://en.wikipedia.org/wiki/Content_repository_API_for_Jav...


"The format is... endian-specific."

Wasn't there a SPARC port of Windows? Did it run on any other big-endian machines? Were the MIPS and POWER ports little-endian?

In any case, I can see why Microsoft paid SQLite for a custom set of features.


It's a very good question! There was also an Alpha port (both-endian and 64 bit). I've never seen a SPARC, POWER, MIPS or Alpha Windows registry so I don't know if hivex could decode them.


Larry Osterman said in 2005: A decision was made VERY long ago that Windows would not be ported to a big-endian processor.

https://web.archive.org/web/20190108142751/https://blogs.msd...


> I suspect if the concept of the registry was created today it would look more like a database (e.g. Sqlite), although organizing it like a virtual FileSystem does have a certain appeal, and unfortunately I don't know of a database engine that supports such a layout.

I think you might mean relational database. IIRC, the registry is already a database, just not a relational one.


AIX has a registry-like database for configuration called ODM that is like you’re describing.


If I was to reimplement it, I would not let any app read the DB directly.

Instead I would make an HTTP-like interface where you talk to a service that provides the get/set functionality and always uses a text format, like an extended JSON with native support for dates, int32, etc.

This also enables much easier and better backwards compatibility, as you can specify the app version in requests, so a version 5 server can respond in version 4 format.

With a service it doesn't matter if things are stored as a single sqlite DB or as multiple files.
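A toy sketch of that versioned get/set interface (all names invented, with in-memory storage standing in for the real store):

```python
import json

class ConfigService:
    """In-memory stand-in for the proposed settings service."""
    def __init__(self):
        self._store = {}

    def handle(self, request: str) -> str:
        req = json.loads(request)
        if req["op"] == "set":
            self._store[req["key"]] = req["value"]
            return json.dumps({"ok": True})
        value = self._store.get(req["key"])
        if req.get("app_version", 5) < 5:
            # A v5 server answering a v4 client could downconvert here,
            # e.g. to the stringly-typed format an older client expects.
            value = str(value)
        return json.dumps({"ok": True, "value": value})

svc = ConfigService()
svc.handle(json.dumps({"op": "set", "key": "font_size", "value": 13}))
reply = json.loads(svc.handle(
    json.dumps({"op": "get", "key": "font_size", "app_version": 4})))
assert reply["value"] == "13"
```

Whether the transport is HTTP, a pipe, or a plain function call is orthogonal to the version-negotiation idea.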


You can do everything you describe with a regular function call. Introducing an unnecessary service for something as trivial as config management is even worse than the broken Registry status quo.


In theory all we need is a libconfig with a standardized interface and distro specific implementations.

There would be two use cases, storing hierarchical/tree like data aka JSON, YAML, etc and arbitrary graph like data with complex references.

The same could be done with the shell. libcli would be called to parse the command line arguments and your program will have an entry point that receives the parsed data.

libcli could also be used to produce structured program output, which could then be fed into applications, possibly skipping the serialization/deserialization steps.

Since libcli will have a standard interface each distro can choose their own CLI format or whether they output their data as JSON or CBOR.

In both cases the benefit is that the choice of the data storage mechanism has been decoupled from the application.
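A rough sketch of that decoupling (all names hypothetical): the application codes against one loading interface and the distro supplies the backend:

```python
import json
import configparser

class JsonBackend:
    def load(self, text):
        return json.loads(text)

class IniBackend:
    def load(self, text):
        cp = configparser.ConfigParser()
        cp.read_string(text)
        return {s: dict(cp[s]) for s in cp.sections()}

def load_config(text, backend):
    """The hypothetical 'libconfig' entry point: apps see a dict, never the format."""
    return backend.load(text)

# The same logical config in two on-disk formats yields the same data.
assert load_config('{"ui": {"theme": "dark"}}', JsonBackend()) == \
       load_config("[ui]\ntheme = dark\n", IniBackend())
```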


Augeas does this already: https://github.com/hercules-team/augeas


Is this sarcastic?

I mean, the versioning part sounds good, but JSON-like requests?

I don't think people appreciate how efficient or inefficient a system can be depending on the data format it uses.


That's sort of already the case. Normal apps aren't really supposed to read it, the kernel hides the implementation, and get/set/etc is all system calls.

It's tooling created for security and forensics that tries to manually read it.


https://docs.microsoft.com/en-us/windows/win32/sysinfo/regis...

Registry settings already have ACL items applied. You can see this behavior if you log in as a non admin user then try to view another users registry settings.

https://docs.microsoft.com/en-us/windows/win32/sysinfo/regis... The thing is pretty free-form, with a choice of some very basic data types: dword/qword, and ASCII/Unicode strings and lists of strings.

Adding a wrapper around the existing calls to move toward a JSON-schema registry could be a helpful thing.


All the aspects you mention can be realized by a programmatic OS API as well. No need to sandwich a network layer and a weakly-typed data format (JSON) in between.


I assume the service will run in a docker container too??


The article's complaints about the format being undocumented are ill-founded, because Microsoft repeatedly tells developers not to read the registry directly, and call the exquisitely documented functions instead. The applications that didn't listen to Microsoft and depended on particular features are most of why it hasn't updated to modern standards in the first place.


How's that supposed to work if you're modifying it in an offline image from a Linux host?


With DISM.


DISM is a proprietary Windows tool, so no use at all for modifying an offline image from a Linux host.


I'm sure it runs under Wine. But knowing that that's how you service the image means you have no reason to be using a Linux host to do it.


The registry needs to be read by device drivers, including the drivers for TCP and NICs, and including during very early boot; this would make HTTP a challenge. Also, implementing a JSON parser in the kernel sounds like a bad idea.


Hence I've always appreciated how macOS has done it: filesystem, but with a standard file format (plist files). For any well-behaved Mac app, if it misbehaves I can just go and delete the app's settings plist and it's just like a reinstall, since everything else is in a read-only bundle.


Hierarchical databases support the registry format. But they were popular in the 70s?


key/value pairs


I don't feel like this article is good enough to be worth resharing 12 years later.

It missed some of the key reasons why the Registry is bad and other criticisms ("We have to write exactly the bytes Windows expects." yeah of course you do. It's only meant to be changed using APIs.) are not very well thought out.

The criticism that the Registry is a janky filesystem in a single file is dismissed because you can do the same thing with ext3. That's not a good reason to dismiss it. Just because you can do a similar thing doesn't mean it is a good thing to do for the Registry. The Registry being a single file is a cause of a ton of problems, and is IMHO the root of all the other problems. If this were a better technical article it would go into that instead of hand waving it away. (E.g. it being a single file leads to the all or nothing nature of it. It necessitates the bad filesystem implementation instead of using the actual underlying filesystem, etc.)


It sounds you misunderstood at least parts of the article. The ext3 comparison doesn't support or dismiss anything, it's to explain how there can be a filesystem in a file. They then go ahead and say how it could be a good single-file format if it were more database-like instead of filesystem-like. Other criticism of the format clearly take into account that it's a historically grown format which explains some of its idiosyncrasies, but that still doesn't mean it's not valid criticism. Especially the mentioned inconsistencies regarding data types and encodings can hardly be attributed to the registry being single-file, as you seem to suggest by saying "The Registry being a single file is a cause of a ton of problems, and is IMHO the root of all the other problems."


I didn't misunderstand anything. I read this in 2010 and just re-read it.

The ext3 example is used exactly as I describe, to hand wave away "having a file system in a single file" as a criticism and just move on to talking about nitpicks with the Registry's implementation of a filesystem in a file, while ignoring that all of the problems with the registry spawn from the choice to have it be a filesystem in a file.

This article is like going to a house that burned down and saying "well yes, the house burned down. But barns also burn down. Now see one problem is that the couch is a pile of ash and cinders. These materials do not make for a good couch. Better couches use foam and fabric."


> while ignoring that all of the problems with the registry spawn from the choice to have it be a filesystem in a file.

The whole point of that paragraph is that the problems it has as a filesystem do not spawn from the choice of having a filesystem in a file.


Actually the Registry is not a single file, it is a set of files virtually assembled into a single "object" that has many points in common with both a database and a filesystem.

Personally I see it more like a filesystem, with some striking resemblance to NTFS.


It is a single file to the underlying filesystem. Yes it reimplements a filesystem, I already said that in the comment you are replying to.


It isn’t a single file. There are multiple “hives” to the registry. There is at a minimum the SECURITY, SAM, SYSTEM and BCD “hives” that are all individual files on disk. Additionally, individual UWP apps can have their own isolated hive if need be.



Exactly like a NTFS drive can have mountpoints that are actually other drives, the Registry on disk is made by several files, at least:

SAM

SECURITY

SYSTEM

SOFTWARE

NTUSER.DAT

and since Vista

BCD

What is called the Registry, once virtually assembled together, is accessed as if it were a filesystem in a single file, but it is not in itself a "file"; it is something else, or if you prefer, it appears to be a monolithic file but is backed by a number of separate files.


It's a "view"


It's far from being a single file. Registry is split to hives. Besides that, each hive is a bunch of files (actual hive, a pair of integrity logs - which was just one log file before Vista - and hell knows what else, some KTM-related stuff I presume):

   > dir /a C:\Windows\System32\config
   ...
   29.07.2022  12:33        44 040 192 COMPONENTS
   21.11.2010  10:21             1 024 COMPONENTS.LOG
   29.07.2022  12:33           262 144 COMPONENTS.LOG1
   14.07.2009  05:34                 0 COMPONENTS.LOG2
   29.07.2022  12:33            65 536 COMPONENTS{016888b9-6c6f-11de-8d1d001e0bcde3ec}.TM.blf
   06.03.2021  21:44           524 288 COMPONENTS{016888b9-6c6f-11de-8d1d-001e0bcde3ec}.TMContainer00000000000000000001.regtrans-ms
   29.07.2022  12:33           524 288 COMPONENTS{016888b9-6c6f-11de-8d1d-001e0bcde3ec}.TMContainer00000000000000000002.regtrans-ms


Much like SQLite or what have you. What specifically is problematic about that, or what would be better?


A point being overlooked by article and commenters: the records in the registry probably use in the order of 10-100 bytes each, while every actual file takes up at least around 1024 (EDIT: maybe more like 4096?) bytes. Putting tiny 'files' into a special format is a no-brainer on the older computers that existed when the Registry was first created. I am assuming this was one of the motivations to create the Registry in the first place, to save space.


Extracted from the "Rationale" section of https://en.wikipedia.org/wiki/Windows_Registry

1) Since file parsing is done much more efficiently with a binary format, it may be read from or written to more quickly than a text INI file.

2) Strongly typed data can be stored in the registry, as opposed to the text information stored in .INI files.

3) Because user-based registry settings are loaded from a user-specific path rather than from a read-only system location, the registry allows multiple users to share the same machine, and also allows programs to work for less privileged users.

4) Backup and restoration is also simplified as the registry can be accessed over a network connection for remote management/support,

5) It offers improved system integrity with features such as atomic updates.

These points are mostly bogus IMO, but apparently this was their initial rationale for implementing it.


Almost nobody mentions another rationale or use case: the registry is accessible from kernel mode. This makes it, in addition to the other things, kind of analogous to sysctl.


I don't really see a reason to believe this list is comprehensive, especially looking at the sources.


> These points are mostly bogus IMO

Why?


On ext4, a file of 160 bytes or less (and no xattrs) can be stored inline in the inode[1], so the whole thing takes up 256 bytes (plus 8 + [length of filename] for the directory entry). I don’t think any Unix filesystem did that in 1989, but the problem is not unsolvable, especially given that you are hardly going to use a configuration file to store a single value. NTFS does the same, actually, except it can’t count that low, so you get 1K.

(Of course, before there was the registry there were the textual CONFIG.SYS and WIN.INI, but those were shared and would be easily corrupted by programs trying to modify them manually, which I suspect the registry is a reaction to.)

[1] https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Inli...
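A rough back-of-the-envelope comparison, using the figures from this thread (illustrative numbers only, not measurements), shows why packing tiny records still wins even with inline-data inodes:

```python
# Illustrative comparison: storing 100 small config values as separate
# files vs. packed into one registry-style blob. The per-file numbers
# (256-byte inode, 8 bytes + filename for the dirent) are the rough
# figures quoted in this thread, not measured values.

def files_cost(n_values, inode_bytes=256, dirent_overhead=8, name_len=12):
    # one inline-data inode plus a directory entry per value
    return n_values * (inode_bytes + dirent_overhead + name_len)

def packed_cost(n_values, bytes_per_record=100, cluster=4096):
    # records packed together, rounded up to whole clusters
    total = n_values * bytes_per_record
    return -(-total // cluster) * cluster  # ceiling division

print(files_cost(100))   # 27600 bytes as individual inline files
print(packed_cost(100))  # 12288 bytes as one packed blob
```

Even in the friendly inline-inode case, the per-file metadata dominates once values are this small.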


> On ext4, a file of 160 bytes or less (and no xattrs) can be stored inline in the inode[1], so the whole thing takes up 256 bytes (plus 8 + [length of filename] for the directory entry).

It can if you enable an option that is not widely used, IIRC.


Huh, right you are: the inline_data feature is from 3.8 (2013), but apparently tune2fs couldn’t enable it on a live filesystem for some time after that, and it’s still not on by default.

(I remembered seeing the feature in the format description, but did not think it would actually be unused in most cases.)


>NTFS does the same, actually, except it can’t count that low, so you get 1K.

Not exactly: a $MFT record (on 512 bytes/sector media) is 1024 bytes, and the most you can store in it is 744 bytes (it varies slightly depending on the length of the filename and on the exact way the file is written).

On 4K disks, the $MFT record is 4096 bytes and allows up to 3776 bytes max.

JFYI:

https://www.forensicfocus.com/forums/general/mft-resident-da...


I'm trying to remember Win95 on FAT32 (or exFAT?)... on a large hard disk, wasn't the smallest file size 32KB or 64KB?

Different times...


I could swear this was the cluster size for FAT32 on a 2G drive, but apparently not[1], this is the cluster size for FAT16 on 2G. FAT is just utterly dumb in how it does space allocation: it doesn’t even have a bitmap, it’s just a gigantic dense array of linked list nodes (the eponymous file allocation table). Perfectly fine for a 360K floppy on a 32K machine, predictably painful for a 2G hard drive on a 32M machine. Add in narrow allocation unit (“cluster”) numbers, and you end up with huge allocation units.

As far as I can see, ext2/3 instead use a free bitmap and a tree-style “indirect block” setup that is recognizably similar to (though greatly extended from) V1 UNIX (1971) [2], predating even the original FAT8 (1977). Ext4 can do extent [that is, (start, end) pair] allocation instead, and XFS, ZFS, Btrfs, etc. are built around it. Thus they can and actually do manage disk space in smaller bits.

[1] https://support.microsoft.com/topic/default-cluster-size-for...

[2] http://squoze.net/UNIX/v1man/man5/fs or https://www.bell-labs.com/usr/dmr/www/pdfs/man51.pdf
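To make the "gigantic dense array of linked list nodes" concrete, here's a toy sketch of following a FAT cluster chain (illustrative only; real FAT entry encoding, end-of-chain markers, and cluster numbering are more involved):

```python
# Toy model of a file allocation table: fat[i] holds the number of the
# next cluster in a file's chain, or END if cluster i is the last one.
# Real FAT uses special reserved values rather than -1, and clusters
# 0 and 1 are reserved; this just shows the chain-following idea.
END = -1

def read_chain(fat, start):
    """Follow a file's cluster chain from its starting cluster."""
    clusters = []
    cur = start
    while cur != END:
        clusters.append(cur)
        cur = fat[cur]
    return clusters

# A file occupying clusters 2 -> 5 -> 3; the rest are unused here.
fat = [END, END, 5, END, END, 3]
print(read_chain(fat, 2))  # [2, 5, 3]
```

Note there is no free bitmap: finding free space means scanning this array, which is exactly why it got painful on large drives.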


I was going to say that even the 5150 shipped with 256K, but I checked Wikipedia and they say the minimum spec was 16K. Which is astounding.


Good perspective. It’s easy to lose track of that as enough time has passed that the parents of commenters were in high school when this was implemented.

One time perspective thing to keep in mind is that the IBM mainframe was younger to the folks doing this work at Microsoft than we are to the Microsoft people today. Their constraints were different and the diversity of end users was very different.

Even with years of experience and advancement, in Linux we moved configuration to systemd, which is exponentially better than the registry but also more complex. Complicated configuration systems will always have pros and cons.


However, by default on NTFS, there will be 4096-N bytes of wasted (slack) space each time the specific hive needs to grow to accommodate those 10-100 bytes (this varied historically if we're talking about NT 4 and below, depending on volume size) due to the format's cluster size.

That said, it is still smaller than 100 files that are 10-100 bytes each taking up 4096 bytes, though with NTFS, those would be stored in the MFT anyways...


I should add to this: since file system filters (e.g., anti-virus) slow down reads (and writes, but that's irrelevant here) on any file system format Windows supports (NTFS, FAT32, exFAT, etc.), storing configuration data in configuration files would be less optimal, though the performance cost is minimal (assuming your app isn't constantly re-reading the file). This wouldn't be the case with the Registry, as the hives are already open.


That overhead is not intrinsic to the concept of a filesystem. If you asked someone to design a filesystem specifically to store short strings they could easily come up with a low-overhead design while preserving the other advantages of real filesystems.


ReiserFS could do this ...


Nobody in their right mind would put every field into a separate file though. (Yes, I do know about qmail.)


Yes but that is the comparison being made, that the registry is like a filesystem and thus should have just been a filesystem. If the suggestion is to put many fields in one file then that's a different suggestion.


There have been filesystems that don’t use strict block sizes - Reiser4 IIRC. It’s not done often because the savings are tiny and it complicates implementation a lot.

A bigger problem I can see is that Unix file system semantics is… hairy; see rename(2) atomicity guarantees for example. So it could make sense to replace it not with a typical filesystem, but something more RESTful.
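For reference, the rename(2) atomicity guarantee mentioned above is what makes the classic write-temp-then-rename pattern safe for config updates; a minimal Python sketch:

```python
import os
import tempfile

def atomic_write(path, data: bytes):
    # Write to a temp file in the same directory, then rename over the
    # target. rename(2) is atomic within a filesystem, so readers see
    # either the old file or the new one, never a half-written mix.
    d = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=d)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # data is on disk before the rename
        os.replace(tmp, path)     # atomic on POSIX; also works on Windows
    except BaseException:
        os.unlink(tmp)
        raise
```

The hairiness the parent alludes to is real, though: you still need the fsync (and arguably an fsync of the directory) to get crash durability, not just atomicity.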


Again, the filesystem in context was the Windows filesystem, not a UNIX filesystem or Reiser4.


Why is a registry file called a "hive"?

> Because one of the original developers of Windows NT hated bees. So the developer who was responsible for the registry snuck in as many bee references as he could. A registry file is called a “hive”, and registry data are stored in “cells”, which is what honeycombs are made of.

Source: Raymond Chen, 2003: https://devblogs.microsoft.com/oldnewthing/20030808-00/?p=42...


That must sting.


Our digital world, built by trolls.

Most of them WASPs.


There's a mountain of valid complaints for the windows registry, but it not being a database is not one.

It's a pretty standard (albeit very simple) hierarchical database like IMS.

https://en.wikipedia.org/wiki/IBM_Information_Management_Sys...

Hierarchical DBs have fallen out of favor a bit, but they were the first DBs.


"2. Hello Microsoft programmers, a memory dump is not a file format"

This is exactly how Office file formats worked in the good old times. Made things faster (no parsing).


I honestly don't mind this relic of the "old days", or the other issues like it brought up in the article. The Registry is still very frequently accessed by Windows, often several times per second. I don't mind that the code supporting that makes a ton of assumptions about details that should never change anyway, as it's all hidden behind the Registry API. For all I care, keep that code simple and as performant as possible.


This is often misunderstood. Unless you're talking about very early (pre-COM/OLE) versions of Office, the binary Office file format is https://en.wikipedia.org/wiki/COM_Structured_Storage - which is basically FAT-in-a-file, with individual files meant to correspond to COM objects. So it's a memory dump in a very broad sense of a serialized object graph, not in a sense of actual bytes of RAM directly copied to disk.


> Made things faster (no parsing).

And simple. I parsed a wav file for the first time a while ago and it was surprisingly refreshing how easy it was. I just made some C structures matching the format specification, and I could read the file incrementally into different structs. This is way simpler than writing a parser and then marshalling data back and forth between an internal format and the configuration format.
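The same fixed-layout idea translates directly to other languages; a sketch of the canonical WAV "fmt " header using Python's struct module (field layout per the standard RIFF/WAVE header):

```python
import struct

# Fixed layout of the canonical 36-byte RIFF/WAVE "fmt " header: read
# it straight into fields the same way a C struct would, no parser.
FMT = "<4sI4s4sIHHIIHH"  # little-endian

def parse_wav_header(buf):
    (riff, _riff_size, wave, _fmt_id, _fmt_len, audio_fmt,
     channels, rate, _byte_rate, _align, bits) = struct.unpack_from(FMT, buf)
    assert riff == b"RIFF" and wave == b"WAVE"
    return {"format": audio_fmt, "channels": channels,
            "rate": rate, "bits": bits}

# Build a header for 16-bit stereo PCM at 44.1 kHz and read it back.
hdr = struct.pack(FMT, b"RIFF", 36, b"WAVE", b"fmt ", 16,
                  1, 2, 44100, 44100 * 4, 4, 16)
print(parse_wav_header(hdr))
```

Same tradeoff as the memory-dump formats, of course: this only stays simple while the layout, endianness, and field sizes never change.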


Wav files aren't a memory dump, unless you consider any file output to be a memory dump; certainly, not in the sense that old OLE (ha) files were.


For this particular version of software, compiler, and computer architecture, yes.

After the first major upgrade, there's likely to be a whole lot of parsing involved.


I don't understand how people figure out what registry entries to make out of the ether for certain changes.

I don't know why most programs have an in-program settings menu, an external settings.json or something similar, and then registry entries to configure similar things.

I don't know why Windows does almost zero policing of programs editing the registry. If you uninstall a program, why is it so hard to remove all its entries? Why do entries from deleted programs sometimes turn into random strings of characters?

The whole thing reminds me of developers installing to %APPDATA% without my agreement, because they want to circumvent getting admin permissions. That's the whole point of admin permissions. Also if I don't want everything installed on my C drive, because it's a smaller sized SSD, I'm just out of luck.


You can move %appdata% - kind of:

https://superuser.com/questions/1250288/can-i-move-my-appdat...

I don't understand the rest, to be honest. For many years, the mess was even larger, because config files were coupled with the software itself.


> I don't understand how people figure out what registry entries to make out of the ether for certain changes.

Experience, trial and error, documentation, reverse engineering... you can use something like ntregmon to see everything a program touches in the registry and work from there. The quantity can be overwhelming, and when you finally whittle it down you may realise what you want isn't there; it was in a .ini in a random 90's file path the whole time.


This is mind-boggling to me. One of the richest companies in the world just leaves this janky, partially documented mess for developers to wade through like mud.

How can MS claim to hire 'the best of the best' when such a commonly interfaced piece of their product is garbage?

Why isn't it refactored to be clear, concise, with simple documentation and tutorials?

If we were to be cynical we could say 'they don't want you to because competitive this-or-that' - ok, money would be a reason. But it's not that. It's just a turd.

Mac is a bit similar, in that it's hard to find what goes where and why, with all of the examples.

The entire development world before mobile I think had this anti-product attitude: they don't give a s**. You're 'stupid' for not having it figured out. The 'documentation is there!' (but really only in some obscure way). People move on, nobody cares. Who in MS has the job to 'care' about such a thing anyhow?

These kinds of things drive me bananas. These are the biggest and richest corporations to ever exist. It should be clear, documented, with tutorials, examples and searchable.


backwards compatibility. that's why.


Plus, it's inscrutable and malware developers seem to know it better than legit developers do.


There's no reason for a legit developer to know the registry format, as you're supposed to interact with it using syscalls. Most of this article takes place in a parallel universe where Microsoft forgot to provide syscalls to access the registry, and you have to open its file store directly or something.


I'm referring to the keys and values


This is a weird article because you can't talk about the structure of the Windows Registry without talking about INI files [1]. Example:

    [owner]
    name = John Doe
    organization = Acme Widgets Inc.
Some comments on the post mention INI files. It's mentioned by commenters in the previous HN submission too.

But the Registry was built like it was to easily translate INI files into a semi-filesystem structure.

[1]: https://en.wikipedia.org/wiki/INI_file
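For what it's worth, that exact sample is readable with Python's stdlib INI parser, which shows how little structure there is: one level of sections, string keys, string values:

```python
import configparser

# The [owner] example from the Wikipedia article, parsed with the
# standard-library INI reader.
ini = """
[owner]
name = John Doe
organization = Acme Widgets Inc.
"""

cfg = configparser.ConfigParser()
cfg.read_string(ini)
print(cfg["owner"]["name"])          # John Doe
print(cfg["owner"]["organization"])  # Acme Widgets Inc.
```

The registry's nested keys are essentially what you get if you let those sections nest arbitrarily deep and give values types.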


In fact, if I recall correctly, in Windows 95 badly behaved 16 bit Windows apps that tried to put their own INI files in C:\Windows had those file writes/reads silently redirected to a part of the then new Registry.


In 16-bit Windows, there were several global shared INI files, most notably \WINDOWS\WIN.INI, that effectively operated much like registry. You can still see this reflected in the Win32 API pertaining to INI files - there's Get/SetProfileString which does not take a filename as an argument, and then there's Get/SetPrivateProfileString which does.

Some Win16 apps would add their entries to that file - but this wasn't "badly behaved" at the time, as evidenced by the fact that e.g. SetProfileString specifically has an argument for "app name", so the ability to do so was intentional and documented.

If I remember correctly, it was in Windows NT - where \WINDOWS became read-only for non-admins - that they started doing redirections; and even then they are done specifically for those global shared .INI files. The mappings are configurable, with configuration itself defined by registry keys - look at HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\IniFileMapping to see what gets mapped where.


The hierarchical nature of the registry elevates old INI files to at least TOML format in this regard :-))


Filesystem fetishization at its finest. Not only is /etc poorly structured (just like portions of the registry are), it also suffers from lack of uniform format for the data. But hey, registry is bad because it's bad.


Both can be true :)


It is easy to be critical of Microsoft and many of their decisions until you reflect on the reality that they have provided the software that has been running billions of computers world-wide for almost four decades. And they have done this with an astounding level of software and hardware compatibility across time, devices and technologies.

This is critical to understand before pointing at anything MS and being critical. The desktop world overwhelmingly runs on Windows. This has been the case for decades. And the reason is backwards compatibility.

Imagine having a requirement that every decision you make must not break backwards compatibility. And then imagine having to stay as true as possible to that rule for 40 years.


> And then imagine having to stay as true as possible to that rule for 40 years.

And then imagine that most of your original decisions were poor ones, you won your position by virtue of being an abusive monopoly, and as a result, the majority of the desktop world has been needlessly suffering under the burden of your poor technical decisions for 40 years.


> burden

The slow boot and shutdown times, and the interminable upgrades are almost inhuman.

It seems like you can upgrade Ubuntu to a new major version faster than Windows can install a month's worth of fixes.

Especially when the Windows updates are unannounced and suddenly it's going to take 30 minutes to leave your desk instead of 30 seconds.


> It seems like you can upgrade Ubuntu to a new major version faster than Windows can install a month's worth of fixes.

Nobody cares. It is beyond obvious --brutally, massively so-- that this is not important at all. Not even a little.

The proof is simple: Billions of users have done just fine for decades.

The only people who ever complain about this are technical folks who often view the world from a perspective that is far removed from average-user reality (the billions of people I mentioned before).

I think it is extremely important for engineers to be able to understand that they don't often view the world in terms that actually matter.

One of my favorite go-to examples for this are the endless (and pointless) arguments about vi/vim vs. whatever.


Microsoft's secret is, and always has been, incompetent competition. If you invent a time machine, don't kill Bill when you go back. Build something that's genuinely better for both users and developers.


> Build something that's genuinely better for both users and developers.

When I was much younger I, too, thought this way. Reality, however, does not align with this at all. Better products do not necessarily win. I might even go as far as saying that they almost never win. The vast majority of products --not just software, anything-- are mediocre at best. What they do is solve a problem.

One thing engineers and developers have to work hard to get out of their heads is that the actual users of software, operating systems, applications, and websites could not care one bit about anything that happens under the hood. Nothing. Not one bit. To put it plainly, they don't give a shit. At all.

When someone wants to order a pizza online, write a document on their laptop or play some music, all they care about is that thing. Nothing else matters. You could have a dozen hamsters trained to push buttons behind the scenes for all they care.

It truly does not matter. Internalizing that realization is, in my opinion, an important step in the evolution from being a junior engineer to someone with enough experience to understand reality well enough to get the job done.

This does not mean creating garbage. It does mean not being a complete pain in the ass about how to deliver a solution that users will, well, use.

This argument about the Windows registry is just silly. And the proof is simple: BILLIONS of people are benefitting from what this software does. Billions. The only people complaining about it are engineers not experienced enough to understand what they think is utterly irrelevant when compared to the scale and success of the SOLUTION the software provides.

In my forty years of hardware and software development I have yet to find a single piece of technology I could not criticize to one degree or another and, given the opportunity, improve. And yet, with enough experience, you realize this is the wrong metric to focus on.

People's needs are served by solutions. Their lives are improved by software and hardware that solves the problems they have. The person in the hospital recovering from a heart attack gives two shits about the Windows registry, even though nearly the entire computing chain that was used to save his life was likely run on Windows computers. That's reality. The rest is geeks not understanding that they don't know what to focus on or how to actually evaluate the value of a solution, which very often turns out to be far less than ideal and can always be criticized from a distance, both in time and space.

EDIT:

> incompetent competition

Serious question: Do you stop to read and think about what you are saying? I am not attacking you at all. I am just trying to understand how you might justify this perspective.

"incompetent competition"?

This is the company that has f-ing owned home, desktop and enterprise computing for what, FOUR DECADES? Incompetent? C'mon. They have solved problems for people and companies large and small for decades. The world has been running on MS solutions for longer than some of the people reading this have been alive. That is far from incompetent competition. Very far.


Serious question: Do you stop to read and think about what you are saying?

I don't have to. I was around when the "competition" consisted of companies like Lotus, Borland, Digital Research, and WordPerfect... to say nothing of IBM and Netscape.

All of which reinforce my point nicely. Microsoft succeeded because they sucked less than everybody else. Nothing more, nothing less.


Not true.

It had nothing to do with sucking less or being better (whatever that means).

Sometimes success is about doing enough important things well enough while your competition does less or simply gets in their own way.

That was the case for IBM back then. I remember buying my first original IBM PC. The experience was definitely what one expected from the International Businesses Machines corporation. Not what that market ultimately needed. They started it, failed to execute and lost prominence. Microsoft took advantage of that, and more.


They started it, failed to execute and lost prominence. Microsoft took advantage of that, and more.

So we're in violent agreement, then.

Although I was thinking of OS/2 rather than the PC itself when I cited IBM as an example, your example is valid as well. Market leadership, like control of a car, isn't usually lost but rather given up.


And it is "of its time" - it appeared in Windows 3.1 which was 1992. Which was the era of floppy disks as well.


Well, no, the registry in Windows 3.1/3.11 was vastly different in format and at the time it was actually a single file, reg.dat:

https://devblogs.microsoft.com/oldnewthing/20120521-00/?p=75...


>Hello Microsoft programmers, a memory dump is not a file format

Wait until you see the Office formats without the extra x in the extension.


yeah, such a classical Microsoft move


it was a very common approach at the time.


That was a deliberate attempt to make it impossible for competitors to read and write their format wasn’t it?


I don't know the exact year that file format was created, but back then you had to think about performance; you couldn't afford to waste so much memory and so many CPU cycles on a fancy file format, no?

Of course, Microsoft tools of that era didn't think much about interoperability outside the Microsoft ecosystem. But to me it seems that having some XML-based file format at that time would have been silly and had disadvantages. XML was actually started in '96 and published in '98: https://en.wikipedia.org/wiki/XML

.doc seems to be used starting from '89

> The format used in earlier, pre-97 ("1.0" 1989 through "7.0" 1995) versions of Word are less known https://en.wikipedia.org/wiki/Doc_(computing)


Agree. I think most software of the era did that kind of thing.

Most of it wasn’t popular enough and/or got abandoned soon enough for ‘the internet’ to notice, though.

If your software runs in 256kB, writing to a floppy disk, you pack your data structures, use bit fields where possible, and you’re not going to waste CPU time, RAM and disk space to write a format that’s easy to understand and extend.

(Of course, they extended it, anyways, as everybody would)


In the era when these formats originated it was fairly common to treat your file format as a dump of memory. It wasn't about preventing competitors from reading it; it was more about the speed of loading it and the lower overhead.

This was before the landscape changed so much: before the security implications of loading a format you didn't parse first really mattered, and before hardware improved to the point where parsing a format before loading it no longer caused a detectable performance hit.

You had interchange formats, which were designed to be used between systems and had defined, parseable formats; but if your format was meant to be used by your system alone and not shared, then it was believed that a memory dump was strictly faster and better for the user experience.

We've since learned a whole lot about why that is not a good idea, but it wasn't always obvious in the before times.


I attended a seminar on the office binary file formats about 10 years ago at MS. The reason it was done was for performance reasons, including the wonky layout that made it quicker to save and read the file from slow media like floppy discs.


I also remember reading about that somwhere, sometime... loading... ah, here it is: https://www.joelonsoftware.com/2008/02/19/why-are-the-micros...

> The file format is contorted, where necessary, to make common operations fast. For example, Excel 95 and 97 have something called “Simple Save” which they use sometimes as a faster variation on the OLE compound document format, which just wasn’t fast enough for mainstream use. Word had something called Fast Save. To save a long document quickly, 14 out of 15 times, only the changes are appended to the end of the file, instead of rewriting the whole document from scratch. On the hard drives of the day, this meant saving a long document took one second instead of thirty. (It also meant that deleted data in a document was still in the file. This turned out to be not what people wanted.)
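The append-only "Fast Save" idea is easy to sketch; this is just an illustration of the technique (a patch log replayed on load), not Word's actual on-disk format:

```python
# Toy append-only "fast save": each save appends an
# (offset, old_len, new_text) patch record instead of rewriting the
# document. Saving is O(size of the edit), not O(size of the document).

def fast_save(log, offset, old_len, new_text):
    """Record one edit by appending it to the patch log."""
    log.append((offset, old_len, new_text))

def load(base, log):
    """Reconstruct the document by replaying the patch log in order."""
    doc = base
    for offset, old_len, new_text in log:
        doc = doc[:offset] + new_text + doc[offset + old_len:]
    return doc

log = []
fast_save(log, 6, 5, "registry")   # replace "world" with "registry"
print(load("hello world", log))    # hello registry
```

Note that the replaced text is still sitting in the file's history, which is exactly the "deleted data in a document was still in the file" problem the quote describes.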


The underlying file format, COM Structured Storage, is basically filesystem-in-a-file, and works much like FAT. So, bits of deleted data would be floating around even without any performance hacks used by the app itself.


They were formats designed to be fast on floppy drives which were major storage format at the time, and most importantly, designed as "work in progress" format.

For interchange, you were supposed to use other formats - for example RTF with Word, which was kept in sync with DOC capabilities all the way to Word 2003 which was the last version that used DOC (2007 uses DOCX and no longer maintains DOC+RTF combo in sync with internal features). However, saving back the file if you did small change in large RTF file took ages in comparison.


Not really; it was an attempt at making fast and responsive software. If you can just blit a file into memory, why not? It's a hell of a lot faster than parsing and validating bytes and then converting them into your runtime data structures. It's a lot faster in the other direction, too.

When all the world runs one architecture, why wouldn't you do that?


No, it's just that if you have to convert you need more memory to load and save, and then you have less memory for the document itself.


It's a global, hierarchical key-value store for quick lookups and insertions conceived in the early 1990s. The types are almost identical to the ones supported in the C Win32 API on x86.
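A hierarchical typed key-value store of that sort takes very little code to model; a nested-dict sketch (purely illustrative: the real hive format, API, and type set are much richer):

```python
# Minimal model of a registry-style store: keys form a backslash-
# separated path hierarchy, and named values at each key are
# (type, value) pairs. Illustrative only, not the real hive structure.
REG_SZ, REG_DWORD = "REG_SZ", "REG_DWORD"

class Hive:
    def __init__(self):
        self.root = {}

    def set_value(self, path, name, typ, value):
        node = self.root
        for part in path.split("\\"):
            node = node.setdefault(part, {})
        node[name] = (typ, value)

    def get_value(self, path, name):
        node = self.root
        for part in path.split("\\"):
            node = node[part]
        return node[name]

hive = Hive()
hive.set_value(r"Software\Acme", "InstallDir", REG_SZ, r"C:\Acme")
hive.set_value(r"Software\Acme", "Version", REG_DWORD, 2)
print(hive.get_value(r"Software\Acme", "Version"))  # ('REG_DWORD', 2)
```

The typed leaves are the key difference from a filesystem: a lookup returns both a value and its declared type, mirroring REG_SZ, REG_DWORD, and friends.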


The registry is not a DB or filesystem it is the registry. My main gripe with the post is it keeps comparing it to Unix things.

It is a place to "register" state and configuration to allow other programs and the admin to manipulate that value or make decisions based on it. Not everything hierarchial is a filesystem and while you can argue it is a db, if it is then it is a very specific purpose built key/value storage db.

I won't dissect the post and respond but the registry structure is not a mess, it is well known and different places have different permissions and purpose.


The binary format is rather poor as is the norm on Windows, but you do have to remember one of the goals is that the boot loader can read part of the registry to decide which drivers to load.

The boot loader has to fit in a few sectors so there isn’t really room for SQLite or something.


The Windows Registry is a pretty good key-value store for centrally storing config.

Because some, perhaps many, have abused the originally good idea, does not invalidate its positive aspects.

Also, I do not agree it is a file system. It is a key-value storage system that is leagues better than the win.ini, system.ini, and the prolific ini file collection hell it replaced.

Some form of version control would make the registry better.


It’s impressive because they actually achieved that level of centralization. Most systems aim to have a centralized configuration “place” but it doesn’t stick. With windows the registry really did stick


I always find it interesting when I come across things that seem better from every technical point of view and I hate them, compared to something that seems technically worse.

I spent 15 years maintaining and developing a Windows application and I utterly despised interacting with the registry. It was a nightmare trying to find where a specific key should be written, what the format was meant to be, and browsing the thing with regedit was horrific. And then from a code point of view, win32 APIs for manipulating it were incredibly clunky. I was forever worried I'd accidentally corrupt the whole thing somehow and windows wouldn't boot any more.

Meanwhile on linux I think from a technical point of view, slamming random bits of text into files scattered in random places all throughout the file system in undefined formats is a terrible idea. And I love it. I understand the whole linux configuration and internals so much better because I can browse around /etc and see how every piece of the system is set up.


The article refers to "by the NT debug symbols that one paper has reproduced", linking to <http://amnesia.gtisc.gatech.edu/~moyix/suzibandit.ltd.uk/MSc>, but that link is dead. (If you want your links to be useful to the future, don't title links 'one paper'!) According to the Wayback Machine, it was an M.Sc. thesis by Peter Norris called "The Internal Structure of the Windows Registry" (https://web.archive.org/web/20200808224213/http://amnesia.gt...).


Good enough that d/gconf seemed to like the idea.


i think the windows registry probably sucks because windows was primarily built to enable developers and end users to use microcomputers to make money (where messes are acceptable as long as it makes money) and unix differs in that it was built to enable telephony engineers to operate a telephone network reliably and efficiently.

so if you're an engineer or scientist or care about operations, you like unix style systems because they were built with you in mind. you, like internal bell system engineers and researchers, are the end user.

if you only care about money, windows is your jam because who cares you can always use money to make people suffer whatever horrendous mess that results from using windows and everybody dumping crap in the registry because who cares it makes money.


> unix differs in that it was built to enable telephony engineers to operate a telephone network reliably and efficiently.

This is a corporate revision of Unix's history: it was originally created to run a video game[1]. Later, when Bell Labs gave that team some more resources (in the form of a PDP-11), the first "business" applications they wrote for it were primarily for typesetting and text editing.

Telephony was never a primary design goal for Unix, even if telephone networks were later retrofitted onto it. This is evidenced by the fact that Bell gave early versions of Unix away, as they were forbidden to charge for non-communications technologies and services under a federal consent decree.

[1]: https://en.wikipedia.org/wiki/Spacewar!


perhaps the kernel, but i suspect that much of the userland (which is a lot of what i reference here when i say "unix"- all the command line utilities, shells, the C programming language) was influenced by the needs of both the research community and the technical operation of the telephone network itself.


I'm more than happy to be corrected about this, but my understanding is that the groups at Bell Labs that created and matured Unix were not closely tied into the telephony groups, if at all.

The lack of interest in telephony use cases is evidenced by early releases and "workbench" distributions for Unix: PWB/Unix[1] focused on providing a development environment for programmers, and WWB[2] was aimed at technical editors and writers.

(The functionality of the basic Unix tools reflects this lineage: there a lot of tools for munging text, and very few tools for interacting with peripherals and hardware that isn't a teletype or line printer.)

[1]: https://en.wikipedia.org/wiki/PWB/UNIX

[2]: https://en.wikipedia.org/wiki/Writer%27s_Workbench


Yes, for the large part the computing work at Bell Labs was of a separate and distinct lineage from telephony. The story is somewhat complicated by AT&T's corporate history, as prior to divestiture AT&T was, for the most part, prohibited from selling "computers" as part of the terms of their regulated monopoly. In part as a result of this regulatory situation, the computing work at Bell Labs was viewed as more theoretical than applied. Telephone switching equipment itself, such as the ESS, never ran Unix. This is rather complicated by the fact that some sources (not incorrectly) describe the 5ESS as running UNIX, but in fact the so-called UNIX-RTR they ran was an independently developed operating system that featured partial UNIX compatibility. This was largely for convenience as by that time UNIX was often being used as a development and build environment for switching software.

It's easy to see why UNIX wasn't really involved in the telephone system itself: UNIX was designed as an operating system for mid and minicomputers such as the PDP that featured a largely "conventional" (from the modern perspective, the matter was less settled at the time) Von Neumann architecture with opportunistic scheduling. Telephone switches, back to the 1ESS and back to the XBAR if you choose to view it as a computing machine, tended to be Harvard architecture with real-time scheduling. This was viewed as far more suitable for telephone equipment since it had extremely high uptime and reliability requirements compared to computers. This comes from the different applications: in the '70s computers were still viewed as machines for offline processing in batch mode, where failures were handled by backing out and restarting the batch. Telephone switches were online machines that could not restart their work from the beginning without disrupting calls in progress. These were basically two completely separate lineages of computing machines that had little in common until the '90s, although the PDP itself was an important step in eroding that divide since it was popular for process control applications (this history relates to the reason the PDP was called the PDP).

Keep in mind as well that telephone switching equipment of that era made heavy use of hardware reliability measures (lockstep synchronization of redundant control modules and a large amount of "safety" logic implemented in hardware) as well as running almost entirely from read-only storage (in its most extreme form, the metal punchcards used by the 1ESS). These were further factors that made phone switches dissimilar enough from "computers" that they ran quite distinct software, which usually lacked many of the elements that we would now consider part of an "operating system" (e.g. dynamic process scheduling). To some extent this is still true although e.g. Nortel DMS is more often running off of RHEL-based controllers now.

General-purpose mainframes and minicomputers were heavily used within AT&T but for offline applications such as billing, accounting, and maintenance management. For example, many electronic exchange switches were originally paired with a PDP-11 class machine that tracked maintenance and fault data for the office to replace the original paper tickets. Over time these machines gained more capabilities such as automatic line testing, but they remained "external" to the phone switch for reliability reasons. Billing was, for the most part, done by physically removing the tapes from the switch's data recorder and loading it into a computer to totalize and generate bills. Later on this process was automated but still with a clear separation of the switching system and the billing system.

Post 1984 AT&T, no longer a regulated monopoly, was permitted to compete directly with IBM, Lexmark, etc with their computing division. The venture was an infamous failure, and AT&T never saw real traction as a computer vendor. AT&T's rapid technical development but failure to actually build a sustainable computer business is what brought us the UNIX ecosystem: they built it, but they couldn't sell it, so eventually they basically gave up and licensed it to anyone with a pulse.


good question. i sent al aho an e-mail. hopefully he'll reply with some interesting history. watch this space!


Was there anything specific about Bell Labs or working for the Bell system that influenced the design of UNIX and the userland tools? Did the needs of the Bell system influence any of this work, or was it pure computer science research (and the needs of the computer science research community) that influenced these designs? If awk were "designed for someone" who was it designed for? Did anyone who worked on UNIX or the computing stuff at Bell labs care about building things for telephone network operations or was it pure research and technology development for the art of it?

--

Good question! Many books have been written about why Bell Labs was so successful in creating research innovations that changed the world.

Bell Labs was interested in building operating systems even before the 1950s for use in AT&T's operations support systems for the global telephone network. (You might look at the Wikipedia article on BESYS.) In 1964 Bell Labs joined with MIT and General Electric on a project to create an advanced operating system called Multics. (The article Unix and Multics at https://multicians.org/unix.html provides a lot of useful background. Also see Dennis Ritchie's article, The Evolution of the Unix Time-sharing System, which is a must read on the early development of Unix.)

In the 1960s Ken Thompson at the Computing Sciences Research Center at Bell Labs, Murray Hill, NJ worked on the Multics project. When Bell Labs pulled out of the Multics project in 1969, Ken Thompson on his own decided to build a much simpler operating system which became known as Unix. Dennis Ritchie, also in the CSRC at Bell Labs, joined Ken in creating Unix. Dennis invented the C programming language for this effort. Subsequently, many people in the CSRC and elsewhere contributed to the development of Unix. Doug McIlroy (Ken Thompson's boss and the inventor of coroutines and pipes) deserves a lot of credit for shepherding the development of Unix.

The reason Unix and C became so successful is that they were designed by individuals with very good technical taste and not by committees. The Bell Labs research culture also let individuals have great discretion in determining what direction their research should take and one of the important functions of management was to provide adequate resources to make research projects successful and ultimately beneficial to the development of the global telecommunications infrastructure. Another motivating force for Bell Labs research was to create a patent portfolio for the company that could be used to get access to patents of other companies by cross-licensing.

What I found immensely gratifying by being involved with Unix was what Don Knuth had once told me. He said, the best theory is motivated by practice and the best practice by theory.

As for the motivation behind awk, I recommend looking at https://www2.computerworld.com.au/article/216844/a-z_program...


My impression from a past life working with the innards of Windows was that Microsoft weaponized terrible software engineering practices into a business advantage. The awful underlying implementations made it very hard for competitors to reverse engineer anything and build compatible products. Sometimes, it even made it exceedingly hard to write drivers for Windows because of unresolvable race conditions in the kernel.

Some of the best examples are CIFS/SMB and DCE/RPC over SMB which were bottomless pits of awful.

One of my favorite anecdotes is the SMB print service described as the work of a clueless college student by one of the Samba developers who reverse engineered it.

Microsoft cleaned up very quickly once DOJ forced SMB 2.0 (or whatever it was called) to be documented.


The purpose of the registry is as an obstacle to simplicity.


In theory it is a great concept: "let's put all the configuration in one place." In reality it ends up being super annoying. It is hard to say why, but after a bit of thought I think my main objection to the registry is that it is yet another tree, but this one is out of band. Windows is already bad about this, what with the drive letters, and the way the shell (Explorer) tries to present its own magical imaginary view of the filesystem. The registry just takes that and dials it to 11: now you have a special tree that needs its own specific tools to use.

I think one of the key innovations of Unix was the Unix filesystem and the way it made everything one tree; one unified interface for everything is an amazing concept. In fact I think that the real innovation of Unix was its simplicity. Later this was messed up (yeah Berkeley, I am putting most of the blame for that on you) with sockets, sysctl, etc.: in short, every Unix interface that ignores the one true API, the filesystem (which is for the most part open, read, write, seek, close).

In conclusion, whenever I see a tree whose maker decided that no, we are a special snowflake, and we are not going to attach this tree to the main tree of data on your system (the filesystem), it bugs me. The rogues' gallery here includes GNOME config, sysctl, D-Bus, and yes, the Windows registry.



Thanks! Macroexpanded:

Why the Windows Registry sucks … technically - https://news.ycombinator.com/item?id=1134307 - Feb 2010 (61 comments)

(it's on my list to write software to do this automatically for links to past threads - or maybe to offer users the option when they post a comment with HN links in it)
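The link-detection half of that idea is mostly a regex over the comment text; a minimal sketch (hypothetical, not HN's actual code):

```python
import re

# Match HN item links and pull out the numeric item id. Handles plain
# "item?id=123" links as well as links with a comment anchor appended.
HN_ITEM = re.compile(r"news\.ycombinator\.com/item\?id=(\d+)")

def extract_item_ids(text):
    """Return the distinct HN item ids referenced in a block of text,
    in order of first appearance."""
    seen = []
    for m in HN_ITEM.finditer(text):
        if m.group(1) not in seen:
            seen.append(m.group(1))
    return seen
```

The harder part, of course, is looking those ids up and deciding which past threads are worth surfacing.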


The article was shared yesterday by the author in a comment on yesterday's discussion "Embedding an EXE inside a .REG file with automatic execution":

https://news.ycombinator.com/item?id=32249845#32269623

Also mentioned tools for working directly with registry files from Linux (including within VMs).


I have worked with the Windows registry for at least two decades and it is one of the things I really hate about the design of Windows (and I like the way config files are kept in Linux). A very big virtual file system (with a parallel access API of its own) inside a very big (at that time, slower) file system (and its own API). I wish Windows limited its config to plain files in the file system. Simple and flat, instead of this mess of over 100 MBs which has itself been a source of many vulnerabilities. The legacy baggage is also terrible.


Text files are better because I can use any text editor to update or change them, and leave comments/notes in the file.

I don't understand all the talk about performance - programs should be reading config once on start up and that's it.

Registry keys I guess serve the programmer better by providing types and paths, but user-space libraries can provide those too, and since it's not kernel-level string-handling code, a lot more safely. Then again, for all I know ntdll.dll or kernel32.dll or whatever does actually run its registry API functions in userspace.

Also, since the registry doesn't provide an opportunity for self-documentation, updating configuration using a common, trustable tool (regedit) is generally something you do only as a last resort when armed with knowledge or documentation that you'd have to hunt for elsewhere. Realistically you'll have to use control panel applets or settings dialogs in programs themselves.
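For reference, this is roughly the shape of a .reg export (the ExampleApp key and its values are made up). Note that `;` comments are legal in the file but are thrown away on import, since the registry itself has nowhere to store them; that is exactly the self-documentation gap described above:

```
Windows Registry Editor Version 5.00

; Comments survive only in this file, not in the registry itself.
[HKEY_CURRENT_USER\Software\ExampleApp]
"InstallDir"="C:\\Program Files\\ExampleApp"
"RunCount"=dword:0000002a
```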


> I don't understand all the talk about performance - programs should be reading config once on start up and that's it.

non-microsoft programs were not intended to use it when it was designed. the performance considerations are for windows itself.

Microsoft's guidance has always been for third party applications to manage their own configuration in whatever way they choose, so long as it isn't the registry.

developers ignore that, use the registry, and blame Microsoft for keeping the registry around even though Microsoft promises to do everything they can to keep things backwards compatible.


non-microsoft programs were not intended to use it when it was designed. the performance considerations are for windows itself.

That's not what I remember. Someone at Microsoft had a personal jihad against application-specific .INI files scattered all over the place, and came up with the registry as a centralized solution.

I still use .INI files myself, although Microsoft did successfully manage to get me to stop putting them in the executable directory where they belong. :-P


About performance: try monitoring a non-trivial Windows program with Procmon during its start. I found the intensity of registry activity is quite extreme. Maybe many of those operations (many are by Windows itself) are not necessary, but in the current state, I feel happy that the registry is designed to be performant.


The alternatives sorta fail the KISS principle, since in general the registry grew out of the Win3.x .ini config APIs, which are sorta the equivalent of the files in /etc, but the format was more regular.

Given it's mostly read-only data, I suspect no one thought it was going to grow into the huge monstrosity it's become. Yet at the same time, unless you're looking for a full ACID database, a lightweight solution would probably sit a lot closer to the registry API and format than to something like SQLite.

Someone at some point should have probably said enough, and duplicated some effort and kept the registry for OS/etc settings while splitting off the people who want something closer to an ACID database into an actual database. Of course that has happened a couple times now (aka MDB?).


I'd rather have a registry than a bunch of INI/JSON/XML files and each program bringing its own parser


/etc files have the key advantage that you can extensively document the conf for fellow sysadmins (yourself included) right there next to the actual setting most of the time.


https://www.libelektra.org/home is an attempt to bring a registry to linux. It doesn't have the problems of its windows equivalent. I wish this project was more popular.


Why?

The windows registry is a disaster.

The whole idea is wrong. You want to keep configuration with the thing being configured.

Most Mac installs are "drag the folder on to your disk". You are done.


Microsoft didn't even intend to use it for long. developers discovered apis for it and started using it despite it being unsupported at the time.

Microsoft's extreme backwards compatibility decisions have kept it around since then, and now since it is still supported, it is still used.

the registry was first intended to be an internal implementation detail while something better was designed. us hackers ruined that plan, but we're still happy to blame Microsoft for their compatibility promises and our own misuse of the registry.


I agree. I much prefer my software to store its configuration in AppData on Windows and in the user's home directory on Linux.

It makes migrating all your app settings to new machines or fresh installs way easier.


Isn't app configuration in Mac kept in property list files under ~/Library/Preferences/?


Windows 10 and 11 have moved away from the registry for applications.

It's in

    C:\Users\<username>\AppData\Local\<AppName>
for per-user information

I haven't had to go into the registry on Windows 10 or 11 except for enabling beta/early access features.


>I haven't had to go into the registry on Windows 10 or 11 except for enabling beta/early access features.

I have to edit the registry on every Windows install to have: sane multi-monitor config, working macro keys (suppress the "office" key advertisement in my $300 copy of Windows), suppress OneDrive Personal on a machine that syncs to an O365 tenant, have a functional file explorer, and I'm sure a half-dozen other things that don't come to mind immediately. (On top of numerous group policies, which are mostly just sanctioned registry tweaks, along with programs to remove telemetry from 10/11 which I'm confident do plenty of registry tweaking on their own.)

This is all to control first-party functionality; before I've even installed a third party program. So, to the contrary, I can't remember the last time I went into %APPDATA% to tweak something Microsoft-related.


That's a new alternative to the registry, and some prominent programmers like Raymond Chen have promoted using it, but it's still up to app developers which they want to use. I doubt that the registry is going away any time soon.


%AppData% and its roaming friends have been around since Windows 95. It's not "new". It's just that we've finally gotten some developers to understand what it is for this many decades later, with one of the big humps being when Vista added UAC and security things that should have been obvious in documentation were finally enforced.


That's been around since Windows NT. Whether or not Microsoft uses it for their own (in-box or out-of-box) products is a different question.


AppData\Local or AppData\Roaming, depending on whether the setting is machine-specific or machine-independent.
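That split can be sketched cross-platform; a best-effort helper assuming the standard environment variables (APPDATA, LOCALAPPDATA, XDG_CONFIG_HOME) and the usual fallbacks:

```python
import os

def user_config_dir(app_name, roaming=False):
    """Best-effort per-user config directory for an application.

    On Windows, AppData\\Roaming follows the user between machines on a
    domain, while AppData\\Local stays machine-specific. Elsewhere,
    fall back to the XDG convention (~/.config).
    """
    if os.name == "nt":
        env = "APPDATA" if roaming else "LOCALAPPDATA"
        base = os.environ.get(env) or os.path.expanduser("~")
    else:
        base = os.environ.get("XDG_CONFIG_HOME") or os.path.join(
            os.path.expanduser("~"), ".config")
    return os.path.join(base, app_name)
```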


The Windows registry is sort of a copy of the Mac's "resource fork". The original Mac had files with both a data fork, the file contents, and a "resource fork", which is a tree-like database. Preferences and such were stored in resource forks. This was a good idea, implemented badly. Because it was originally designed for floppy disks, writes were very expensive. So resource forks were left open in an inconsistent state after modification, to be closed at program exit. After a crash, resource forks were damaged. Unfortunately, this flaw persisted long after the Mac line got hard drives and more speed.

Apple was on the right track with the concept that most programs needed a database for their state, but, because of the cram job needed to squeeze the original MacOS into 128K of RAM, got stuck on a bad design. Microsoft didn't have that excuse with their Registry.


files on NTFS also have alternate streams. this is how Windows knows you're executing a file downloaded from the internet via internet explorer, for example.

it's easy to hide stuff in those streams as well.


There's the CurrentVersion key that everything seems to be under


>The Registry binary format has all the aspects of a filesystem: things corresponding to directories, inodes, extended attributes etc.

Like? Always thought it was an hierarchical database.


>> The Registry binary format has all the aspects of a filesystem: things corresponding to directories, inodes, extended attributes etc.

> Like? Always thought it was an hierarchical database.

A filesystem is a hierarchical database.


I probably miss something then. The article also states that it is not a database.


It is a long standing debate, see this thread:

https://news.ycombinator.com/item?id=27939728

basically it has some characteristics of a database and some characteristics of a filesystem (and a filesystem is actually a particular form of database), not entirely unlike the "over or under":

https://en.wikipedia.org/wiki/Toilet_paper_orientation

it is an endless one, both views have their merits (but it is "over" and a filesystem ;))
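One way to see why both camps have a point: strip away the on-disk format and a registry hive (or a filesystem) is just a tree of named nodes with values hanging off them. A toy model for illustration only, not how NT actually stores hives:

```python
class Key:
    """A toy registry/filesystem node: subkeys play the role of
    directories, values the role of files (or extended attributes,
    depending on which camp you're in)."""

    def __init__(self):
        self.subkeys = {}  # name -> Key
        self.values = {}   # name -> data

    def open(self, path, create=False):
        """Walk a backslash-separated path, RegOpenKeyEx-style."""
        node = self
        for part in path.split("\\"):
            if part not in node.subkeys:
                if not create:
                    raise KeyError(path)
                node.subkeys[part] = Key()
            node = node.subkeys[part]
        return node

root = Key()
app = root.open(r"Software\ExampleApp", create=True)
app.values["InstallDir"] = r"C:\Program Files\ExampleApp"
```

Whether you call the walk "opening a key" or "resolving a path" is largely a matter of vocabulary.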


It’s not a relational database is probably what they meant.


That almost 30 years later windows users still have to suffer with the registry is the bitter price of backwards-compatibility.


I think articles like these miss the point that Microsoft's stuff is not intended to be designed in a predictable way. Their goal, above all else, is to protect their monopoly. Making things convoluted, unpredictable, and undocumented is just one way of doing that. Silently failing, and non-specific error messages are another.


When a problem like this has equally bad solutions between Windows or Linux (or macos) it means the problem really isn't solved yet. This is kind of exciting, actually. It is one of those spaces where the XKCD "standards" joke plays out, but there really is an opportunity for someone to come along and solve it.


fun fact: Blender save files are also dumps of memory to disk.

mostly.


Is the registry used for IO - like The mac IO KIt ?


With zero solid evidence to support my belief, I am certain that somewhere in Redmond there is a Windows instance, running "Office," atop a Linux kernel.


I'd be more inclined to believe in just a Linux instance running Office.

The NT kernel is so incredibly different from the Linux kernel, and the Windows shell takes such extensive advantage of it, that I can't even begin to imagine the compatibility layer required.


Apple supported OS 9 running 'containerized' atop what was essentially Openstep for a few years.

Worst-case I could see a Windows GUI running atop a heavily-symlinked Debian, and all Office things running Web-based. Everything else that can't run native could be in obscured VMs. At some point, the NT kernel isn't going to cut it.


> At some point, the NT kernel isn't going to cut it.

If you'd like, would you please say more on this point? What leads you to believe that?


This article is arguably better titled "Why Windows sucks technically" or even "Why Windows sucks"


> This is a far cry from /etc/progname.conf in Linux.

Some of these arguments are not really in good faith. First, this isn't a technical issue with the registry, but with how it's been used over the years and never "refactored" - a distinct issue. Second, I dare you to run "ls /etc" and claim it's not a mess...

Besides, most of the post is how anyone with a text editor and admin rights can bork/hack the machine. As if someone with admin rights couldn't bork/hack the machine in a billion ways. Or if someone with admin rights on linux couldn't write garbage to /dev/sda1 and "hide" something from the OS.


I mean, yeah - /etc is messy. Every configuration file in /etc has yet another bespoke configuration language. And somewhere on your system lies a corresponding buggy, half implemented parser for it.

But /etc has gotten messy in the same way a desk gets messy. There's documents everywhere, but if you pick anything up and take a look at it, you can usually (with the help of google) figure out what that file does and how you can change it. It helps that almost every file in /etc is owned by a single program or library. And just about all of those programs have documentation. (Or, at worst, source code).

In comparison, the windows registry feels like an old, disorganized community's storage space. On first glance things look organized, but actually there's been dozens of half hearted attempts to organize everything over the years by different people, each with their own idea of where everything should go. You don't recognise half the stuff in there. Random ex-employees have been dumping random objects in there for years. But it's impossible to tell at a glance what random objects are critically important, and what is trash. Everyone else has this problem too, so nobody throws anything out, and it's just accreted.

Braver people than you have fallen on their swords trying to tame the windows registry. There be dragons.

I'd take /etc over the windows registry any day of the week. /etc is messy at the surface level. The windows registry is messy like a fractal.


> And just about all of those programs have documentation. (Or, at worst, source code).

Having documentation has nothing to do with reg/etc, win/nix, or whatever.

You have documentation for something in /etc? 99% chance that came from the distro packages.


I love editing files in /etc ending in conf! Some are pseudo-INI (OpenSSL), some are pseudo-XML (Apache2 and friends), some are some kind of unholy amalgamation between C and YAML (nginx, dhcp daemons), others are some kind of TOML derivative (systemd, NetworkManager) and there's also some diet JSON in there! Netplan uses YAML, of course, though JSON will probably also parse. It's always a fun adventure to reverse engineer these file formats.

Of course, most important files contain whitespace separated lines of configuration of which the meaning is only clear if you read the comments above it (cron, fstab, crypttab) or secretly bash scripts (GRUB config and many other files in /etc/default).

Then there's user configuration stuff. Sure, HKLM vs HKCU is kind of weird, but the user configuration structure in the Linux home directory is an argument against intelligent design. ~/.config, ~/.programname, ~/.local or ~/snap/<name>/current/<randomfile>? Roll the dice and find out! You may be able to find the setting you're looking for in the DConf Editor but if you can't find it there, grep and prayer is your best bet.

The Windows registry may be the result of a flawed execution, but I'll take it over the mess in Linux any day. Sadly, the registry has fallen out of fashion in Windows, so now you're managing random configuration files somewhere in %USERPROFILE% (if you're lucky and the programmer hasn't hardcoded C:\Users\<username> as a path) just like on Linux. By the way, who needs the "hidden" FS flag anyway? Just start the filename with a period and pretend the ls-bug-dressed-up-as-a-feature for hiding files and folders is a standard every platform sticks to!
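To the format-zoo point: even the best-behaved member, the INI-ish flavor systemd and friends use, needs care. A sketch with Python's stdlib configparser and a made-up unit file:

```python
import configparser

# A made-up systemd-style unit: sections in brackets, Key=Value pairs.
UNIT = """
[Unit]
Description=Example service

[Service]
ExecStart=/usr/bin/example --flag
Restart=on-failure
"""

# Even "it's just INI" hides quirks: systemd keys are case-sensitive
# (configparser lowercases by default) and may legally repeat
# (configparser raises DuplicateOptionError), so a stock parser only
# gets you part of the way.
cp = configparser.ConfigParser()
cp.optionxform = str  # preserve key case, as systemd does
cp.read_string(UNIT)
restart_policy = cp["Service"]["Restart"]
```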


> some kind of TOML derivative (systemd, NetworkManager)

A bit off-topic, but it's funny to me that you identified these as a "TOML derivative", when both of the mentioned pieces of software predate TOML. Systemd uses D-Bus's flavor of INI. IDK if NetworkManager uses exactly that same flavor or not, but it's pretty similar at least.


Fair enough! I don't know the exact formats and their history, all of that was developed long before I started really using Linux.

Either way, when I edited the files for the first time, it's like "oh, it's like TOML but...". They're not necessarily bad, though nesting of configuration groups isn't always as obvious. I suppose these are part of the evolution of Linux config file design!


> I love editing files in /etc ending in conf! Some are pseudo-INI (OpenSSL), some are pseudo-XML (Apache2 and friends), some are some kind of unholy amalgamation between C and YAML (nginx, dhcp daemons), others are some kind of TOML derivative (systemd, NetworkManager) and there's also some diet JSON in there! Netplan uses YAML, of course, though JSON will probably also parse. It's always a fun adventure to reverse engineer these file formats.

Sure it would be great if all configuration was written in a single configuration format. Life would also be so much easier if there was only one ultimate programming language so we would only have to learn one. Unfortunately reality is messy and for a variety of reasons we have different formats. As a side note, I don't think I ever ran into a syntax error for a system configuration file, because I couldn't figure out the format. The first time that happened to me was when editing a yaml file for CI.

> Of course, most important files contain whitespace separated lines of configuration of which the meaning is only clear if you read the comments above it (cron, fstab, crypttab) or secretly bash scripts (GRUB config and many other files in /etc/default).

Well, there is "man fstab", "man 5 cron", etc., so the file formats and options are documented. And complaining that the default files are so well commented that one understands the different options without actually reading the docs is just weird to me. How is that worse than some random registry key that is not documented anywhere, without comments (they are not possible AFAIK), and without knowing what type it should take?

> Then there's user configuration stuff. Sure, HKLM vs HKCU is kind of weird, but the user configuration structure in the Linux home directory is an argument against intelligent design. ~/.config, ~/.programname, ~/.local or ~/snap/<name>/current/<randomfile>?

Which program saves configuration inside ~/.local? I'm not aware of any and they generally should not. Also, adding snap to that list is a bit beside the point; I don't think containers typically save their configuration in the registry either?

> Roll the dice and find out! You may be able to find the setting you're looking for in the DConf Editor but if you can't find it there, grep and prayer is your best bet.

I actually dislike that gnome uses dconf (which is a better organised registry), I think it would be better if they instead would have text files as well. I understand it's largely for performance reasons.

> The Windows registry may be the result of a flawed execution, but I'll take it over the mess in Linux any day.

So how do you know what registry keys to change to achieve a certain configuration (without documentation, because you didn't like that on Linux either)?

> Sadly, the registry has fallen out of fashion in Windows so now managing random configuration files somewhere in %USERPROFILE% (if you're lucky and the programmer hasn't hardcoded C:\Users\<username> as a path) just like on Linux. By the way, who needs the "hidden" FS flag anyway? Just start the filename with a period and pretend the ls-bug-dressed-up-as-a-feature for hiding files and folders is a standard every platform sticks to!


> Sure it would be great if all configuration was written in a single configuration format.

I think the root cause is there is and was no readily available system standard/interface. I would claim that the majority of the files are doing boring things and could have, and would have, been made in some standard format, and possibly placed in some standard organization, if a system standard had existed. Or, perhaps I've too heavily discounting the desire for nerds to create something new, even when reasonable, but not perfect, alternatives already exist.

Systemd showed me how homogenization and standards are generally disliked, when it comes to things around configuration files, even if there's absolutely massive utility.


> As a side note, I don't think I ever ran into a syntax error for a system configuration file, because I couldn't figure out the format

I remember struggling to get some open source VPN package to work right. I swear I've read through the manuals three or four times but it just wouldn't work and the error messages were meaningless. I know most of the obscure formats now out of experience, but it's still not great. DNS BIND configuration and zones (and the accompanying AppArmor configuration) also caused me more trouble than necessary. Configuring and debugging inetd configuration was also something I never hope to do again.

> Well there is "man fstab", "man 5 cron" etc

Of course there is, but those aren't exactly light reading. It's not that the formats are difficult to find, it's that there's no consistency to the formats themselves with nameless columns that you need to figure out by reading several paragraphs of text.

> Which program saves configuration inside ~.local? I'm not aware of any and they generally should not. Also adding adding snap to that list is a bit besides the point, I don't think containers typically save their configuration in the registry either?

In my ~/.local/etc I see fish and bash_completion.d, for example. I don't know what programs created those, but they're there. I see some PipeWire files in ~/.local/state. ~/.local/share contains many files ending in ".conf" (most of them Flatpak, but also Kodi and GnuPG). I prefer them stuffing their config in .local rather than ignoring XDG, though. IMO Snap fits the list perfectly because it's not sold as "containers", it's sold as an "app store" even though it's focused on GUI containers. And I'm still mad about Snap not even bothering to use a capital letter in their home directory name.

I like dconf in that most settings are actually documented in context. They're searchable, use a standard format, and have structure. The ability to distinguish default values from custom settings is also very nice. IMO it's the Windows Registry but with a good editor and an even worse API.

> So how do you know what registry keys to change to achieve a certain configuration

Most of the time, I can use the find tool to find exactly what I need in the registry. I barely need to touch it anyway since most Windows settings are configurable from the GUI. Sure, it's certainly not the best config management system, but its design is much better than the hodgepodge of configuration files unrelated systems use.

I admit that on disk, the Windows registry format is atrocious. However, the concept of a single, unified, backuppable, remotely manageable configuration system is just much better than "let's stuff some files in /etc and let people find out by making them go through our docs". I don't know where to look for config files on Linux, on most Windows tools I at least know what program I need to open to edit the configuration. I'd like DConf or at least XDG to get used more often because IMO the Linux ecosystem is the worst of two options.


> Some of these arguments are not really in good faith. First, this isn't a technical issue with the registry, but with how it's been used over the years and never "refactored" - a distinct issue. Second, I dare you to run "ls /etc" and claim it's not a mess...

I'd suggest that last bit is not uttered in good faith.

20+ years ago you could install the Microsoft Office suite on the Microsoft OS du jour and you'd find about 12k new registry entries.

You could then uninstall that same Microsoft Office suite from your OS, and ... you'd find no decrease in the size of your registry.

In the same epoch -- if you installed a hundred new applications on your Debian GNU/Linux OS, you may end up with several dozen new directory hierarchies under your /etc directory. If you then uninstalled those applications, the /etc/ entries would be gone. If they weren't, you'd file a report on the Debian BTS and that oversight would be resolved within a few months.

For this reason I think it's disingenuous to claim running ls /etc/ is a mess that's equivalent, in any sense of the word, to the microsoft registry.


I'm sure MS has enough issues with backwards compatibility. Can't imagine handling multiple registry implementations on top of it.


If your argument relies on how things were in late 90s - early 00s, I don't know if it's really worth anything today.


That's probably a reasonable retort.

My response was to the claims that we aren't addressing the technical issues with the registry -- that is, if Microsoft was unable/unwilling to handle registry hygiene, what hope the rest of us? I have not run the same test on a modern OS / Office install and uninstall, but at the risk of exposing my biases, I'm not hugely optimistic.

Tangential aside - at the time I was working with a large Australian telco where we'd developed a SOE with robust roaming plus (effectively) package management for ~1500 desktop applications on a Microsoft Windows 3.x platform, all of which was largely kiboshed by the introduction of the registry.

Secondly, I feel I've already addressed your 'at least it's not as bad as the /etc mess' by pointing out that /etc is not a mess on some, perhaps all, GNU/Linux distros with a good package management system.

To paraphrase - if your argument relies on how slackware handled /etc in the late 90's, I don't know if that's worth anything today.


> Second, I dare you to run "ls /etc" and claim it's not a mess...

In principle, sure. But in practice, using /etc to edit configuration data is not that hard, and every Linux user does it all the time. Different software uses different configuration formats, sure (though it's worth noting that NixOS is doing really clever things to solve that problem), but it's really not that hard or scary.

Manually editing the registry though... That way lies madness. It's much scarier and it's much more fragile.


> Manually editing the registry though... That way lies madness. It's much scarier and it's much more fragile.

How... exactly? There's no way to create a syntax error as trivially as you can in config files. How is using regedit or PowerShell commands more fragile? (I understand "scarier", in the way that everything you're not used to is scary.)


Well, the article points to a way of preventing subsequent entries from being read by having an entry in non-alphabetical order. But how often have you run into syntax errors for system configuration files in Linux? I can't say that this has been much of an issue in my experience.


Start making raw edits directly to an ext4 disk rather than using published libs/APIs, and you'll soon find some new horrors to be scared about.


That's because the article is about doing binary edits to the registry files themselves, something no application or user should ever do. Instead, changes should be made through the API or the regedit program.


> Second, I dare you to run "ls /etc" and claim it's not a mess...

On Unix, "man my.conf" will work for most sane programs.


Offhand, doesn't this kind of perfectly explain systemd fear? Probably what people had in the back of their heads.


(genuine question) What was the historical fear with systemd? That behaviors get hidden behind binaries rather than shell scripts? Has the fear been realized?

All systemd units can be edited easily, they're files in a filesystem. Journald logs are binary but journalctl gives you ways to output it.
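For instance, a minimal unit file (the name and paths here are made up) lives at an ordinary filesystem path and is plain INI-style text:

```ini
# /etc/systemd/system/myapp.service  (hypothetical example)
[Unit]
Description=Example service
After=network.target

[Service]
ExecStart=/usr/local/bin/myapp
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

You can edit it with any text editor, reload with "systemctl daemon-reload", and read its (binary-stored) logs with "journalctl -u myapp.service".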


That's what I'm describing as a reaction at the time even if it didn't turn out to be a big deal.

I'm a long time Linux guy and when I heard about systemd, the first thing it made me think of offhand was the Windows Registry. Now I'm not super-deep on that level of Linux but I could understand the knee-jerk reaction at the time. But you're correct technically, and I think time has proven that there's not much of an issue.


How so? systemd uses text files for its configuration, documented in systemd.syntax(7).


I know. That's why I've been trying to describe it more as a "feeling" than logical?


Therein lies the secret of corporate success. This was not done by an expert - neither by a bearded guru who learned things the hard way nor by an academic who studied databases and data representation, but by a reasonably competent employee. Sort of like a corporate equivalent of the lowest bidder. Thus, the code does only what the immediate requirements were at the time.


And yet the registry is used by billions of people, every day, without issue. So the technical problems must be minor in practice, right? This is all very academic.


Very dismissive take. How many billions of dollars in man-hours have registry errors cost, with the only solution often being to reinstall Windows?


But how can we know if the part that matters is real or not? How many billions of dollars due to registry-specific errors that wouldn't have happened if the app simply used a text file? That's a quantity we can't so easily come up with.

