Why the Windows Registry sucks technically (2010) (rwmj.wordpress.com)
418 points by azalemeth on July 29, 2022 | 332 comments



I really appreciate when people use technical facts to criticize something, as here. It is a well-written takedown of the implementation of the Windows Registry, and it holds up as well today as it did in 2010.

I suspect that if the concept of the registry were invented today it would look more like a database (e.g. SQLite), although organizing it like a virtual filesystem does have a certain appeal, and unfortunately I don't know of a database engine that supports such a layout. The Registry's key-value pairs map 1:1 onto a database table's column-value pairs; it is the multiple tiers of organizing "folders" above them that are the difficult implementation detail.
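For what it's worth, the "folders above the key-value pairs" part is expressible in a plain relational schema. A toy sketch (not how any real registry hive is stored; all table and type names here are made up), modeling keys as a self-referencing table with typed values hanging off each key:

```python
import sqlite3

# Toy sketch only: one way a registry-like tree could sit in SQLite, with
# keys as a self-referencing (adjacency-list) table and typed values
# attached to each key. Names are invented for illustration.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE keys (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES keys(id),
    name      TEXT NOT NULL,
    UNIQUE (parent_id, name)
);
CREATE TABLE vals (
    key_id INTEGER NOT NULL REFERENCES keys(id),
    name   TEXT NOT NULL,
    type   TEXT NOT NULL,     -- e.g. 'REG_SZ', 'REG_DWORD'
    data   BLOB,
    UNIQUE (key_id, name)
);
""")

def open_key(path):
    """Walk a backslash-separated path, creating keys as needed."""
    parent = None
    for part in path.split("\\"):
        row = db.execute(
            "SELECT id FROM keys WHERE parent_id IS ? AND name = ?",
            (parent, part)).fetchone()
        parent = row[0] if row else db.execute(
            "INSERT INTO keys (parent_id, name) VALUES (?, ?)",
            (parent, part)).lastrowid
    return parent

k = open_key(r"HKEY_CURRENT_USER\Software\Example")
db.execute("INSERT INTO vals VALUES (?, 'Port', 'REG_DWORD', ?)", (k, 8080))
print(db.execute("SELECT data FROM vals WHERE key_id = ?", (k,)).fetchone()[0])
```

The adjacency list gives you the tree; everything else (transactions, typed columns, concurrent readers) comes with the database engine for free.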

The UNIX way is undeniably more flexible, since it isn't a virtual filesystem, it is just a filesystem. The problem is that everyone invented their own configuration format to store configuration data in /etc, and there's no format-agnostic API to access that information (you can open it, but can you understand it?).

Both UNIX and Windows suffer from the same orphan problem: information can be written, the application removed, and it is then unsafe ever to remove the leftovers, since you may not know the author and/or all consumers.


> The UNIX way is undeniably more flexible since it isn't a virtual filesystem, it is just a filesystem.

Unix simply doesn't have a "way" in that regard, other than a loose convention to put text files in "/etc". Every application comes up with its own format. Parsing that file is application-specific. Updating a value in a text file means rewriting the whole file. It is a technically inferior approach that has survived over time because text files are still text files.


> Unix simply doesn't have a "way" in that regard, other than a loose convention to put text files in "/etc".

So? The registry doesn't have much of a way either - the actual fields are simply a loose convention.

> Every application comes up with its own format.

Same with Windows applications - one application might store an IP address as a dotted-quad string, another might store it as a single 32-bit integer.
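To make the point concrete, here are the two conventions side by side in a hedged Python sketch (the apps are hypothetical, and the big-endian packing is itself just a convention - exactly the problem):

```python
import socket
import struct

# Hypothetical apps storing the same address two ways, as one Windows app
# might use REG_SZ and another REG_DWORD: a dotted-quad string vs. a
# single 32-bit integer (big-endian here).
ip_string = "192.168.0.1"
ip_int = struct.unpack(">I", socket.inet_aton(ip_string))[0]

print(ip_int)                                       # 3232235521
print(socket.inet_ntoa(struct.pack(">I", ip_int)))  # 192.168.0.1
```

Both encodings are self-consistent, but nothing in the registry's type system tells a third party which one a given value uses.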

> Parsing that file is application-specific.

Same for Windows applications; the subtree for any application is very specific to that application and almost always differs from the subtree for other applications.

> Updating a value in a text file means re-writing the whole file again.

Not a problem, when breaking a value in a text file breaks only that one application. Breaking the registry almost always breaks something else, if not the entire system.

I'm not saying files in /etc are without their problems; I'm saying that all of /etc's problems are already present in the registry, and the registry adds a few more of its own.


> the actual fields are simply a loose convention.

Never mind that some Windows programs end up just saving a blob in it and totally ignoring the typing.

Who cares about the field type if you can save everything in a single blob?


Even Microsoft likes to save blobs.

[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\BrowserEmulation\ClearableListData]\UserFilter


I can’t and shouldn’t be able to write to /etc as an ordinary or guest user.

Re-reading values usually isn't done by apps.

You can get corrupted files if two instances fight for the same file. If that’s your thing, please use mongo. Also, please provide support to all the users.

Lastly, if you mess up certain configuration or data files, your system won't boot either, on ANY system.

Sounds like you’re just one of many anti-windows people, exactly what was pointed out about people bashing the registry.


> I can’t and shouldn’t be able to write to /etc as an ordinary or guest user.

You're right, you shouldn't. If you are able to, your distro is very odd and I'd recommend seeking a new one.

> Sounds like you’re just one of many anti-windows people, exactly what was pointed out about people bashing the registry.

There's no need to attack people.


This is emphatically NOT true.

If you absolutely could not write to /etc, then you would never be able to change your password.

The passwd, chsh, chfn, and other utilities allow a non-privileged user to make controlled changes to privileged files via the setuid/setgid bits on those executables.

A user can trigger controlled writes to files in /etc.
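The mechanism is visible right in the file mode: passwd is classically installed mode 4755, and the setuid bit shows up as an 's' in the owner-execute slot. A small Python sketch decoding that classic mode (the exact path and mode on a given system may differ):

```python
import stat

# Mode 4755 on a regular file: the 's' in the owner-execute position is
# the setuid bit, which makes the kernel run the program with the file
# owner's effective uid (root, for passwd) at exec time. No special code
# in passwd is needed to "become" root.
mode = stat.S_IFREG | 0o4755
print(stat.filemode(mode))        # -rwsr-xr-x
print(bool(mode & stat.S_ISUID))  # True
```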


> If you absolutely could not write to /etc, then you would never be able to change your password.

Since we're talking about Linux, I would refrain from using absolute clauses like "never".

systemd-homed allows you to create "portable" user profiles which contain everything pertaining to a particular user, including his password. So user profiles (including your files, etc) can be moved between computers simply by rsync'ing that directory, or putting it on a network share.

`homectl passwd` (which changes a portable user's password) does not require writable /etc.

I think I've seen a few other solutions like that.

https://wiki.archlinux.org/title/Systemd-homed

https://systemd.io/HOME_DIRECTORY/


Users cannot write to /etc. Programs like passwd can, because they have the setuid bit set and run as root regardless of which user started the process.


It's more so that users can ask the administrator to write to `/etc/` for them.

The administrator in this case has an automated tool that handles this request, but the actual user the writing occurs under is the root user.


That's like saying you've broken RSA because you can cause controlled reads. All you have to do is send the encrypted message to the intended recipient and wait for them to decrypt it.


That's being a bit pedantic innit?


Maybe in OPs opinion, using `passwd` to write to `/etc/passwd` is not the same thing as using the registry API to write to the registry?

Personally, I feel it is the same thing.


> I can’t and shouldn’t be able to write to /etc as an ordinary or guest user

I don't believe you can. What distribution is this?


Point about /etc is that it’s just one part of “configuration”.

There’s no /etc for users. That’s why we get all these dot.dirs. At least some apps use ~/.config/


"$XDG_CONFIG_HOME defines the base directory relative to which user-specific configuration files should be stored. If $XDG_CONFIG_HOME is either not set or empty, a default equal to $HOME/.config should be used."

https://specifications.freedesktop.org/basedir-spec/basedir-...
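The fallback rule quoted above is a one-liner in practice. A minimal sketch (the spec's additional rule about ignoring relative paths is omitted here):

```python
import os

def xdg_config_home(environ=os.environ):
    """Resolve the config base dir per the quoted rule: use
    $XDG_CONFIG_HOME if set and non-empty, otherwise fall back to
    $HOME/.config."""
    value = environ.get("XDG_CONFIG_HOME", "")
    return value or os.path.join(os.path.expanduser("~"), ".config")

print(xdg_config_home({"XDG_CONFIG_HOME": "/tmp/conf"}))  # /tmp/conf
print(xdg_config_home({}))  # e.g. /home/you/.config
```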


.config/ and .local/ are the de facto /etc for users, but there is the set of XDG_ environment variables that are intended to fill that role.


Well, my git and ssh configs, caches and whatnot: almost everything sits directly in my home directory. It's even worse than "My Documents"


Git uses XDG_CONFIG_HOME https://git-scm.com/docs/git-config#Files - putting the config directly in your home is just a choice.


It doesn’t in macOS.

It’s not a choice. It’s the default. It’s like saying it’s a choice to run mongodb without authentication.

Lastly, why should I be using XDG desktop variables on something that doesn't have a desktop environment?


"My Documents" isn't the same as a home directory. The home directory in recent Windows versions is "C:\Users\CurrentUserName". Or better: %homedrive%%homepath%.


I'm pretty sure .local is the per-user /usr, and .config is /etc


You’re both correct; a fair number of applications put their configuration in ~/.local/share even though that wasn’t the intent.


> Updating a value in a text file means re-writing the whole file again. It is a technically-inferior approach that has survived in time because text files are still text files.

It always seems like registries are trying to solve three different problems: configuration, status and process shared state. Registries seem to work poorly compared to text files for configuration, and work somewhat well for some level of shared state. I think to compare functionality with Linux, you would have to compare /etc, .configs and the /proc filesystem.

As for technically inferior, the *ix approach seems to have made the right tradeoffs via the hand of Darwin rather than some brilliant engineering insight. The article did a good job of explaining what's wrong with Windows' implementation of the registry. What is right with the *ix approach is that configuration files are usually read on startup, and only written when configuration is changed (which should be infrequent). This works really well for server software, command-line utilities, and an awful lot of GUI software. For some GUI software, particularly where you have really complex feature sets, we need to save widget state (e.g. the default zoom level), and text files may be problematic for this (e.g. two instances of the app running); this is where the registry really shines.


The biggest issue with etc is that it's owned by root and therefore read only from the application's point of view. So, a great fit for sysadmin-managed configuration, not so great for applications that are GUI-configurable. You end up with a tiered approach of defaults in lib, and cascaded overrides in etc and var.


> The biggest issue with etc is that it's owned by root and therefore read only from the application's point of view. So, a great fit for sysadmin-managed configuration, not so great for applications that are GUI-configurable. You end up with a tiered approach of defaults in lib, and cascaded overrides in etc and var.

What do you mean? You are mixing multiple things; most of /var and /lib are also root-writable only (unless tied to a specific user). For GUI applications you generally end up with a system where /etc holds the system defaults and user configuration lives in $XDG_CONFIG_HOME, which defaults to $HOME/.config. This is the case for the vast majority (>90%) of GUI applications.

If we're talking state, that typically ends up in $HOME/.cache (if it should be temporary) or $HOME/.local/state (if it should be persistent). There are also XDG variables for these, but I forget them at the moment.

So can you elaborate what you mean with your perceived problem?


Most Linux GUI apps either litter your home folder with dotfiles and dotfolders (both disgusting), or they follow the XDG Base Directory specification and place configuration in ${XDG_CONFIG_HOME:-${HOME}/.config}


Most don't even know about XDG. The reason is that there's no central documentation or governing body for userland. By "Linux" we mean the kernel, and for the kernel developers that holds true. The anarchy of userland, meanwhile, is left to distributions and users to handle.


https://specifications.freedesktop.org/basedir-spec/basedir-...

freedesktop.org is that central documentation. Sure, there are applications that don't implement that spec, but it's not for lack of documentation; either the maintainers of those applications don't care, or the applications are older than the spec and moving configuration is not always trivial.


I don't mind dot files or dot folders, but I'm very open to better ideas. My app works on Mac, Linux, and Windows though. Any thoughts?


There are several libraries that handle directory for you in the appropriate OS-specific manner.

One Rust example being https://github.com/dirs-dev/dirs-rs

> The library provides the location of these directories by leveraging the mechanisms defined by

> the XDG base directory and the XDG user directory specifications on Linux and Redox

> the Known Folder API on Windows

> the Standard Directories guidelines on macOS
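A rough Python equivalent of what such libraries compute per platform (a simplified sketch; real libraries like dirs-rs handle more edge cases, and the Windows/macOS fallback paths here are my assumptions rather than any library's verbatim behavior):

```python
import os
import sys

def config_dir(app: str) -> str:
    """Per-OS user config directory, roughly in the spirit of the three
    conventions quoted above. Simplified for illustration."""
    if sys.platform == "win32":
        # Known Folder territory: Roaming AppData
        base = os.environ.get("APPDATA",
                              os.path.expanduser(r"~\AppData\Roaming"))
    elif sys.platform == "darwin":
        # macOS Standard Directories
        base = os.path.expanduser("~/Library/Application Support")
    else:
        # XDG base directory spec
        base = os.environ.get("XDG_CONFIG_HOME") or \
               os.path.expanduser("~/.config")
    return os.path.join(base, app)

print(config_dir("myapp"))
```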


It's a great thing about etc. If your application `foobar` wants to write something to /etc and doesn't have root privileges (which it probably shouldn't), create a subdirectory /etc/foobar with owner `foobar:foobar`, and write there all you want. Many applications do that and it works fine.


Strictly speaking /etc should be relatively static, and dynamic data should go into /var, so you (or the package installation) would create /var/lib/foobar with proper ownership.


> Strictly speaking /etc should be relatively static

I really wish that was true but nasty stuff like wpa_supplicant/, resolv.conf (now usually a systemd-resolved managed symlink), NetworkManager/ and X11/ lives there, very mutable and often constantly changing under the hood.


I have not seen any applications creating a user with the application's name. How can one avoid a naming conflict? Kate (a text editor) might be installed on a system used by Kate (a person). What I have noticed, however, is that there are daemons running as root that act as a "middle man" whenever an application running as a normal user needs some extra privilege.


Desktop applications no, because they must run as the user (let's leave containers and sandboxes alone.) Servers yes, for example PostgreSQL creates the postgres user and runs as such.


OpenWRT worked around this by creating UCI, its own configuration system that it uses to generate all package-specific configuration files: https://openwrt.org/docs/guide-user/base-system/uci

Configuration for all supported packages is stored in files under /etc/config in UCI's own text format, and OpenWRT's init scripts generate service-specific config files based on the UCI settings when the service is (re)started. For example, to configure Samba, users don't edit smb.conf directly, they set UCI settings like samba.workgroup, sambashare.path, etc.

There's a `uci` command-line utility to read and modify settings, and UCI forms the basis for the OpenWRT web interface.
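For a flavor of the format, here is a hypothetical /etc/config/samba in UCI syntax (the option names are illustrative approximations, not the exact schema; see the OpenWRT docs for the real one):

```
# /etc/config/samba -- illustrative only
config samba
	option workgroup 'WORKGROUP'

config sambashare
	option name 'media'
	option path '/mnt/media'
	option read_only 'yes'
```

On service restart, OpenWRT's init script would render these settings into a real smb.conf, so the vendor format never needs hand-editing.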

debconf does something similar for Debian packages, though I haven't seen any packages that use debconf to completely supplant vendor configuration formats the way UCI does. https://en.wikipedia.org/wiki/Debian_configuration_system


I hate this fanatical love of the UNIX way. I really hate that, as of today, Windows is the only non-UNIX OS. I truly believe this fact has set us back. No more exciting new OSes, only boring Unix.


Is that you, Dave Cutler?

But more seriously, what do you want from an OS? Cleaner design? Plan 9 comes to mind. QNX is very elegant, too. Some microkernel thing? Fuchsia, but some say it's inelegant and overdesigned right off the bat. Minix has some nice properties but somehow nobody but Intel seems to use it. Also, QNX again (but as the name suggests it's similar to Unix)... Distributed computing? I don't know, seems OK to run it mostly in user space.

My favorite alternatives are Plan 9, QNX and L4 (I know it's not an OS), of which QNX is the only one I have actually used. Shame about Fuchsia, I believe the negative opinions because I looked at some documentation and code before I read them and had a sort of "THAT is supposed to be Google's better OS?!" moment.


QNX got a lot of things right, mainly in the area of interprocess communication. Interprocess communication came late to Unix/Linux, and it shows. Microservices under QNX work much better than under Linux. Interprocess communication is fast and works like a synchronous function call. Hard real time actually works with QNX. But QNX had nothing new on the database side.


My concern isn't about Windows or Linux specifically, but that something is lost when the garden only grows two plants. In, say, 1980 there were several unique OS streams, each with its own quirks: UNIX, Pick, VMS, etc.

It seems strange that we are essentially trading on isotopes of OS thinking from about 2000 when Linux took the mantle of UNIX and Windows converged on the XP code.

Is this it? Is producing an OS now so incredibly expensive and hard that we are never going to try again, and just keep bolting things on to what we have? I feel a little sad at that, even if the reasoning is impeccable.


Maybe it converged, just like other industries? Say, cars or planes have not fundamentally changed in decades. But would there be other designs that would make them so much better that it would be worth changing? That's not clear to me...


There's no money to be made in making an OS. Back when I was a child people were literally waiting outside stores for a new Windows. Windows 11 is given away for free.


We'll always have TempleOS ! /ducks (RIP)


RIM (the BlackBerry people) bought QNX. Do you know if they have been good shepherds? Has that helped or disadvantaged QNX as a viable option in the project space it had?


I'm not well informed. AFAIK customers like Porsche and Ford (with the fairly successful Sync 3 infotainment system) are still using it. I do know that Dan Dodge, QNX founder, developer 1 and great guy (speaker of my all time favorite conference keynote), has left Blackberry. Rumor has it that he was unhappy with the new leadership and direction for QNX.


The people at NixOS and Guix are doing it differently. No longer is the configuration just state in files, scattered who knows where, that cannot be fully understood by anyone. Instead, everything is defined centrally and then neatly versioned and managed by the system. If you think the Unix way of throwing files in a directory sucks, check them out!


Unfortunately, Nix's documentation sucks. Plus, there's a steep learning curve to the Nix programming language. I don't understand why they couldn't just use an existing language instead of inventing one that is only usable within the Nix system.

Overall, really cool idea (dropped my jaw when I first saw it in action), but poorly implemented.


Wouldn't say there's a steep learning curve for the language itself, it's pretty easy to get a grasp around it imo. Here's a helpful page I used to quickly get familiar with the language: https://github.com/tazjin/nix-1p

What's rather messy about Nix is nixpkgs with its helper functions all over the place alongside pretty shallow / non-existent documentation (which is unrelated to the language). Thankfully they've started to work on that recently: https://discourse.nixos.org/t/documentation-team-flattening-...


I think it's all too easy to say they should have just used an existing language. The most obvious feature of the nix language is that it's lazy, because they want their enormous collection of configuration dictionaries computed on demand. There are no mainstream languages that are lazy.

Well, there's Haskell, and I expect any Haskell programmer to be able to pick up Nix very quickly.

Guix has gone with a strict language, Scheme, but I understand they have their own monadic DSL to cope with the peculiarities of following the Nix model. So again, not something that a mainstream language can do well, and even as a Schemer, you're going to have to learn their macro language.


This sounds incredible, assuming there's a clean way to pin versions/source of applications. It might make the fabled idea of a reproducible system possible.


They're already getting decent reproducibility of their curated package set, but some of us want to pin to a particular version of that set, especially when creating our own private packages. That's where I'm finding I really want pinning.

The latest release has included "flakes", promoted from an add-on to an opt-in feature and which, among other things, lets me pin my dependencies and the environment I need to build my private projects. I then don't have to worry about going out of sync with the main package set, and I can share private projects in a (hopefully) fully reproducible way, with people using different versions of nix.

https://xeiaso.net/blog/nix-flakes-1-2022-02-21


Guix makes a big deal out of being fully reproducible. It is possible to pin versions, you basically state "I want package X while the guix repository is at commit hash Y". Every dependency of the package will be built and made available at the appropriate version and everything will work out. I used this to roll back fprintd when it broke due to a broken dependency and it worked without problems.


I get where you're coming from, but I think the biggest pro for the "UNIX way" is that text on a filesystem is extremely accessible. You don't need a specialized tool to read and modify configuration; you just need a text editor. And while there's no standard for how the data is structured, it's usually pretty easy to figure out from context.

I think it's also really easy to underestimate all the tooling built around text. As soon as you try some other way, you lose out on version control, diffing, grepping, and a whole bunch of other general tools built for text.

I think the only way to get around this is to build generalized tools for working with binary data as a data structure. The problem with this approach is that now you need to maintain a database that describes every possible binary format, where text editors only really have to concern themselves with ASCII & unicode to be useful for most cases.


General Linux tools built for text, in a world where everything is plain text and the shell doesn't know how to navigate anything other than a file system. In PowerShell you can cd into HKEY_LOCAL_MACHINE as easily as any drive, and Select-String is perfectly capable of searching registry keys.

The baseline is only text and files on systems that don't have anything better to offer.


Isn't there FUSE for that? It can show the GConf registry as a file tree, allowing you to use tools like recursive diff, etc. https://metacpan.org/release/LSIM/GConf-FS-0.01

Something flexible like FUSE is sorely missing on Windows; the stuff Google Drive had to do, creating a new drive letter, is pretty gnarly.



Interesting, I hadn't seen that. Did Google Drive go with the virtual drive letter thing mainly for backwards compatibility with older Windows?


> The baseline is only text and files on systems that don't have anything better to offer.

Also pretty much all external programs on Windows. Anything that isn't itself implemented as a PowerShell cmdlet has this problem. And it's all shell-specific.


Not really, if it’s a .NET assembly Powershell can dynamically load, call and use types from it. Not to mention first-class support for standard data interchange formats like JSON, XML etc without having to convert back/forth from text or use tools to process it.


In a few situations, actually calling .NET code from PowerShell doesn't suck, like setting persistent environment variables.

But in tons of situations actually doing anything that way involves a ton of boilerplate, and it's very clunky. It's acceptable for automation, but it's not something that feels very good to do in an interactive shell.

Stuff like ConvertFrom-Json is nice, but it's not as nice as a whole ecosystem where the interchange format for external programs is sort of just 'understood'. It's also not streaming: it reads in a single JSON object as a string. So you need some extra boilerplate if the expectation is that you're piping output from a command that yields a stream of data.

PowerShell is great, but an environment that could stream objects in some other format between different programs, regardless of the languages they're in or the runtimes they use, would be better and closer to a real version of 'Unix pipelines but with objects'.


Windows and macOS dominate because 99.9% of users do not care about the difference between registries, file systems, text or binary tools, UNIX-like, POSIX, etc… none of these things matter to them! They just want to use a computer and get on with their life.

If you lament this as a programmer, build applications that only work on your OS of choice, make them so good or take a dependency on a feature not available with macOS or Windows so that it can’t be ported, and then you can show the world the light of the UNIX mentality or whatever.


> Windows and macOS dominate because 99.9% of users do not care about the difference between registries, file systems, text or binary tools, UNIX-like, POSIX, etc… none of these things matter to them! They just want to use a computer and get on with their life.

The flip side of this argument is that the experience of using computers for that 99.9% of users sucks.

As software engineers, it's almost our responsibility to argue in favor of better platforms, because every day not spent fighting the platform is a day we could be building better experiences for our users. (Not to mention, every day the platform doesn't inadvertently mess up the user experience is a day of smaller support costs.)


The registry does not impact those users' experience. Needing to learn how to use a shell, does.


I would disagree: it does impact users' experience, because when the registry gets corrupted you have to start by installing Windows from scratch. Nothing like a Windows 10 update disabling registry backups to save space on low-capacity storage, and then finding out after the fact when attempting to recover Windows 10.

Secondly, with device drivers being tied to the registry, there is no simple system upgrade of taking the hard drive out and placing it in a new computer.

Just not everyday occurrences for most but still exist.


Windows these days works extremely well with just taking out the hard drive and placing it into a different computer, even across different processor brands, completely different hardware, etc.

I personally do this all the time. At most you get an extra restart the first time the drive is in a new set of hardware and after that you're good to go.

My main Windows install is probably 8 years old at this point, it's gone between multiple motherboards.

The biggest annoyance is a few pieces of software that tie activation to the motherboard.

If you want to make it even more portable you can install Windows to a vmdk.

And if you want to get especially complicated, you can have that vmdk act as essentially a secondary variant of your main operating system, complete with symlinks for most of the files and application info, both to save space and so you have most of the same state across both OSes, while still being able to play around and easily roll back any changes.


The registry does not get corrupt. This is not a thing that happens without the system files themselves being corrupted. People blame any old problem on the registry the way they blame any old problem on /etc. Nor, in the case of an actual registry problem, do you have to reinstall Windows from scratch; restoring the registry from backup is the entire point of system restore points, which are created on every update and most program installs. As a bonus, yes, I did upgrade my computer that way. This stuff is pure superstition.


I've reinstalled Linux due to borked sound on update way more times than I reinstalled Windows due to corrupt registry.

And cumulatively I spent significantly less time on Linux than I did on Windows.


I'm not saying the UNIX philosophy is the "best" way, I'm just stating why it's valuable.

Of course most people don't care about those things, but as a developer, the main issue I see with Windows and macOS, is that they build specialized interfaces that lock you into certain ways of doing things, which may be convenient, but are difficult to migrate away from, and a pain to automate and reproduce.

I really value that the UNIX philosophy is geared more towards building simple tools that are designed to be combined with other simple tools to solve a more complex problem, and that it doesn't try to lock you into using any particular tool to solve a problem.

So no, I may not provide binaries for Windows & Mac for personal projects, but I'm not going to be openly hostile towards people who want to build things from source or contribute fixes for Windows & macOS.


What good does having a non-UNIX OS do if it insists on doing everything worse than UNIX?

The bet on "everything is an object" was an honest and competent one, but unfortunately it didn't work. "Everything is a file" works better in practice. Unfortunately, that was the last real attempt on Windows to improve things.

Yes, the lack of diversity in OSes is bad. But Windows doesn't fix it. (Android and iOS were the last large attempt of innovation there, and implemented some really good things, but it's also useless to have the OSes completely managed by large corporations that antagonize both their customers and society as a whole.)


> The bet on "everything is an object" was an honest and competent one, but unfortunately it didn't work. "Everything is a file" works better in practice.

With the right interface, the object approach can be more pleasant to use. Administration via PowerShell is way more consistent than in Bash.


In what way did it "not work"?

Unless you meant OS/2? My understanding is the original intent there was very much to make everything an object, and the UI fully composable based on that (in the sense of being able to link things together semi-arbitrarily, like pipes; only they never got that far).

But that's based on a half remembered article from years ago, I may have that completely wrong.

Agree with the other post that PowerShell's object view of the environment is pretty nice - not always a panacea of course, but mostly highly functional and productive


The problem is that UNIX, while it reeks, is good enough that most people don't care about it.

Plan 9 was supposed to be the better-designed UNIX all around, and it did not even dent it. It wasn't enough of an improvement over UNIX to displace it.

http://www.catb.org/esr/writings/taoup/html/plan9.html

> We know what Unix's future used to look like. It was designed by the research group at Bell Labs that built Unix and called ‘Plan 9 from Bell Labs’.[154] Plan 9 was an attempt to do Unix over again, better.

> The long view of history may tell a different story, but in 2003 it looks like Plan 9 failed simply because it fell short of being a compelling enough improvement on Unix to displace its ancestor. Compared to Plan 9, Unix creaks and clanks and has obvious rust spots, but it gets the job done well enough to hold its position. There is a lesson here for ambitious system architects: the most dangerous enemy of a better solution is an existing codebase that is just good enough.


I think also that BSD (and later Linux) made UNIX free, and I assume Plan 9 was not.

UNIX was also written to run on minimal hardware. Even if Multics had been open-sourced, UNIX would still have won.


UNIX has been free since V6; that is why it got adopted by everyone, and then AT&T sued Berkeley when they got the opportunity to actually be allowed to charge for it.


I agree on this. There may be better ways to do things, but people are so sure that UNIX perfected everything 50 years ago that the alternatives don't get explored. Personally I'm not convinced that plain text is naturally better than other options, because what even constitutes plain text? We have ways of encoding our languages into bits on a disk, but without a decoder it really doesn't have any meaning. So how is that different from any other way you can encode data? Isn't the important thing that we have tools that make it easy and consistent to work with the data?


Actually there are lots of exciting non-Unix OSes.

https://distrowatch.com/dwres.php?resource=links

It's just that they don't have market share or a raison d'être for widespread adoption.


Be and Morph are worth a look.


It's interesting though that "boring UNIX" has survived, while most non-UNIX operating systems have gone the way of the Dodo ;)


I'd complement it: the fact "that as of today Windows is the only non UNIX OS" AND "this fanatical love of the UNIX way" sets us back.


TempleOS is radically different from either Windows or Unixy OSes.

I would actually not even compare it to them, and just say it's flat out Radical.


Modern Linux DEs have an arguably worse version by not only having traditional `/etc` for some types of configurations but _also_ having a "registry" that's really similar to the Windows one for GUI apps... (re: gconf for Gnome)


"Most" here means the one DE that had a Windows fan rewrite its entire config engine with the only goal of being like Windows. An action that created quite a lot of problems soon after the change.


Is Gconf any different from OSX defaults?


To be fair, Microsoft doesn't rely on the registry for complex configurations either, even IIS uses .config files.


You can’t look at HKEY_CLASSES_ROOT and claim with a straight face it’s not complex, and it’s a central piece to how a lot of things work in Windows.


HKCR has a lot of entries, but the data underneath those entries isn't particularly complex.

Its only really big misstep (apart from the fundamental issues with the registry in general) is that file associations and COM registrations are all intermingled in the same namespace. But those can be and are interconnected (see how the Office file types handle their registrations, for example), so it's kind of understandable how they ended up where they did.


The semantics of those entries and their possible subkeys is exceedingly complex. It started out reasonably simple, but features upon features were added with each Windows release, in addition to the application-specific behaviors.


> even IIS uses .config files.

Except that is for serving from a single file share or a replicated site across a farm of IISes.

This has nothing to do with 'complex configurations'.


Is there any "way" at all that doesn't suck though? Not having "a way" in this sense is more like a feature, a single source of truth works well if it really is treated as such by all actors.

Configurations under unix are a mixture of stdin, config files, env variables and bespoke solutions.

Under Windows, you have all of the above and on top of that the registry.

Any attempt at a solution in this regard risks being a rerun of the notorious xkcd: 14 ways to specify configurations -> this should satisfy everyone -> 15 ways to specify configurations.


On a more philosophical level, the Unix convention is the "liberal" one, giving application writers more freedoms. Liberty enables more flexible applications and more chaos at the same time. A highly structured approach might be beneficial in the short run, but it might prove to be burdening beyond its due date.


Application writers are free to use which ever format they choose on Windows, as well. The Registry is just yet another option -- Microsoft does not force developers to use it. I don't see the Unix convention as the more "liberal" one, here. It typically has _one less_ option.


> The Registry is just yet another option

It's not just "another option". It's a system standard option that exists, and can be easily accessed programmatically. There's no equivalent in Unix, since there is no standard, and whatever is available will depend on the distribution.

"Here's a built in option, but do whatever" is very different than "do whatever!".


Do you know where Microsoft calls out the Registry as the current best practice in https://docs.microsoft.com/en-us/windows/apps/?


My point was that having something built into the system is very different than having a dependency-stricken free-for-all outside of the system. For this reason, especially for the majority of Windows' life, it was not just "another option", it was a "hey, this is built in" option.

As your link shows, Windows has homogeneity, with good config options built into the frameworks. Within those frameworks, those built-in config options can't be considered just "another option". There's no real homogeneity in *nix, so the only option is misc files in misc paths.


It's partly about where the burden is placed. Making it easy for developers sometimes makes it hard on users and admins, and sometimes it becomes a security issue because of the varied approaches. Not sure there is a right answer, especially in open source where it's hard to get devs.


It would actually be better if most of the data that applications store in the registry instead lived in simple text files in the AppData directory (which shouldn't be hidden by default). The registry should be restricted to Windows configuration data (like file types and their associated programs), and information that needs to be shared between installed applications.


That's been the recommendation since Windows 95. It's easy to suggest and hard to enforce.


> Unix simply doesn't have a "way" in that regard

Yes. Nothing is standardized in this area in UNIX/Linux-land. We do have the FHS, some XDG specifications, and the series of configuration options that were born with dconf, which resemble the Windows registry a bit.


> a loose convention to put text files in "/etc".

For user-specific configuration there's also ~, as with ~/.vimrc and ~/.bashrc. Windows has something similar in C:\Users\theuser\AppData.

edit And ~/.config/, as others have mentioned.


I'm a big fan of the macOS method, where there's an OS API to manipulate the user defaults (as they're called), but they are actually just stored as files in a standardized format (usually XML). You get the pros of standardization and the pros of user manipulation (like easily deleting the corrupted settings of some app by removing one or two files).
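For instance, Python's standard `plistlib` module can read and write that format directly. A small sketch (the app domain and keys here are made up):

```python
import plistlib

# A hypothetical app's user defaults, as macOS would store them in
# something like ~/Library/Preferences/com.example.MyApp.plist
prefs = {"ShowToolbar": True, "RecentFiles": ["a.txt", "b.txt"], "FontSize": 13}

# Serialize to the standard plist format (XML here; binary also exists)...
data = plistlib.dumps(prefs, fmt=plistlib.FMT_XML)

# ...and any plist-aware tool can read it back, no app-specific parser needed.
restored = plistlib.loads(data)
assert restored == prefs
```

Because the on-disk format is standardized, "delete the one settings file to reset the app" works without knowing anything about the app itself.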


Plasma/KDE has this with the kreadconfig5 and kwriteconfig5 commands, and it uses a simple INI-like format: https://userbase.kde.org/KDE_System_Administration/Configura...

GNOME has the gsettings command for this, which is the equivalent of the `defaults` command in macOS. Many guides still refer users to the `dconf` command, though, which is technically a lower-level tool but basically does the same thing for 99.999% of GNOME installations. (Users/distro-makers can actually choose the storage format on GNOME's case.)


At least that used to be the macOS method. Once Apple moved most of their apps to the cloud there are constantly open databases pushing settings updates from device to device. Things often start breaking these days if you try to fix things through direct file manipulation.

With luck you might be able to find the correct terminal commands to disable services and fingers crossed they don't automatically restart themselves before you are done editing.


> I suspect if the concept of the registry was created today it would look more like a database (e.g. Sqlite), although organizing it like a virtual FileSystem does have a certain appeal, and unfortunately I don't know of a database engine that supports such a layout.

(Open)LDAP uses a hierarchical structure and saves its data in a database:

* https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Databa...

* https://www.openldap.org/software/man.cgi?query=slapd-mdb


What UNIX way?

The religious text files scattered everywhere on the FOSS clones?

The adoption of registry ideas by GNOME on gconf?

The configuration databases used by HP-UX and Aix?

The plists used by NeXTSTEP and macOS?

The settings per app used on Android (technically UNIX-based via its Linux kernel)


Exactly this.

Unix is a bit of a mess, at least that's understandable because it's 'open'.

MS registry is a bit nutty, it should be refactored in some simple, clean way, which is apparently extremely difficult to do for big companies, especially those that consider 'backwards compatibility' to be 'everything'.


> especially those that consider 'backwards compatibility' to be 'everything'.

Thank god there are companies that still think about backwards compatibility and are not intent in breaking everything because there's a shiny new thing ... ahem ... "refactoring in some simple clean way"


Making things clear is not some kind of 'shiny new thing' that the kids are dancing to - it's rational.

'Backwards Focus Absolutism' means we live in a world of stupid cobwebs and terrible design. The Windows Registry is a hack, it was originally designed to do something mundane and simple. It just grew, like a virus.


> 'Backwards Focus Absolutism' means we live in a world of stupid cobwebs and terrible design

It also means that until ~Windows 7/10 you could run even win3.1-era apps. And you didn't run into situations like "any 32-bit apps no longer work" or "any apps not updated in more than a year will be removed" as we've seen with Apple.


Do you happen to know Linus' "first rule of kernel development"?

"If a change results in user programs breaking, it's a bug in the kernel. We never EVER blame the user programs. How hard can this be to understand?" - from his famous rant [1]

So Linux prioritizes backwards compat as well. It would be insane for any project relied on by half the computing world not to. MS just throwing out their registry and rewriting it would probably break a million programs and possibly break literal billions of people's computers in some way. Its a universal problem - the more dependants any piece of code has, the harder it is to change.

[1] https://lkml.org/lkml/2012/12/23/75


"It would be insane for any project relied on by half the computing world not to."

I think this is insane and narrow minded.

I suggest there should be 'eras' of version where things do change.

If they make OS changes, notify early, prepare basic materials, create 'auto-updaters' and allow a few years for change.

But it needs to be done.


Which is a reason no one has deep knowledge about all the ioctl variants out there on Linux, as new ones get introduced, sometimes to work around backwards compatibility guarantees.


And at least in unix you can quickly ripgrep through the usual home and /etc/ directories looking for something, whereas searching in regedit takes forever for some reason.



Ahh yes the big boy UNIXes that are on life support for the few remaining stragglers willing to pay massive support contracts on legacy systems.


You missed the "closed" UNIXes on my comment.


Doing that in a database is easy. You just write your keys in the "/root/path/subpath" format, and search for strings starting with some text. Most database engines have this kind of search heavily optimized. (And the ones that don't cost hundreds of thousands per core, so who cares?)

The one thing you lose by using a real database is that filesystems are only locally coherent, while databases try very hard to be globally coherent. If it has heavy access, the database way will lose performance much faster.
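A rough sketch of that approach with SQLite (table and key names invented for illustration), where enumerating a "folder" becomes a half-open range scan over the primary-key index:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE reg (path TEXT PRIMARY KEY, value)")
db.executemany("INSERT INTO reg VALUES (?, ?)", [
    ("/software/example/version", "1.2"),
    ("/software/example/install_dir", r"C:\Program Files\Example"),
    ("/software/other/version", "9.9"),
])

# Enumerating a "folder" is a prefix search: everything >= the prefix and
# < the prefix plus a maximal character, which the index serves as a range scan.
prefix = "/software/example/"
rows = db.execute(
    "SELECT path, value FROM reg WHERE path >= ? AND path < ?",
    (prefix, prefix + "\uffff"),
).fetchall()
assert len(rows) == 2
```

Displaying the result as a tree is then purely a UI concern, as the sibling comment says.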


Agreed, and displaying this like a tree is then a UI concern.

I think you'd get slightly more performance if you were to store "Folder" and "Key" separately -- either three fields or using a parent-child table. You'd be able to index "Folder", and it would make building the tree structure itself easier and more efficient, and looking up all the values in a specific tree simpler.


You could do the paths in reverse (root last) to speed up string comparisons, since the prefix almost always matches, which means more wasted comparison work the other way around.


The concept of a system-wide hierarchical key-value store is actually very useful in some environments. I had to develop one some time ago [1] for an embedded system, since it is super useful to have a single point of truth/configuration/state, if every application agrees on using it (which is the case in such systems, where every application is known in advance and developed in house)

[1] https://github.com/debevv/camellia


Xen (the hypervisor) had a thing called XenStore which was a system-wide key-value store with triggers: https://wiki.xenproject.org/wiki/XenStore


> and there's no format agnostic API to access that information (you can open it, but can you understand it?).

One of the critiques in this article is that, because you have to know/guess/assume what encoding is used for various strings, there isn’t one for the registry, either.

If so, both approaches suffer from that (but Unix a bit more because it tends to store multiple items under a file system ‘key’, while Windows programs rarely store multiple items under a single registry key)

I do wonder what RegEdit.exe does here. Does it infer encoding, have a long list of key-to-encoding mappings, or a combination of the two?


> I do wonder what RegEdit.exe does here. Does it infer encoding, have a long list of key-to-encoding mappings, or a combination of the two?

I did a bit of experimentation on this too and we think it has a heuristic to guess encodings of strings. (Which to be fair isn't a terrible idea - it's very easy and almost entirely reliable to determine if a string is ASCII/UTF-8 or UTF-16LE which are the major encodings found.)
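The guess is easy because mostly-Latin UTF-16LE text has a zero high byte at every other position, which valid UTF-8 never does for printable text. A toy version of that kind of heuristic (not what RegEdit actually does, just the general idea):

```python
def guess_encoding(data: bytes) -> str:
    """Crude guess between UTF-16LE and ASCII/UTF-8, registry-string style."""
    # Mostly-Latin UTF-16LE text has a NUL at nearly every odd offset.
    if len(data) >= 2 and data[1::2].count(0) > len(data) // 4:
        return "utf-16-le"
    try:
        data.decode("utf-8")   # plain ASCII also validates as UTF-8
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"

assert guess_encoding("hello".encode("utf-16-le")) == "utf-16-le"
assert guess_encoding(b"hello") == "utf-8"
```

It misfires on pathological inputs, but for the strings typically found in a hive it is almost entirely reliable, as said above.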


Based on the documentation[1], it seems clear that strings are stored in Unicode (which should be UTF-16LE).

> If the data has the REG_SZ, REG_MULTI_SZ or REG_EXPAND_SZ type, and the ANSI version of this function is used (either by explicitly calling RegGetValueA or by not defining UNICODE before including the Windows.h file), this function converts the stored Unicode string to an ANSI string before copying it to the buffer pointed to by pvData.

[1]: https://docs.microsoft.com/en-us/windows/win32/api/winreg/nf...


The first thing to know about Microsoft documentation is it's almost always wrong.


I've been coding against the Win32 API since Windows 95. While I won't argue it's perfect, it's certainly nowhere near "almost always wrong".

Anyway, where in the hivex code do you handle these non-Unicode encoded REG_SZ values?

Btw, the comment for hivex_value_multiple_strings is based on a mistaken interpretation of the documentation. There's nothing contradictory to what MoveFileEx does[1], it simply has a list with a single entry in the case of deletions, and a list with two entries in case of renames.

The REG_MULTI_SZ documentation[2] just points out, correctly, that you can't have a zero-length string within a list of other strings (ie with at least one non-empty string after it). This is of course obvious and hence redundant, but they highlight it for novice programmers.

[1]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/n...

[2]: https://docs.microsoft.com/en-us/windows/win32/sysinfo/regis...


This is categorically incorrect. The documentation for the function is entirely accurate. What is true is that the regedit program is more than a featureless wrapper around the API functions (unlike the first thing to know about Linux desktop software). The heuristic is performed for the user who probably doesn't know what they want; the straight exactly-as-you-asked-for-it conversion is performed for the programmer who does. (And it's only a heuristic for values; the encoding for keys is assumed from the file like TFA says, but which one it is is stored in a flag bit.)


To be more precise, like most of Windows, it just stores 16-bit codepoints as such, meaning that it's not necessarily always valid UTF-16LE (in the presence of invalid surrogate pairs).
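A quick illustration of that point (nothing registry-specific, just the encoding behavior): a lone surrogate code unit fits in two bytes but is not valid UTF-16.

```python
# U+0041 'A' followed by an unpaired high surrogate D800, as 16-bit LE units.
raw = b"\x41\x00\x00\xd8"

# Strict UTF-16LE decoding rejects the lone surrogate...
try:
    raw.decode("utf-16-le")
    ok = True
except UnicodeDecodeError:
    ok = False
assert not ok

# ...so tools that must round-trip arbitrary registry strings have to treat
# them as opaque 16-bit code units (Python exposes this as 'surrogatepass').
text = raw.decode("utf-16-le", "surrogatepass")
assert raw == text.encode("utf-16-le", "surrogatepass")
```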


Maybe there isn't a database engine that explicitly supports file system data structures, but you could implement a filesystem in the application layer using SQLite as a storage mechanism.

Here's an example of someone doing that very thing.

https://github.com/guardianproject/libsqlfs


For those looking for a more up to date alternative, with a non-hardcoded DB file path, try https://github.com/greenbender/sqlfs (not affiliated, just had a look in this area a few weeks ago)


> The UNIX way is undeniably more flexible since it isn't a virtual filesystem, it is just a filesystem. The problem is that everyone invented their own configuration format to store configuration data in /etc and there's no format agnostic API to access that information (you can open it, but can you understand it?).

It's an interesting point, and it would really be nice if there were some kind of standard, especially a standardized configuration-file API within the standard C library, but in practice almost all of these files are also easily understood `key=value` pairs that only differ in how they denote this.

Still, it would be very nice to be able to inspect and modify them with standardized tools, but in practice almost any configuration file you open is obvious.
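For the common `key=value` case there is at least `configparser` in Python's standard library, which handles the INI-ish dialect many of these files approximate. A small sketch with an invented config:

```python
import configparser

# Parse an INI-style config from a string; reading a file under /etc is the same.
cfg = configparser.ConfigParser()
cfg.read_string("""
[network]
hostname = example
port = 8080
""")

assert cfg["network"]["hostname"] == "example"
assert cfg.getint("network", "port") == 8080

# Updating a value still means re-serializing the whole file, as noted upthread.
cfg["network"]["port"] = "9090"
```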


There is nothing wrong with a kv hierarchy for config. Used in more modern tools as well like Consul. The problem here is the poor implementation and naive security considerations.

Today, there is no broadly used file system that can replace any of this. The difference in performance is just too big.


The problem of removing applications together with their configurations is trivially solved in the various Linux-based systems where any software package is installed only inside a private directory.

Any files that would be expected to be in shared directories like /usr/bin, /usr/lib or /etc, are replaced by symbolic links. Except for symbolic links, the installation of a package must not change anything outside its private directory.

In my opinion this is the only sane way of managing software packages, because otherwise, also on Linux, but especially on Windows, I have wasted far too much time during the years with debugging various problems caused by the installation/uninstallation programs.


You can do something similar on Windows. However, the registry was turned into an attempt to solve a slightly different issue: roaming users and centrally managed settings, split by machine and user, using what became Active Directory. It started off with OLE and its central store of name-value pairs in a tree. That basically made the registry do at least four different things, only one of which it does 'ok' (COM/OLE lookups). In practice it became a huge mess, because thousands of applications now keep their settings in there, with poor cleanup practices like you see. Something like 99% of the use cases out there would be perfectly well suited to just using the forever-deprecated Win32 INI API to manage settings, with MS saying 'put your config files in these places for different effects'. Instead they said 'put it all in the registry'. Looking back at it, it is now 100% clear it was a bad design decision.


It should be noted that putting configuration into files rather than registry has been the standing recommendation on Windows for a very long time now. For example, .NET 2.0 (2005) added a standard facility for application settings, and it was implemented on top of XML .config files.


Good to see MS practising what they preach on newly written applications:

https://docs.microsoft.com/en-us/deployedge/configure-micros...

> You can also use REGEDIT.exe on a target computer to view the registry settings that store group policy settings. These policy settings are located at this registry path: HKLM\SOFTWARE\Policies\Microsoft\Edge.


This is a group policy, not a setting. That makes a big difference in the right context - policies are meant for things that are typically centrally managed in enterprise environments, so the ability for the admin to remotely control every setting individually is important, and requires some kind of central system registry for OS tooling to work with.


Yep, but it still sucks if you are a user trying to figure out how a setting is applied: you first delve into some seemingly proprietary config DB for Edge, and then realise it's actually configured via a registry GPO.


>the various Linux-based systems where any software package is installed only inside a private directory

Which are these? I assume Qubes OS sidesteps the problem entirely... NixOS?


NixOS, mostly, but not for application data (/var usually), which is closer to the registry.


> although organizing it like a virtual FileSystem does have a certain appeal, and unfortunately I don't know of a database engine that supports such a layout

Apache Jackrabbit Oak [1] implements a tree-format data model called JCR [2]. It's used as database engine for the Adobe Experience Manager (enterprise-level CMS).

[1] https://jackrabbit.apache.org/oak/docs/ [2] https://en.wikipedia.org/wiki/Content_repository_API_for_Jav...


"The format is... endian-specific."

Wasn't there a SPARC port of Windows? Did it run on any other big-endian machines? Were the MIPS and POWER ports little-endian?

In any case, I can see why Microsoft paid SQLite for a custom set of features.


It's a very good question! There was also an Alpha port (both-endian and 64 bit). I've never seen a SPARC, POWER, MIPS or Alpha Windows registry so I don't know if hivex could decode them.


Larry Osterman said in 2005: A decision was made VERY long ago that Windows would not be ported to a big-endian processor.

https://web.archive.org/web/20190108142751/https://blogs.msd...


> I suspect if the concept of the registry was created today it would look more like a database (e.g. Sqlite), although organizing it like a virtual FileSystem does have a certain appeal, and unfortunately I don't know of a database engine that supports such a layout.

I think you might mean relational database. IIRC, the registry is already a database, just not a relational one.


AIX has a registry-like database for configuration called ODM that is like you’re describing.


If I was to reimplement it, I would not let any app read the DB directly.

Instead I would make an HTTP-like interface where you talk to a service that provides the get/set functionality and always uses a text format, like an extended JSON with native support for dates, int32, etc.

This also enables much easier and better backwards compatibility, as you can specify the app version in requests, so a version 5 server can respond in version 4 format.

With a service it doesn't matter if things are stored as a single sqlite DB or as multiple files.
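A toy sketch of that versioned get/set interface (all names invented, with in-memory storage standing in for the real store):

```python
import json

class ConfigService:
    """In-memory stand-in for the proposed settings service."""
    def __init__(self):
        self._store = {}

    def handle(self, request: str) -> str:
        req = json.loads(request)
        if req["op"] == "set":
            self._store[req["key"]] = req["value"]
            return json.dumps({"ok": True})
        value = self._store.get(req["key"])
        if req.get("app_version", 5) < 5:
            # A v5 server answering a v4 client could downconvert here,
            # e.g. to the stringly-typed format an older client expects.
            value = str(value)
        return json.dumps({"ok": True, "value": value})

svc = ConfigService()
svc.handle(json.dumps({"op": "set", "key": "font_size", "value": 13}))
reply = json.loads(svc.handle(
    json.dumps({"op": "get", "key": "font_size", "app_version": 4})))
assert reply["value"] == "13"
```

Whether the transport is HTTP, a pipe, or a plain function call is orthogonal to the version-negotiation idea.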


You can do everything you describe with a regular function call. Introducing an unnecessary service for something as trivial as config management is even worse than the broken Registry status quo.


In theory all we need is a libconfig with a standardized interface and distro specific implementations.

There would be two use cases, storing hierarchical/tree like data aka JSON, YAML, etc and arbitrary graph like data with complex references.

The same could be done with the shell. libcli would be called to parse the command line arguments and your program will have an entry point that receives the parsed data.

libcli could also be used to produce structured program output, which could then be fed into applications, possibly skipping the serialization/deserialization steps.

Since libcli will have a standard interface each distro can choose their own CLI format or whether they output their data as JSON or CBOR.

In both cases the benefit is that the choice of the data storage mechanism has been decoupled from the application.
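A rough sketch of that decoupling (all names hypothetical): the application codes against one loading interface and the distro supplies the backend:

```python
import json
import configparser

class JsonBackend:
    def load(self, text):
        return json.loads(text)

class IniBackend:
    def load(self, text):
        cp = configparser.ConfigParser()
        cp.read_string(text)
        return {s: dict(cp[s]) for s in cp.sections()}

def load_config(text, backend):
    """The hypothetical 'libconfig' entry point: apps see a dict, never the format."""
    return backend.load(text)

# The same logical config in two on-disk formats yields the same data.
assert load_config('{"ui": {"theme": "dark"}}', JsonBackend()) == \
       load_config("[ui]\ntheme = dark\n", IniBackend())
```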


Augeas does this already: https://github.com/hercules-team/augeas


Is this sarcastic?

I mean, the versioning part sounds good, but JSON-like requests?

I don't think people appreciate how efficient or inefficient a system can be depending on the data format it uses.


That's sort of already the case. Normal apps aren't really supposed to read it, the kernel hides the implementation, and get/set/etc is all system calls.

It's tooling created for security and forensics that tries to manually read it.


https://docs.microsoft.com/en-us/windows/win32/sysinfo/regis...

Registry settings already have ACL items applied. You can see this behavior if you log in as a non admin user then try to view another users registry settings.

https://docs.microsoft.com/en-us/windows/win32/sysinfo/regis... The thing is pretty free-form, with a choice of some very basic data types: dword/qword, and ASCII/Unicode strings and lists of strings.

Adding a wrapper around the existing calls to move toward a JSON-schema registry could be a helpful thing.


All the aspects you mention can be realized by a programmatic OS API as well. No need to sandwich a network layer and a weakly-typed data format (JSON) in between.


I assume the service will run in a docker container too??


The article's complaints about the format being undocumented are ill-founded, because Microsoft repeatedly tells developers not to read the registry directly, and call the exquisitely documented functions instead. The applications that didn't listen to Microsoft and depended on particular features are most of why it hasn't updated to modern standards in the first place.


How's that supposed to work if you're modifying it in an offline image from a Linux host?


With DISM.


DISM is a proprietary Windows tool, so no use at all for modifying an offline image from a Linux host.


I'm sure it runs under Wine. But knowing that that's how you service the image means you have no reason to be using a Linux host to do it.


The registry needs to be read by device drivers, including the drivers for TCP and NICs, and including during very early boot; this would make HTTP a challenge. Also, implementing a JSON parser in the kernel sounds like a bad idea.


Hence I've always appreciated how macOS has done it: filesystem, but with a standard file format (plist files). For any well-behaved Mac app, if it misbehaves I can just go and delete the app's settings plist and it's just like a reinstall, since everything else is in a read-only bundle.


Hierarchical databases support the registry format. But they were popular in the 70s?


key/value pairs


I don't feel like this article is good enough to be worth resharing 12 years later.

It missed some of the key reasons why the Registry is bad and other criticisms ("We have to write exactly the bytes Windows expects." yeah of course you do. It's only meant to be changed using APIs.) are not very well thought out.

The criticism that the Registry is a janky filesystem in a single file is dismissed because you can do the same thing with ext3. That's not a good reason to dismiss it. Just because you can do a similar thing doesn't mean it is a good thing to do for the Registry. The Registry being a single file is a cause of a ton of problems, and is IMHO the root of all the other problems. If this were a better technical article it would go into that instead of hand waving it away. (E.g. it being a single file leads to the all or nothing nature of it. It necessitates the bad filesystem implementation instead of using the actual underlying filesystem, etc.)


It sounds you misunderstood at least parts of the article. The ext3 comparison doesn't support or dismiss anything, it's to explain how there can be a filesystem in a file. They then go ahead and say how it could be a good single-file format if it were more database-like instead of filesystem-like. Other criticism of the format clearly take into account that it's a historically grown format which explains some of its idiosyncrasies, but that still doesn't mean it's not valid criticism. Especially the mentioned inconsistencies regarding data types and encodings can hardly be attributed to the registry being single-file, as you seem to suggest by saying "The Registry being a single file is a cause of a ton of problems, and is IMHO the root of all the other problems."


I didn't misunderstand anything. I read this in 2010 and just re-read it.

The ext3 example is used exactly as I describe, to hand wave away "having a file system in a single file" as a criticism and just move on to talking about nitpicks with the Registry's implementation of a filesystem in a file, while ignoring that all of the problems with the registry spawn from the choice to have it be a filesystem in a file.

This article is like going to a house that burned down and saying "well yes, the house burned down. But barns also burn down. Now see one problem is that the couch is a pile of ash and cinders. These materials do not make for a good couch. Better couches use foam and fabric."


> while ignoring that all of the problems with the registry spawn from the choice to have it be a filesystem in a file.

The whole point of that paragraph is that the problems it has as a filesystem do not spawn from the choice of having a filesystem in a file.


Actually the Registry is not a single file, it is a set of files virtually assembled into a single "object" that has many points in common with both a database and a filesystem.

Personally I see it more like a filesystem, with some striking resemblance to NTFS.


It is a single file to the underlying filesystem. Yes it reimplements a filesystem, I already said that in the comment you are replying to.


It isn’t a single file. There are multiple “hives” to the registry. There is at a minimum the SECURITY, SAM, SYSTEM and BCD “hives” that are all individual files on disk. Additionally, individual UWP apps can have their own isolated hive if need be.



Exactly like a NTFS drive can have mountpoints that are actually other drives, the Registry on disk is made by several files, at least:

SAM

SECURITY

SYSTEM

SOFTWARE

NTUSER.DAT

and since Vista

BCD

What is called the Registry, once virtually assembled together, is accessed as if it were a filesystem in a single file, but it is not in itself a "file"; it is something else, or if you prefer, it appears to be a monolithic file but is backed by a number of separate files.


It's a "view"


It's far from being a single file. Registry is split to hives. Besides that, each hive is a bunch of files (actual hive, a pair of integrity logs - which was just one log file before Vista - and hell knows what else, some KTM-related stuff I presume):

   > dir /a C:\Windows\System32\config
   ...
   29.07.2022  12:33        44 040 192 COMPONENTS
   21.11.2010  10:21             1 024 COMPONENTS.LOG
   29.07.2022  12:33           262 144 COMPONENTS.LOG1
   14.07.2009  05:34                 0 COMPONENTS.LOG2
   29.07.2022  12:33            65 536 COMPONENTS{016888b9-6c6f-11de-8d1d001e0bcde3ec}.TM.blf
   06.03.2021  21:44           524 288 COMPONENTS{016888b9-6c6f-11de-8d1d-001e0bcde3ec}.TMContainer00000000000000000001.regtrans-ms
   29.07.2022  12:33           524 288 COMPONENTS{016888b9-6c6f-11de-8d1d-001e0bcde3ec}.TMContainer00000000000000000002.regtrans-ms


Much like SQLite or what have you. What specifically is problematic about that, or what would be better?


A point being overlooked by article and commenters: the records in the registry probably use in the order of 10-100 bytes each, while every actual file takes up at least around 1024 (EDIT: maybe more like 4096?) bytes. Putting tiny 'files' into a special format is a no-brainer on the older computers that existed when the Registry was first created. I am assuming this was one of the motivations to create the Registry in the first place, to save space.


Extracted from the "Rationale" section of https://en.wikipedia.org/wiki/Windows_Registry

1) Since file parsing is done much more efficiently with a binary format, it may be read from or written to more quickly than a text INI file.

2) Strongly typed data can be stored in the registry, as opposed to the text information stored in .INI files.

3) Because user-based registry settings are loaded from a user-specific path rather than from a read-only system location, the registry allows multiple users to share the same machine, and also allows programs to work for less privileged users.

4) Backup and restoration is also simplified as the registry can be accessed over a network connection for remote management/support,

5) It offers improved system integrity with features such as atomic updates.

These points are mostly bogus IMO, but apparently this was their initial rationale for implementing it.


Almost nobody mentions another rationale or use case: the registry is accessible from kernel mode. This makes it, in addition to the other things, kind of analogous to sysctl.


I don't really see a reason to believe this list is comprehensive, especially looking at the sources.


> These points are mostly bogus IMO

Why?


On ext4, a file of 160 bytes or less (and no xattrs) can be stored inline in the inode[1], so the whole thing takes up 256 bytes (plus 8 + [length of filename] for the directory entry). I don’t think any Unix filesystem did that in 1989, but the problem is not unsolvable, especially given that you are hardly going to use a configuration file to store a single value. NTFS does the same, actually, except it can’t count that low, so you get 1K.

(Of course, before there was the registry there were the textual CONFIG.SYS and WIN.INI, but those were shared and would be easily corrupted by programs trying to modify them manually, which I suspect the registry is a reaction to.)

[1] https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Inli...
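A rough back-of-the-envelope comparison, using the figures from this thread (illustrative numbers only, not measurements), shows why packing tiny records still wins even with inline-data inodes:

```python
# Illustrative comparison: storing 100 small config values as separate
# files vs. packed into one registry-style blob. The per-file numbers
# (256-byte inode, 8 bytes + filename for the dirent) are the rough
# figures quoted in this thread, not measured values.

def files_cost(n_values, inode_bytes=256, dirent_overhead=8, name_len=12):
    # one inline-data inode plus a directory entry per value
    return n_values * (inode_bytes + dirent_overhead + name_len)

def packed_cost(n_values, bytes_per_record=100, cluster=4096):
    # records packed together, rounded up to whole clusters
    total = n_values * bytes_per_record
    return -(-total // cluster) * cluster  # ceiling division

print(files_cost(100))   # 27600 bytes as individual inline files
print(packed_cost(100))  # 12288 bytes as one packed blob
```

Even in the friendly inline-inode case, the per-file metadata dominates once values are this small.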


> On ext4, a file of 160 bytes or less (and no xattrs) can be stored inline in the inode[1], so the whole thing takes up 256 bytes (plus 8 + [length of filename] for the directory entry).

It can if you enable an option that is not widely used, IIRC.


Huh, right you are: the inline_data feature is from 3.8 (2013), but apparently tune2fs couldn’t enable it on a live filesystem for some time after that, and it’s still not on by default.

(I remembered seeing the feature in the format description, but did not think it would actually be unused in most cases.)


>NTFS does the same, actually, except it can’t count that low, so you get 1K.

Not exactly: a $MFT record (on 512 bytes/sector media) is 1024 bytes, and the most you can store in it is 744 bytes (it varies slightly depending on the length of the filename and on the exact way the file is written).

On 4K disks, the $MFT record is 4096 bytes and allows up to 3776 bytes max.

JFYI:

https://www.forensicfocus.com/forums/general/mft-resident-da...


I'm trying to remember Win95 on FAT32 (or exFAT?)... on a large hard disk, wasn't the smallest file size 32KB or 64KB?

Different times...


I could swear this was the cluster size for FAT32 on a 2G drive, but apparently not[1], this is the cluster size for FAT16 on 2G. FAT is just utterly dumb in how it does space allocation: it doesn’t even have a bitmap, it’s just a gigantic dense array of linked list nodes (the eponymous file allocation table). Perfectly fine for a 360K floppy on a 32K machine, predictably painful for a 2G hard drive on a 32M machine. Add in narrow allocation unit (“cluster”) numbers, and you end up with huge allocation units.

As far as I can see, ext2/3 instead use a free bitmap and a tree-style “indirect block” setup that is recognizably similar to (though greatly extended from) V1 UNIX (1971) [2], predating even the original FAT8 (1977). Ext4 can do extent [that is, (start, end) pair] allocation instead, and XFS, ZFS, Btrfs, etc. are built around it. Thus they can and actually do manage disk space in smaller bits.

[1] https://support.microsoft.com/topic/default-cluster-size-for...

[2] http://squoze.net/UNIX/v1man/man5/fs or https://www.bell-labs.com/usr/dmr/www/pdfs/man51.pdf
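To make the "gigantic dense array of linked list nodes" concrete, here's a toy sketch of following a FAT cluster chain (illustrative only; real FAT entry encoding, end-of-chain markers, and cluster numbering are more involved):

```python
# Toy model of a file allocation table: fat[i] holds the number of the
# next cluster in a file's chain, or END if cluster i is the last one.
# Real FAT uses special reserved values rather than -1, and clusters
# 0 and 1 are reserved; this just shows the chain-following idea.
END = -1

def read_chain(fat, start):
    """Follow a file's cluster chain from its starting cluster."""
    clusters = []
    cur = start
    while cur != END:
        clusters.append(cur)
        cur = fat[cur]
    return clusters

# A file occupying clusters 2 -> 5 -> 3; the rest are unused here.
fat = [END, END, 5, END, END, 3]
print(read_chain(fat, 2))  # [2, 5, 3]
```

Note there is no free bitmap: finding free space means scanning this array, which is exactly why it got painful on large drives.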


I was going to say that even the 5150 shipped with 256K, but I checked Wikipedia and they say the minimum spec was 16K. Which is astounding.


Good perspective. It’s easy to lose track of that as enough time has passed that the parents of commenters were in high school when this was implemented.

One time perspective thing to keep in mind is that the IBM mainframe was younger to the folks doing this work at Microsoft than we are to the Microsoft people today. Their constraints were different and the diversity of end users was very different.

Even with years of experience and advancement, in Linux we moved configuration to systemd, which is exponentially better than the registry but also more complex. Complicated configuration systems will always have pros and cons.


However, by default on NTFS, there will be 4096-N bytes of wasted (slack) space each time the specific hive needs to grow to accommodate those 10-100 bytes (this varied historically if we're talking about NT 4 and below, depending on volume size) due to the format's cluster size.

That said, it is still smaller than 100 files that are 10-100 bytes each taking up 4096 bytes, though with NTFS, those would be stored in the MFT anyways...


I should add to this: since file system filters (e.g., anti-virus) slow down reads (and writes, but that's irrelevant here) on any file system format Windows supports (NTFS, FAT32, exFAT, etc.), storing configuration data in configuration files would be less optimal, though the performance cost is minimal (assuming your app isn't constantly re-reading the file). This wouldn't be the case with the Registry, as the hives are already open.


That overhead is not intrinsic to the concept of a filesystem. If you asked someone to design a filesystem specifically to store short strings they could easily come up with a low-overhead design while preserving the other advantages of real filesystems.


ReiserFS could do this ...


Nobody in their right mind would put every field into a separate file though. (Yes, I do know about qmail.)


Yes but that is the comparison being made, that the registry is like a filesystem and thus should have just been a filesystem. If the suggestion is to put many fields in one file then that's a different suggestion.


There have been filesystems that don’t use strict block sizes - Reiser4 IIRC. It’s not done often because the savings are tiny and it complicates implementation a lot.

A bigger problem I can see is that Unix file system semantics is… hairy; see rename(2) atomicity guarantees for example. So it could make sense to replace it not with a typical filesystem, but something more RESTful.
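For reference, the rename(2) atomicity guarantee mentioned above is what makes the classic write-temp-then-rename pattern safe for config updates; a minimal Python sketch:

```python
import os
import tempfile

def atomic_write(path, data: bytes):
    # Write to a temp file in the same directory, then rename over the
    # target. rename(2) is atomic within a filesystem, so readers see
    # either the old file or the new one, never a half-written mix.
    d = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=d)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # data is on disk before the rename
        os.replace(tmp, path)     # atomic on POSIX; also works on Windows
    except BaseException:
        os.unlink(tmp)
        raise
```

The hairiness the parent alludes to is real, though: you still need the fsync (and arguably an fsync of the directory) to get crash durability, not just atomicity.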


Again, the filesystem in context was the Windows filesystem, not a UNIX filesystem or Reiser4.


Why is a registry file called a "hive"?

> Because one of the original developers of Windows NT hated bees. So the developer who was responsible for the registry snuck in as many bee references as he could. A registry file is called a “hive”, and registry data are stored in “cells”, which is what honeycombs are made of.

Source: Raymond Chen, 2003: https://devblogs.microsoft.com/oldnewthing/20030808-00/?p=42...


That must sting.


Our digital world, built by trolls.

Most of them WASPs.


There's a mountain of valid complaints for the windows registry, but it not being a database is not one.

It's a pretty standard (albeit very simple) hierarchical database like IMS.

https://en.wikipedia.org/wiki/IBM_Information_Management_Sys...

Hierarchical DBs have fallen out of favor a bit, but they were the first DBs.


"2. Hello Microsoft programmers, a memory dump is not a file format"

This is exactly how Office file formats worked in the good old times. Made things faster (no parsing).


I honestly don't mind this relic of the "old days", or the other issues like it brought up in the article. The Registry is still very frequently accessed by Windows, often several times per second. I don't mind that the code supporting that makes a ton of assumptions about details that should never change anyway, as it's all hidden behind the Registry API. For all I care, keep that code simple and as performant as possible.


This is often misunderstood. Unless you're talking about very early (pre-COM/OLE) versions of Office, the binary Office file format is https://en.wikipedia.org/wiki/COM_Structured_Storage - which is basically FAT-in-a-file, with individual files meant to correspond to COM objects. So it's a memory dump in a very broad sense of a serialized object graph, not in a sense of actual bytes of RAM directly copied to disk.


> Made things faster (no parsing).

And simple. I parsed a wav file for the first time a while ago and it was surprisingly refreshing how easy it was. I just made some C structures matching the format specification, and I could read the file incrementally into different structs. This is way simpler than writing a parser and then marshalling data back and forth between an internal format and the configuration format.
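The same fixed-layout idea translates directly to other languages; a sketch of the canonical WAV "fmt " header using Python's struct module (field layout per the standard RIFF/WAVE header):

```python
import struct

# Fixed layout of the canonical 36-byte RIFF/WAVE "fmt " header: read
# it straight into fields the same way a C struct would, no parser.
FMT = "<4sI4s4sIHHIIHH"  # little-endian

def parse_wav_header(buf):
    (riff, _riff_size, wave, _fmt_id, _fmt_len, audio_fmt,
     channels, rate, _byte_rate, _align, bits) = struct.unpack_from(FMT, buf)
    assert riff == b"RIFF" and wave == b"WAVE"
    return {"format": audio_fmt, "channels": channels,
            "rate": rate, "bits": bits}

# Build a header for 16-bit stereo PCM at 44.1 kHz and read it back.
hdr = struct.pack(FMT, b"RIFF", 36, b"WAVE", b"fmt ", 16,
                  1, 2, 44100, 44100 * 4, 4, 16)
print(parse_wav_header(hdr))
```

Same tradeoff as the memory-dump formats, of course: this only stays simple while the layout, endianness, and field sizes never change.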


Wav files aren't a memory dump, unless you consider any file output to be a memory dump; certainly, not in the sense that old OLE (ha) files were.


For this particular version of software, compiler, and computer architecture, yes.

After the first major upgrade, there's likely to be a whole lot of parsing involved.


I don't understand how people figure out what registry entries to make out of the ether for certain changes.

I don't know why most programs have an in-program settings menu, an external settings.json or something similar, and then registry entries to configure similar things.

I don't know why Windows does almost zero policing of programs editing the registry. If you uninstall a program, why is it so hard to remove all its entries? Why do entries from deleted programs sometimes turn into random strings of characters?

The whole thing reminds me of developers installing to %APPDATA% without my agreement, because they want to circumvent getting admin permissions. That's the whole point of admin permissions. Also if I don't want everything installed on my C drive, because it's a smaller sized SSD, I'm just out of luck.


You can move %appdata% - kind of:

https://superuser.com/questions/1250288/can-i-move-my-appdat...

I don't understand the rest, to be honest. For many years, the mess was even larger, because config files were coupled with the software itself.


> I don't understand how people figure out what registry entries to make out of the ether for certain changes.

Experience, trial and error, documentation, reverse engineering... you can use something like ntregmon to see everything a program touches in the registry and work from there. The quantity can be overwhelming, and when you finally whittle it down you may realise what you want isn't there; it was in a .ini in a random 90's file path the whole time.


This is mind-boggling to me. One of the richest companies in the world just leaves this janky, partially documented mess for developers to wade through like mud.

How can MS claim to hire 'the best of the best' when such a commonly interfaced piece of their product is garbage?

Why isn't it refactored to be clear, concise, with simple documentation and tutorials?

If we were to be cynical we could say 'they don't want you to because competitive this-or-that' - ok, money would be a reason. But it's not that. It's just a turd.

Mac is a bit similar, in that it's hard to find what goes where and why, with all of the examples.

The entire development world before mobile I think had this anti-product attitude: they don't give a s**. You're 'stupid' for not having it figured out. The 'documentation is there!' (but really only in some obscure way). People move on, nobody cares. Who in MS has the job to 'care' about such a thing anyhow?

These kinds of things drive me bananas. These are the biggest and richest corporations to ever exist. It should be clear, documented, with tutorials, examples and searchable.


backwards compatibility. that's why.


Plus, it's inscrutable and malware developers seem to know it better than legit developers do.


There's no reason for a legit developer to know the registry format, as you're supposed to interact with it using syscalls. Most of this article takes place in a parallel universe where Microsoft forgot to provide syscalls to access the registry, and you have to open its file store directly or something.


I'm referring to the keys and values


This is a weird article because you can't talk about the structure of the Windows Registry without talking about INI files [1]. Example:

    [owner]
    name = John Doe
    organization = Acme Widgets Inc.
Some comments on the post mention INI files. It's mentioned by commenters in the previous HN submission too.

But the Registry was built like it was to easily translate INI files into a semi-filesystem structure.

[1]: https://en.wikipedia.org/wiki/INI_file
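For what it's worth, that exact sample is readable with Python's stdlib INI parser, which shows how little structure there is: one level of sections, string keys, string values:

```python
import configparser

# The [owner] example from the Wikipedia article, parsed with the
# standard-library INI reader.
ini = """
[owner]
name = John Doe
organization = Acme Widgets Inc.
"""

cfg = configparser.ConfigParser()
cfg.read_string(ini)
print(cfg["owner"]["name"])          # John Doe
print(cfg["owner"]["organization"])  # Acme Widgets Inc.
```

The registry's nested keys are essentially what you get if you let those sections nest arbitrarily deep and give values types.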


In fact, if I recall correctly, in Windows 95 badly behaved 16 bit Windows apps that tried to put their own INI files in C:\Windows had those file writes/reads silently redirected to a part of the then new Registry.


In 16-bit Windows, there were several global shared INI files, most notably \WINDOWS\WIN.INI, that effectively operated much like registry. You can still see this reflected in the Win32 API pertaining to INI files - there's Get/SetProfileString which does not take a filename as an argument, and then there's Get/SetPrivateProfileString which does.

Some Win16 apps would add their entries to that file - but this wasn't "badly behaved" at the time, as evidenced by the fact that e.g. SetProfileString specifically has an argument for "app name", so the ability to do so was intentional and documented.

If I remember correctly, it was in Windows NT - where \WINDOWS became read-only for non-admins - that they started doing redirections; and even then they are done specifically for those global shared .INI files. The mappings are configurable, with configuration itself defined by registry keys - look at HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\IniFileMapping to see what gets mapped where.


The hierarchical nature of the registry elevates old INI files to at least TOML format in this regard :-))


Filesystem fetishization at its finest. Not only is /etc poorly structured (just like portions of the registry are), it also suffers from lack of uniform format for the data. But hey, registry is bad because it's bad.


Both can be true :)


It is easy to be critical of Microsoft and many of their decisions until you reflect on the reality that they have provided the software that has been running billions of computers world-wide for almost four decades. And they have done this with an astounding level of software and hardware compatibility across time, devices and technologies.

This is critical to understand before pointing at anything MS and being critical. The desktop world overwhelmingly runs on Windows. This has been the case for decades. And the reason is backwards compatibility.

Imagine having a requirement that every decision you make must not break backwards compatibility. And then imagine having to stay as true as possible to that rule for 40 years.


> And then imagine having to stay as true as possible to that rule for 40 years.

And then imagine that most of your original decisions were poor ones, you won your position by virtue of being an abusive monopoly, and as a result, the majority of the desktop world has been needlessly suffering under the burden of your poor technical decisions for 40 years.


> burden

The slow boot and shutdown times, and the interminable upgrades are almost inhuman.

It seems like you can upgrade Ubuntu to a new major version faster than Windows can install a month's worth of fixes.

Especially when the Windows updates are unannounced and suddenly it's going to take 30 minutes to leave your desk instead of 30 seconds.


> It seems like you can upgrade Ubuntu to a new major version faster than Windows can install a month's worth of fixes.

Nobody cares. It is beyond obvious --brutally, massively so-- that this is not important at all. Not even a little.

The proof is simple: Billions of users have done just fine for decades.

The only people who ever complain about this are technical folks who often view the world from a perspective that is far removed from average-user reality (the billions of people I mentioned before).

I think it is extremely important for engineers to be able to understand that they don't often view the world in terms that actually matter.

One of my favorite go-to examples for this are the endless (and pointless) arguments about vi/vim vs. whatever.


Microsoft's secret is, and always has been, incompetent competition. If you invent a time machine, don't kill Bill when you go back. Build something that's genuinely better for both users and developers.


> Build something that's genuinely better for both users and developers.

When I was much younger I, too, thought this way. Reality, however, does not align with this at all. Better products do not necessarily win. I might even go as far as saying that they almost never win. The vast majority of products --not just software, anything-- are mediocre at best. What they do is solve a problem.

One thing engineers and developers have to work hard to get out of their heads is that the actual users of software, operating systems, applications, and websites could not care one bit about anything that happens under the hood. Nothing. Not one bit. To put it plainly, they don't give a shit. At all.

When someone wants to order a pizza online, write a document on their laptop or play some music, all they care about is that thing. Nothing else matters. You could have a dozen hamsters trained to push buttons behind the scenes for all they care.

It truly does not matter. Internalizing that realization is, in my opinion, an important step in the evolution from being a junior engineer to someone with enough experience to understand reality well enough to get the job done.

This does not mean creating garbage. It does mean not being a complete pain in the ass about how to deliver a solution that users will, well, use.

This argument about the Windows registry is just silly. And the proof is simple: BILLIONS of people are benefitting from what this software does. Billions. The only people complaining about it are engineers not experienced enough to understand what they think is utterly irrelevant when compared to the scale and success of the SOLUTION the software provides.

In my forty years of hardware and software development I have yet to find a single piece of technology I could not criticize to one degree or another and, given the opportunity, improve. And yet, with enough experience, you realize this is the wrong metric to focus on.

People's needs are served by solutions. Their lives are improved by software and hardware that solves the problems they have. The person in the hospital recovering from a heart attack gives two shits about the Windows registry, even though nearly the entire computing chain that was used to save his life was likely run on Windows computers. That's reality. The rest is geeks not understanding that they don't know what to focus on or how to actually evaluate the value of a solution, which very often turns out to be far less than ideal and can always be criticized from a distance, both in time and space.

EDIT:

> incompetent competition

Serious question: Do you stop to read and think about what you are saying? I am not attacking you at all. I am just trying to understand how you might justify this perspective.

"incompetent competition"?

This is the company that has f-ing owned home, desktop and enterprise computing for what, FOUR DECADES? Incompetent? C'mon. They have solved problems for people and companies large and small for decades. The world has been running on MS solutions for longer than some of the people reading this have been alive. That is far from incompetent competition. Very far.


Serious question: Do you stop to read and think about what you are saying?

I don't have to. I was around when the "competition" consisted of companies like Lotus, Borland, Digital Research, and WordPerfect... to say nothing of IBM and Netscape.

All of which reinforce my point nicely. Microsoft succeeded because they sucked less than everybody else. Nothing more, nothing less.


Not true.

It had nothing to do with sucking less or being better (whatever that means).

Sometimes success is about doing enough important things well enough while your competition does less or simply gets in their own way.

That was the case for IBM back then. I remember buying my first original IBM PC. The experience was definitely what one expected from the International Businesses Machines corporation. Not what that market ultimately needed. They started it, failed to execute and lost prominence. Microsoft took advantage of that, and more.


They started it, failed to execute and lost prominence. Microsoft took advantage of that, and more.

So we're in violent agreement, then.

Although I was thinking of OS/2 rather than the PC itself when I cited IBM as an example, your example is valid as well. Market leadership, like control of a car, isn't usually lost but rather given up.


And it is "of its time" - it appeared in Windows 3.1 which was 1992. Which was the era of floppy disks as well.


Well, no, the registry in Windows 3.1/3.11 was vastly different in format and at the time it was actually a single file, reg.dat:

https://devblogs.microsoft.com/oldnewthing/20120521-00/?p=75...


>Hello Microsoft programmers, a memory dump is not a file format

Wait until you see the Office formats without the extra x in the extension.


yeah, such a classical Microsoft move


it was a very common approach at the time.


That was a deliberate attempt to make it impossible for competitors to read and write their format wasn’t it?


I don't know the exact year that file format was created, but back then you had to think about performance; you couldn't afford to waste so much memory and so many CPU cycles on a fancy file format, no?

Of course, Microsoft tools of that era didn't think much about interoperability outside the Microsoft ecosystem. But to me it seems that having some XML-based file format at that time would have been silly and had disadvantages. XML was actually started in '96 and published in '98: https://en.wikipedia.org/wiki/XML

.doc seems to be used starting from '89

> The format used in earlier, pre-97 ("1.0" 1989 through "7.0" 1995) versions of Word are less known https://en.wikipedia.org/wiki/Doc_(computing)


Agree. I think most software of the era did that kind of thing.

Most of it wasn’t popular enough and/or got abandoned soon enough for ‘the internet’ to notice, though.

If your software runs in 256kB, writing to a floppy disk, you pack your data structures, use bit fields where possible, and you’re not going to waste CPU time, RAM and disk space to write a format that’s easy to understand and extend.

(Of course, they extended it, anyways, as everybody would)


In the era when these formats originated it was fairly common to treat your file format as a dump of memory. It wasn't about preventing competitors from reading it; it was more about the speed of loading it and the lower overhead.

This was before the landscape changed so much: before the security implications of loading a format you didn't parse first really mattered, and before hardware improved to the point where parsing a format before loading it no longer caused a detectable performance hit.

You had interchange formats, which were designed to be used between systems and had defined, parseable formats; but if your format was meant to be used by your system alone and not shared, then it was believed that a memory dump was strictly faster and better for the user experience.

We've since learned a whole lot about why that is not a good idea, but it wasn't always obvious in the before times.


I attended a seminar on the office binary file formats about 10 years ago at MS. The reason it was done was for performance reasons, including the wonky layout that made it quicker to save and read the file from slow media like floppy discs.


I also remember reading about that somwhere, sometime... loading... ah, here it is: https://www.joelonsoftware.com/2008/02/19/why-are-the-micros...

> The file format is contorted, where necessary, to make common operations fast. For example, Excel 95 and 97 have something called “Simple Save” which they use sometimes as a faster variation on the OLE compound document format, which just wasn’t fast enough for mainstream use. Word had something called Fast Save. To save a long document quickly, 14 out of 15 times, only the changes are appended to the end of the file, instead of rewriting the whole document from scratch. On the hard drives of the day, this meant saving a long document took one second instead of thirty. (It also meant that deleted data in a document was still in the file. This turned out to be not what people wanted.)
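The append-only "Fast Save" idea is easy to sketch; this is just an illustration of the technique (a patch log replayed on load), not Word's actual on-disk format:

```python
# Toy append-only "fast save": each save appends an
# (offset, old_len, new_text) patch record instead of rewriting the
# document. Saving is O(size of the edit), not O(size of the document).

def fast_save(log, offset, old_len, new_text):
    """Record one edit by appending it to the patch log."""
    log.append((offset, old_len, new_text))

def load(base, log):
    """Reconstruct the document by replaying the patch log in order."""
    doc = base
    for offset, old_len, new_text in log:
        doc = doc[:offset] + new_text + doc[offset + old_len:]
    return doc

log = []
fast_save(log, 6, 5, "registry")   # replace "world" with "registry"
print(load("hello world", log))    # hello registry
```

Note that the replaced text is still sitting in the file's history, which is exactly the "deleted data in a document was still in the file" problem the quote describes.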


The underlying file format, COM Structured Storage, is basically filesystem-in-a-file, and works much like FAT. So, bits of deleted data would be floating around even without any performance hacks used by the app itself.


They were formats designed to be fast on floppy drives which were major storage format at the time, and most importantly, designed as "work in progress" format.

For interchange, you were supposed to use other formats - for example RTF with Word, which was kept in sync with DOC capabilities all the way to Word 2003 which was the last version that used DOC (2007 uses DOCX and no longer maintains DOC+RTF combo in sync with internal features). However, saving back the file if you did small change in large RTF file took ages in comparison.


Not really; it was an attempt at making fast and responsive software. If you can just blit a file into memory, why not? It's a hell of a lot faster than parsing and validating bytes and then converting them into your runtime data structures. It's a lot faster in the other direction, too.

When all the world runs one architecture, why wouldn't you do that?


No, it's just that if you have to convert you need more memory to load and save, and then you have less memory for the document itself.


It's a global, hierarchical key-value store for quick lookups and insertions conceived in the early 1990s. The types are almost identical to the ones supported in the C Win32 API on x86.
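A hierarchical typed key-value store of that sort takes very little code to model; a nested-dict sketch (purely illustrative: the real hive format, API, and type set are much richer):

```python
# Minimal model of a registry-style store: keys form a backslash-
# separated path hierarchy, and named values at each key are
# (type, value) pairs. Illustrative only, not the real hive structure.
REG_SZ, REG_DWORD = "REG_SZ", "REG_DWORD"

class Hive:
    def __init__(self):
        self.root = {}

    def set_value(self, path, name, typ, value):
        node = self.root
        for part in path.split("\\"):
            node = node.setdefault(part, {})
        node[name] = (typ, value)

    def get_value(self, path, name):
        node = self.root
        for part in path.split("\\"):
            node = node[part]
        return node[name]

hive = Hive()
hive.set_value(r"Software\Acme", "InstallDir", REG_SZ, r"C:\Acme")
hive.set_value(r"Software\Acme", "Version", REG_DWORD, 2)
print(hive.get_value(r"Software\Acme", "Version"))  # ('REG_DWORD', 2)
```

The typed leaves are the key difference from a filesystem: a lookup returns both a value and its declared type, mirroring REG_SZ, REG_DWORD, and friends.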


The registry is not a DB or filesystem it is the registry. My main gripe with the post is it keeps comparing it to Unix things.

It is a place to "register" state and configuration to allow other programs and the admin to manipulate that value or make decisions based on it. Not everything hierarchial is a filesystem and while you can argue it is a db, if it is then it is a very specific purpose built key/value storage db.

I won't dissect the post and respond but the registry structure is not a mess, it is well known and different places have different permissions and purpose.


The binary format is rather poor as is the norm on Windows, but you do have to remember one of the goals is that the boot loader can read part of the registry to decide which drivers to load.

The boot loader has to fit in a few sectors so there isn’t really room for SQLite or something.


The Windows Registry is a pretty good key-value store for centrally storing config.

Because some, perhaps many, have abused the originally good idea, does not invalidate its positive aspects.

Also, I do not agree it is a file system. It is a key-value storage system that is leagues better than the win.ini, system.ini, and the prolific ini file collection hell it replaced.

Some form of version control would make the registry better.


It’s impressive because they actually achieved that level of centralization. Most systems aim to have a centralized configuration “place” but it doesn’t stick. With windows the registry really did stick


I always find it interesting when I come across things that seem better from every technical point of view and I hate them, compared to something that seems technically worse.

I spent 15 years maintaining and developing a Windows application and I utterly despised interacting with the registry. It was a nightmare trying to find where a specific key should be written, what the format was meant to be, and browsing the thing with regedit was horrific. And then from a code point of view, win32 APIs for manipulating it were incredibly clunky. I was forever worried I'd accidentally corrupt the whole thing somehow and windows wouldn't boot any more.

Meanwhile on linux I think from a technical point of view, slamming random bits of text into files scattered in random places all throughout the file system in undefined formats is a terrible idea. And I love it. I understand the whole linux configuration and internals so much better because I can browse around /etc and see how every piece of the system is set up.


The article refers to "by the NT debug symbols that one paper has reproduced", linking to <http://amnesia.gtisc.gatech.edu/~moyix/suzibandit.ltd.uk/MSc>, but that link is dead. (If you want your links to be useful to the future, don't title links 'one paper'!) According to the Wayback Machine, it was an M.Sc. thesis by Peter Norris called "The Internal Structure of the Windows Registry" (https://web.archive.org/web/20200808224213/http://amnesia.gt...).


Good enough that d/gconf seemed to like the idea.


i think the windows registry probably sucks because windows was primarily built to enable developers and end users to use microcomputers to make money (where messes are acceptable as long as it makes money) and unix differs in that it was built to enable telephony engineers to operate a telephone network reliably and efficiently.

so if you're an engineer or scientist or care about operations, you like unix style systems because they were built with you in mind. you, like internal bell system engineers and researchers, are the end user.

if you only care about money, windows is your jam because who cares you can always use money to make people suffer whatever horrendous mess that results from using windows and everybody dumping crap in the registry because who cares it makes money.


> unix differs in that it was built to enable telephony engineers to operate a telephone network reliably and efficiently.

This is a corporate revision of Unix's history: it was originally created to run a video game[1]. Later, when Bell Labs gave that team some more resources (in the form of a PDP-11), the first "business" applications they wrote for it were primarily for typesetting and text editing.

Telephony was never a primary design goal for Unix, even if telephone networks were later retrofitted onto it. This is evidenced by the fact that Bell gave early versions of Unix away, as they were forbidden to charge for non-communications technologies and services under a federal consent decree.

[1]: https://en.wikipedia.org/wiki/Spacewar!


perhaps the kernel, but i suspect that much of the userland (which is a lot of what i reference here when i say "unix"- all the command line utilities, shells, the C programming language) was influenced by the needs of both the research community and the technical operation of the telephone network itself.


I'm more than happy to be corrected about this, but my understanding is that the groups at Bell Labs that created and matured Unix were not closely tied into the telephony groups, if at all.

The lack of interest in telephony use cases is evidenced by early releases and "workbench" distributions for Unix: PWB/Unix[1] focused on providing a development environment for programmers, and WWB[2] was aimed at technical editors and writers.

(The functionality of the basic Unix tools reflects this lineage: there a lot of tools for munging text, and very few tools for interacting with peripherals and hardware that isn't a teletype or line printer.)

[1]: https://en.wikipedia.org/wiki/PWB/UNIX

[2]: https://en.wikipedia.org/wiki/Writer%27s_Workbench


Yes, for the large part the computing work at Bell Labs was of a separate and distinct lineage from telephony. The story is somewhat complicated by AT&T's corporate history, as prior to divestiture AT&T was, for the most part, prohibited from selling "computers" as part of the terms of their regulated monopoly. In part as a result of this regulatory situation, the computing work at Bell Labs was viewed as more theoretical than applied. Telephone switching equipment itself, such as the ESS, never ran Unix. This is rather complicated by the fact that some sources (not incorrectly) describe the 5ESS as running UNIX, but in fact the so-called UNIX-RTR they ran was an independently developed operating system that featured partial UNIX compatibility. This was largely for convenience as by that time UNIX was often being used as a development and build environment for switching software.

It's easy to see why UNIX wasn't really involved in the telephone system itself: UNIX was designed as an operating system for mid and minicomputers such as the PDP that featured a largely "conventional" (from the modern perspective, the matter was less settled at the time) Von Neumann architecture with opportunistic scheduling. Telephone switches, back to the 1ESS and back to the XBAR if you choose to view it as a computing machine, tended to be Harvard architecture with real-time scheduling. This was viewed as far more suitable for telephone equipment since it had extremely high uptime and reliability requirements compared to computers. This comes from the different applications: in the '70s computers were still viewed as machines for offline processing in batch mode, where failures were handled by backing out and restarting the batch. Telephone switches were online machines that could not restart their work from the beginning without disrupting calls in progress. These were basically two completely separate lineages of computing machines that had little in common until the '90s, although the PDP itself was an important step in eroding that divide since it was popular for process control applications (this history relates to the reason the PDP was called the PDP).

Keep in mind as well that telephone switching equipment of that era made heavy use of hardware reliability measures (lockstep synchronization of redundant control modules and a large amount of "safety" logic implemented in hardware) as well as running almost entirely from read-only storage (in its most extreme form, the metal punchcards used by the 1ESS). These were further factors that made phone switches dissimilar enough from "computers" that they ran quite distinct software, which usually lacked many of the elements that we would now consider part of an "operating system" (e.g. dynamic process scheduling). To some extent this is still true although e.g. Nortel DMS is more often running off of RHEL-based controllers now.

General-purpose mainframes and minicomputers were heavily used within AT&T but for offline applications such as billing, accounting, and maintenance management. For example, many electronic exchange switches were originally paired with a PDP-11 class machine that tracked maintenance and fault data for the office to replace the original paper tickets. Over time these machines gained more capabilities such as automatic line testing, but they remained "external" to the phone switch for reliability reasons. Billing was, for the most part, done by physically removing the tapes from the switch's data recorder and loading it into a computer to totalize and generate bills. Later on this process was automated but still with a clear separation of the switching system and the billing system.

Post 1984 AT&T, no longer a regulated monopoly, was permitted to compete directly with IBM, Lexmark, etc with their computing division. The venture was an infamous failure, and AT&T never saw real traction as a computer vendor. AT&T's rapid technical development but failure to actually build a sustainable computer business is what brought us the UNIX ecosystem: they built it, but they couldn't sell it, so eventually they basically gave up and licensed it to anyone with a pulse.


good question. i sent al aho an e-mail. hopefully he'll reply with some interesting history. watch this space!


Was there anything specific about Bell Labs or working for the Bell system that influenced the design of UNIX and the userland tools? Did the needs of the Bell system influence any of this work, or was it pure computer science research (and the needs of the computer science research community) that influenced these designs? If awk were "designed for someone" who was it designed for? Did anyone who worked on UNIX or the computing stuff at Bell labs care about building things for telephone network operations or was it pure research and technology development for the art of it?

--

Good question! Many books have been written about why Bell Labs was so successful in creating research innovations that changed the world.

Bell Labs was interested in building operating systems even before the 1950s for use in AT&T's operations support systems for the global telephone network. (You might look at the Wikipedia article on BESYS.) In 1964 Bell Labs joined with MIT and General Electric on a project to create an advanced operating system called Multics. (The article Unix and Multics at https://multicians.org/unix.html provides a lot of useful background. Also see Dennis Ritchie's article, The Evolution of the Unix Time-sharing System, which is a must read on the early development of Unix.)

In the 1960s Ken Thompson at the Computing Sciences Research Center at Bell Labs, Murray Hill, NJ worked on the Multics project. When Bell Labs pulled out of the Multics project in 1969, Ken Thompson on his own decided to build a much simpler operating system which became known as Unix. Dennis Ritchie, also in the CSRC at Bell Labs, joined Ken in creating Unix. Dennis invented the C programming language for this effort. Subsequently, many people in the CSRC and elsewhere contributed to the development of Unix. Doug McIlroy (Ken Thompson's boss and the inventor of coroutines and pipes) deserves a lot of credit for shepherding the development of Unix.

The reason Unix and C became so successful is that they were designed by individuals with very good technical taste and not by committees. The Bell Labs research culture also let individuals have great discretion in determining what direction their research should take and one of the important functions of management was to provide adequate resources to make research projects successful and ultimately beneficial to the development of the global telecommunications infrastructure. Another motivating force for Bell Labs research was to create a patent portfolio for the company that could be used to get access to patents of other companies by cross-licensing.

What I found immensely gratifying by being involved with Unix was what Don Knuth had once told me. He said, the best theory is motivated by practice and the best practice by theory.

As for the motivation behind awk, I recommend looking at https://www2.computerworld.com.au/article/216844/a-z_program...


My impression from a past life working with the innards of Windows was that Microsoft weaponized terrible software engineering practices into a business advantage. The awful underlying implementations made it very hard for competitors to reverse engineer anything and build compatible products. Sometimes, it even made it exceedingly hard to write drivers for Windows because of unresolvable race conditions in the kernel.

Some of the best examples are CIFS/SMB and DCE/RPC over SMB which were bottomless pits of awful.

One of my favorite anecdotes is the SMB print service described as the work of a clueless college student by one of the Samba developers who reverse engineered it.

Microsoft cleaned up very quickly once DOJ forced SMB 2.0 (or whatever it was called) to be documented.


The purpose of the registry is as an obstacle to simplicity.


In theory it is a great concept: "let's put all the configuration in one place." In reality it ends up being super annoying. It is hard to say why, but after a bit of thought I think my main objection to the registry is that it is yet another tree, but this one is out of band. Windows is already bad about this, what with the drive letters, and the way the shell (Explorer) tries to present its own magical imaginary view of the filesystem. The registry just takes that and dials it to 11: now you have a special tree that needs its own specific tools to use.

I think one of the key innovations of Unix was the Unix filesystem and the way it made everything one tree; one unified interface for everything is an amazing concept. In fact I think that the real innovation of Unix was its simplicity. Later this was messed up (yeah Berkeley, I am putting most of the blame for that on you) with sockets, sysctl, etc.: in short, every Unix interface that ignores the one true API, the filesystem (which is for the most part open, read, write, seek, close).

In conclusion, whenever I see a tree whose maker decided that no, we are a special snowflake, and we are not going to attach this tree to the main tree of data on your system (the filesystem), it bugs me. The rogues' gallery here includes GNOME config, sysctl, D-Bus, and yes, the Windows registry.



Thanks! Macroexpanded:

Why the Windows Registry sucks … technically - https://news.ycombinator.com/item?id=1134307 - Feb 2010 (61 comments)

(it's on my list to write software to do this automatically for links to past threads - or maybe to offer users the option when they post a comment with HN links in it)
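The link-detection half of that idea is mostly a regex over the comment text; a minimal sketch (hypothetical, not HN's actual code):

```python
import re

# Match HN item links and pull out the numeric item id. Handles plain
# "item?id=123" links as well as links with a comment anchor appended.
HN_ITEM = re.compile(r"news\.ycombinator\.com/item\?id=(\d+)")

def extract_item_ids(text):
    """Return the distinct HN item ids referenced in a block of text,
    in order of first appearance."""
    seen = []
    for m in HN_ITEM.finditer(text):
        if m.group(1) not in seen:
            seen.append(m.group(1))
    return seen
```

The harder part, of course, is looking those ids up and deciding which past threads are worth surfacing.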


The article was shared yesterday by the author in a comment on yesterday's discussion "Embedding an EXE inside a .REG file with automatic execution":

https://news.ycombinator.com/item?id=32249845#32269623

Also mentioned tools for working directly with registry files from Linux (including within VMs).


I have worked with the Windows registry for at least two decades and it is one of the things I really hate about the design of Windows (and I like the way config files are kept in Linux). A very big virtual file system (with a parallel access API of its own) inside a very big (at that time, slower) file system (and its own API). I wish Windows limited its config to plain files in the file system. Simple and flat, instead of this mess of over 100 MBs which has itself been a source of many vulnerabilities. The legacy baggage is also terrible.


Text files are better because I can use any text editor to update or change them, and leave comments/notes in the file.

I don't understand all the talk about performance - programs should be reading config once on start up and that's it.

Registry keys I guess serve the programmer better by providing types and paths, but user-space libraries can provide those too, and since it's not kernel-level string-handling code, a lot more safely. Then again, for all I know ntdll.dll or kernel32.dll or whatever does actually run its registry API functions in userspace.

Also, since the registry doesn't provide an opportunity for self-documentation, updating configuration using a common, trustable tool (regedit) is generally something you do only as a last resort when armed with knowledge or documentation that you'd have to hunt for elsewhere. Realistically you'll have to use control panel applets or settings dialogs in programs themselves.
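For reference, this is roughly the shape of a .reg export (the ExampleApp key and its values are made up). Note that `;` comments are legal in the file but are thrown away on import, since the registry itself has nowhere to store them; that is exactly the self-documentation gap described above:

```
Windows Registry Editor Version 5.00

; Comments survive only in this file, not in the registry itself.
[HKEY_CURRENT_USER\Software\ExampleApp]
"InstallDir"="C:\\Program Files\\ExampleApp"
"RunCount"=dword:0000002a
```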


> I don't understand all the talk about performance - programs should be reading config once on start up and that's it.

non-microsoft programs were not intended to use it when it was designed. the performance considerations are for windows itself.

Microsoft's guidance has always been for third party applications to manage their own configuration in whatever way they choose, so long as it isn't the registry.

developers ignore that, use the registry, and blame Microsoft for keeping the registry around even though Microsoft promises to do everything they can to keep things backwards compatible.


non-microsoft programs were not intended to use it when it was designed. the performance considerations are for windows itself.

That's not what I remember. Someone at Microsoft had a personal jihad against application-specific .INI files scattered all over the place, and came up with the registry as a centralized solution.

I still use .INI files myself, although Microsoft did successfully manage to get me to stop putting them in the executable directory where they belong. :-P


About performance: try monitoring a non-trivial Windows program with Procmon during its start. I found the intensity of registry activity is quite extreme. Maybe many of those operations (many are by Windows itself) are not necessary, but in the current state, I feel happy that the registry is designed to be performant.


The alternatives sorta fail the KISS principle, since in general the registry grew out of the Win3.x .ini config APIs, which are sorta the equivalent of the files in /etc, but the format was more regular.

Given it's mostly read-only data, I suspect no one thought it was going to grow into the huge monstrosity it's become. Yet at the same time, unless you're looking for a full ACID database, a lightweight solution would probably sit a lot closer to the registry API and format than to something like SQLite.

Someone at some point should have probably said enough, and duplicated some effort and kept the registry for OS/etc settings while splitting off the people who want something closer to an ACID database into an actual database. Of course that has happened a couple times now (aka MDB?).


I'd rather have a registry than a bunch of INI/JSON/XML files and each program bringing its own parser


/etc files have the key advantage that you can extensively document the conf for fellow sysadmins (yourself included) right there next to the actual setting most of the time.


https://www.libelektra.org/home is an attempt to bring a registry to linux. It doesn't have the problems of its windows equivalent. I wish this project was more popular.


Why?

The windows registry is a disaster.

The whole idea is wrong. You want to keep configuration with the thing being configured.

Most Mac installs are "drag the folder on to your disk". You are done.


Microsoft didn't even intend to use it for long. developers discovered apis for it and started using it despite it being unsupported at the time.

Microsoft's extreme backwards compatibility decisions have kept it around since then, and now since it is still supported, it is still used.

the registry was first intended to be an internal implementation detail while something better was designed. us hackers ruined that plan, but we're still happy to blame Microsoft for their compatibility promises and our own misuse of the registry.


I agree. I much prefer my software to store its configuration in AppData on Windows and in the user's home directory on Linux.

It makes migrating all your app settings to new machines or fresh installs way easier.


Isn't app configuration in Mac kept in property list files under ~/Library/Preferences/?


Windows 10 and 11 have moved away from the registry for applications.

It's in

    C:\Users\<username>\AppData\Local\<AppName>
for per-user information

I haven't had to go into the registry on Windows 10 or 11 except for enabling beta/early access features.


>I haven't had to go into the registry on Windows 10 or 11 except for enabling beta/early access features.

I have to edit the registry on every Windows install to have: sane multi-monitor config, working macro keys (suppress the "office" key advertisement in my $300 copy of Windows), suppress OneDrive Personal on a machine that syncs to an O365 tenant, have a functional file explorer, and I'm sure a half-dozen other things that don't come to mind immediately. (On top of numerous group policies, which are mostly just sanctioned registry tweaks, along with programs to remove telemetry from 10/11 which I'm confident do plenty of registry tweaking on their own.)

This is all to control first-party functionality; before I've even installed a third party program. So, to the contrary, I can't remember the last time I went into %APPDATA% to tweak something Microsoft-related.


That's a new alternative to the registry, and some prominent programmers like Raymond Chen have promoted using it, but it's still up to app developers which they want to use. I doubt that the registry is going away any time soon.


%AppData% and its roaming friends have been around since Windows 95. It's not "new". It's just that we've finally gotten some developers to understand what it is for this many decades later, with one of the big humps being when Vista added UAC and security things that should have been obvious in documentation were finally enforced.


That's been around since Windows NT. Whether or not Microsoft uses it for their own (in-box or out-of-box) products is a different question.


AppData\Local or AppData\Roaming, depending on whether the setting is machine-specific or machine-independent.
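That split can be sketched cross-platform; a best-effort helper assuming the standard environment variables (APPDATA, LOCALAPPDATA, XDG_CONFIG_HOME) and the usual fallbacks:

```python
import os

def user_config_dir(app_name, roaming=False):
    """Best-effort per-user config directory for an application.

    On Windows, AppData\\Roaming follows the user between machines on a
    domain, while AppData\\Local stays machine-specific. Elsewhere,
    fall back to the XDG convention (~/.config).
    """
    if os.name == "nt":
        env = "APPDATA" if roaming else "LOCALAPPDATA"
        base = os.environ.get(env) or os.path.expanduser("~")
    else:
        base = os.environ.get("XDG_CONFIG_HOME") or os.path.join(
            os.path.expanduser("~"), ".config")
    return os.path.join(base, app_name)
```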


The Windows registry is sort of a copy of the Mac's "resource fork". The original Mac had files with both a data fork, the file contents, and a "resource fork", which is a tree-like database. Preferences and such were stored in resource forks. This was a good idea, implemented badly. Because it was originally designed for floppy disks, writes were very expensive. So resource forks were left open in an inconsistent state after modification, to be closed at program exit. After a crash, resource forks were damaged. Unfortunately, this flaw persisted long after the Mac line got hard drives and more speed.

Apple was on the right track with the concept that most programs needed a database for their state, but, because of the cram job needed to squeeze the original MacOS into 128K of RAM, got stuck on a bad design. Microsoft didn't have that excuse with their Registry.


files on NTFS also have alternate streams. this is how Windows knows you're executing a file downloaded from the internet via internet explorer, for example.

it's easy to hide stuff in those streams as well.


There's the CurrentVersion key that everything seems to be under


>The Registry binary format has all the aspects of a filesystem: things corresponding to directories, inodes, extended attributes etc.

Like? Always thought it was an hierarchical database.


>> The Registry binary format has all the aspects of a filesystem: things corresponding to directories, inodes, extended attributes etc.

> Like? Always thought it was an hierarchical database.

A filesystem is a hierarchical database.


I probably miss something then. The article also states that it is not a database.


It is a long standing debate, see this thread:

https://news.ycombinator.com/item?id=27939728

basically it has some characteristics of a database and some characteristics of a filesystem (and a filesystem is actually a particular form of database), not entirely unlike the "over or under":

https://en.wikipedia.org/wiki/Toilet_paper_orientation

it is an endless one, both views have their merits (but it is "over" and a filesystem ;))
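One way to see why both camps have a point: strip away the on-disk format and a registry hive (or a filesystem) is just a tree of named nodes with values hanging off them. A toy model for illustration only, not how NT actually stores hives:

```python
class Key:
    """A toy registry/filesystem node: subkeys play the role of
    directories, values the role of files (or extended attributes,
    depending on which camp you're in)."""

    def __init__(self):
        self.subkeys = {}  # name -> Key
        self.values = {}   # name -> data

    def open(self, path, create=False):
        """Walk a backslash-separated path, RegOpenKeyEx-style."""
        node = self
        for part in path.split("\\"):
            if part not in node.subkeys:
                if not create:
                    raise KeyError(path)
                node.subkeys[part] = Key()
            node = node.subkeys[part]
        return node

root = Key()
app = root.open(r"Software\ExampleApp", create=True)
app.values["InstallDir"] = r"C:\Program Files\ExampleApp"
```

Whether you call the walk "opening a key" or "resolving a path" is largely a matter of vocabulary.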


It’s not a relational database is probably what they meant.


That almost 30 years later windows users still have to suffer with the registry is the bitter price of backwards-compatibility.


I think articles like these miss the point that Microsoft's stuff is not intended to be designed in a predictable way. Their goal, above all else, is to protect their monopoly. Making things convoluted, unpredictable, and undocumented is just one way of doing that. Silently failing, and non-specific error messages are another.


When a problem like this has equally bad solutions between Windows or Linux (or macos) it means the problem really isn't solved yet. This is kind of exciting, actually. It is one of those spaces where the XKCD "standards" joke plays out, but there really is an opportunity for someone to come along and solve it.


fun fact: Blender save files are also dumps of memory to disk.

mostly.


Is the registry used for IO - like The mac IO KIt ?


With zero solid evidence to support my belief, I am certain that somewhere in Redmond there is a Windows instance, running "Office," atop a Linux kernel.


I'd be more inclined to believe in just a Linux instance running Office.

The NT kernel is so incredibly different from the Linux kernel, and the Windows shell takes such extensive advantage of it, that I can't even begin to imagine the compatibility layer required.


Apple supported OS 9 running 'containerized' atop what was essentially Openstep for a few years.

Worst-case I could see a Windows GUI running atop a heavily-symlinked Debian, and all Office things running Web-based. Everything else that can't run native could be in obscured VMs. At some point, the NT kernel isn't going to cut it.


> At some point, the NT kernel isn't going to cut it.

If you'd like, would you please say more on this point? What leads you to believe that?


This article is arguably better titled "Why Windows sucks technically" or even "Why Windows sucks"


> This is a far cry from /etc/progname.conf in Linux.

Some of these arguments are not really in good faith. First, this isn't a technical issue with the registry, but with how it's been used over the years and never "refactored" - a distinct issue. Second, I dare you to run "ls /etc" and claim it's not a mess...

Besides, most of the post is how anyone with a text editor and admin rights can bork/hack the machine. As if someone with admin rights couldn't bork/hack the machine in a billion ways. Or if someone with admin rights on linux couldn't write garbage to /dev/sda1 and "hide" something from the OS.


I mean, yeah - /etc is messy. Every configuration file in /etc has yet another bespoke configuration language. And somewhere on your system lies a corresponding buggy, half implemented parser for it.

But /etc has gotten messy in the same way a desk gets messy. There's documents everywhere, but if you pick anything up and take a look at it, you can usually (with the help of google) figure out what that file does and how you can change it. It helps that almost every file in /etc is owned by a single program or library. And just about all of those programs have documentation. (Or, at worst, source code).

In comparison, the windows registry feels like an old, disorganized community's storage space. On first glance things look organized, but actually there's been dozens of half hearted attempts to organize everything over the years by different people, each with their own idea of where everything should go. You don't recognise half the stuff in there. Random ex-employees have been dumping random objects in there for years. But it's impossible to tell at a glance what random objects are critically important, and what is trash. Everyone else has this problem too, so nobody throws anything out, and it's just accreted.

Braver people than you have fallen on their swords trying to tame the windows registry. There be dragons.

I'd take /etc over the windows registry any day of the week. /etc is messy at the surface level. The windows registry is messy like a fractal.


> And just about all of those programs have documentation. (Or, at worst, source code).

Having documentation has nothing to do with reg/etc, win/nix, or whatever.

You have documentation for something in /etc? 99% chance that came from the distro packages.


I love editing files in /etc ending in conf! Some are pseudo-INI (OpenSSL), some are pseudo-XML (Apache2 and friends), some are some kind of unholy amalgamation between C and YAML (nginx, dhcp daemons), others are some kind of TOML derivative (systemd, NetworkManager) and there's also some diet JSON in there! Netplan uses YAML, of course, though JSON will probably also parse. It's always a fun adventure to reverse engineer these file formats.

Of course, most important files contain whitespace separated lines of configuration of which the meaning is only clear if you read the comments above it (cron, fstab, crypttab) or secretly bash scripts (GRUB config and many other files in /etc/default).

Then there's user configuration stuff. Sure, HKLM vs HKCU is kind of weird, but the user configuration structure in the Linux home directory is an argument against intelligent design. ~/.config, ~/.programname, ~/.local or ~/snap/<name>/current/<randomfile>? Roll the dice and find out! You may be able to find the setting you're looking for in the DConf Editor but if you can't find it there, grep and prayer is your best bet.

The Windows registry may be the result of a flawed execution, but I'll take it over the mess in Linux any day. Sadly, the registry has fallen out of fashion in Windows, so now you're managing random configuration files somewhere in %USERPROFILE% (if you're lucky and the programmer hasn't hardcoded C:\Users\<username> as a path) just like on Linux. By the way, who needs the "hidden" FS flag anyway? Just start the filename with a period and pretend the ls-bug-dressed-up-as-a-feature for hiding files and folders is a standard every platform sticks to!
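To the format-zoo point: even the best-behaved member, the INI-ish flavor systemd and friends use, needs care. A sketch with Python's stdlib configparser and a made-up unit file:

```python
import configparser

# A made-up systemd-style unit: sections in brackets, Key=Value pairs.
UNIT = """
[Unit]
Description=Example service

[Service]
ExecStart=/usr/bin/example --flag
Restart=on-failure
"""

# Even "it's just INI" hides quirks: systemd keys are case-sensitive
# (configparser lowercases by default) and may legally repeat
# (configparser raises DuplicateOptionError), so a stock parser only
# gets you part of the way.
cp = configparser.ConfigParser()
cp.optionxform = str  # preserve key case, as systemd does
cp.read_string(UNIT)
restart_policy = cp["Service"]["Restart"]
```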


> some kind of TOML derivative (systemd, NetworkManager)

A bit off-topic, but it's funny to me that you identified these as a "TOML derivative", when both of the mentioned pieces of software predate TOML. Systemd uses D-Bus's flavor of INI. IDK if NetworkManager uses exactly that same flavor or not, but it's pretty similar at least.


Fair enough! I don't know the exact formats and their history, all of that was developed long before I started really using Linux.

Either way, when I edited the files for the first time, it's like "oh, it's like TOML but...". They're not necessarily bad, though nesting of configuration groups isn't always as obvious. I suppose these are part of the evolution of Linux config file design!


> I love editing files in /etc ending in conf! Some are pseudo-INI (OpenSSL), some are pseudo-XML (Apache2 and friends), some are some kind of unholy amalgamation between C and YAML (nginx, dhcp daemons), others are some kind of TOML derivative (systemd, NetworkManager) and there's also some diet JSON in there! Netplan uses YAML, of course, though JSON will probably also parse. It's always a fun adventure to reverse engineer these file formats.

Sure it would be great if all configuration was written in a single configuration format. Life would also be so much easier if there was only one ultimate programming language so we would only have to learn one. Unfortunately reality is messy and for a variety of reasons we have different formats. As a side note, I don't think I ever ran into a syntax error for a system configuration file, because I couldn't figure out the format. The first time that happened to me was when editing a yaml file for CI.

> Of course, most important files contain whitespace separated lines of configuration of which the meaning is only clear if you read the comments above it (cron, fstab, crypttab) or secretly bash scripts (GRUB config and many other files in /etc/default).

Well, there is "man fstab", "man 5 cron", etc., so the file formats and options are documented. And complaining that the default files are so well commented that one understands the different options without actually reading the docs is just weird to me. How is that worse than some random registry key that is not documented anywhere, without comments (they are not possible AFAIK), and without knowing what type it should take?

> Then there's user configuration stuff. Sure, HKLM vs HKCU is kind of weird, but the user configuration structure in the Linux home directory is an argument against intelligent design. ~/.config, ~/.programname, ~/.local or ~/snap/<name>/current/<randomfile>?

Which program saves configuration inside ~/.local? I'm not aware of any and they generally should not. Also, adding snap to that list is a bit beside the point; I don't think containers typically save their configuration in the registry either?

> Roll the dice and find out! You may be able to find the setting you're looking for in the DConf Editor but if you can't find it there, grep and prayer is your best bet.

I actually dislike that gnome uses dconf (which is a better organised registry), I think it would be better if they instead would have text files as well. I understand it's largely for performance reasons.

> The Windows registry may be the result of a flawed execution, but I'll take it over the mess in Linux any day.

So how do you know what registry keys to change to achieve a certain configuration (without documentation, because you didn't like that on Linux either)?

> Sadly, the registry has fallen out of fashion in Windows so now managing random configuration files somewhere in %USERPROFILE% (if you're lucky and the programmer hasn't hardcoded C:\Users\<username> as a path) just like on Linux. By the way, who needs the "hidden" FS flag anyway? Just start the filename with a period and pretend the ls-bug-dressed-up-as-a-feature for hiding files and folders is a standard every platform sticks to!


> Sure it would be great if all configuration was written in a single configuration format.

I think the root cause is there is and was no readily available system standard/interface. I would claim that the majority of the files are doing boring things and could have, and would have, been made in some standard format, and possibly placed in some standard organization, if a system standard had existed. Or, perhaps I've too heavily discounting the desire for nerds to create something new, even when reasonable, but not perfect, alternatives already exist.

Systemd showed me how homogenization and standards are generally disliked, when it comes to things around configuration files, even if there's absolutely massive utility.


> As a side note, I don't think I ever ran into a syntax error for a system configuration file, because I couldn't figure out the format

I remember struggling to get some open source VPN package to work right. I swear I've read through the manuals three or four times but it just wouldn't work and the error messages were meaningless. I know most of the obscure formats now out of experience, but it's still not great. DNS BIND configuration and zones (and the accompanying AppArmor configuration) also caused me more trouble than necessary. Configuring and debugging inetd configuration was also something I never hope to do again.

> Well there is "man fstab", "man 5 cron" etc

Of course there is, but those aren't exactly light reading. It's not that the formats are difficult to find, it's that there's no consistency to the formats themselves with nameless columns that you need to figure out by reading several paragraphs of text.

> Which program saves configuration inside ~.local? I'm not aware of any and they generally should not. Also adding adding snap to that list is a bit besides the point, I don't think containers typically save their configuration in the registry either?

In my ~/.local/etc I see fish and bash_completion.d, for example. I don't know what programs created those, but they're there. I see some PipeWire files in ~/.local/state. ~/.local/share contains many files ending in ".conf" (most of them Flatpak, but also Kodi and GnuPG). I prefer them stuffing their config in .local rather than ignoring XDG, though. IMO Snap fits the list perfectly because it's not sold as "containers", it's sold as an "app store" even though it's focused on GUI containers. And I'm still mad about Snap not even bothering to use a capital letter in their home directory name.

I like dconf in that most settings are actually documented in context. They're searchable, use a standard format, and have structure. The ability to distinguish default values from custom settings is also very nice. IMO it's the Windows Registry but with a good editor and an even worse API.

> So how do you know what registry keys to change to achieve a certain configuration

Most of the time, I can use the find tool to find exactly what I need in the registry. I barely need to touch it anyway since most Windows settings are configurable from the GUI. Sure, it's certainly not the best config management system, but its design is much better than the hodgepodge of configuration files unrelated systems use.

I admit that on disk, the Windows registry format is atrocious. However, the concept of a single, unified, backuppable, remotely manageable configuration system is just much better than "let's stuff some files in /etc and let people find out by making them go through our docs". I don't know where to look for config files on Linux, on most Windows tools I at least know what program I need to open to edit the configuration. I'd like DConf or at least XDG to get used more often because IMO the Linux ecosystem is the worst of two options.


> Some of these arguments are not really in good faith. First, this isn't a technical issue with the registry, but with how it's been used over the years and never "refactored" - a distinct issue. Second, I dare you to run "ls /etc" and claim it's not a mess...

I'd suggest that last bit is not uttered in good faith.

20+ years ago you could install the Microsoft Office suite on the Microsoft OS du jour and you'd find about 12k new registry entries.

You could then uninstall that same Microsoft Office suite from your OS, and ... you'd find no decrease in the size of your registry.

In the same epoch -- if you installed a hundred new applications on your Debian GNU/Linux OS, you may end up with several dozen new directory hierarchies under your /etc directory. If you then uninstalled those applications, the /etc/ entries would be gone. If they weren't, you'd file a report on the Debian BTS and that oversight would be resolved within a few months.

For this reason I think it's disingenuous to claim running ls /etc/ is a mess that's equivalent, in any sense of the word, to the microsoft registry.


I'm sure MS has enough issues with backwards compatibility. Can't imagine handling multiple registry implementations on top of it.


If your argument relies on how things were in late 90s - early 00s, I don't know if it's really worth anything today.


That's probably a reasonable retort.

My response was to the claims that we aren't addressing the technical issues with the registry -- that is, if Microsoft was unable/unwilling to handle registry hygiene, what hope the rest of us? I have not run the same test on a modern OS / Office install and uninstall, but at the risk of exposing my biases, I'm not hugely optimistic.

Tangential aside - at the time I was working with a large Australian telco where we'd developed a SOE with robust roaming plus (effectively) package management for ~1500 desktop applications on a Microsoft Windows 3.x platform, all of which was largely kiboshed by the introduction of the registry.

Secondly, I feel I've already addressed your 'at least it's not as bad as the /etc mess' by pointing out that /etc is not a mess on some, perhaps all, GNU/Linux distros with a good package management system.

To paraphrase - if your argument relies on how slackware handled /etc in the late 90's, I don't know if that's worth anything today.


> Second, I dare you to run "ls /etc" and claim it's not a mess...

In principle, sure. But in practice, using /etc to edit configuration data is not that hard, and every Linux user does it all the time. Different software uses different configuration formats, sure (though it's worth noting that NixOS is doing really clever things to solve that problem), but it's really not that hard or scary.

Manually editing the registry though... That way lies madness. It's much scarier and it's much more fragile.


> Manually editing the registry though... That way lies madness. It's much scarier and it's much more fragile.

How... exactly? There's no way to create a syntax error as trivially as you can in config files. How is using regedit or PowerShell commands more fragile? (I understand "scarier", in the way that everything you're not used to is scary.)


Well, the article points to a way of preventing subsequent entries from being read by having an entry in non-alphabetical order. But how often have you run into syntax errors for system configuration files in Linux? I can't say that this has been much of an issue in my experience.


Start making raw edits directly to an ext4 disk rather than using published libs/APIs, and you'll soon find some new horrors to be scared about.


That's because the article is about doing binary edits to the registry files themselves, something no application or user should ever do. Instead, changes should be made through the API or the regedit program.


> Second, I dare you to run "ls /etc" and claim it's not a mess...

On Unix, "man my.conf" will work for most sane programs.


Offhand, doesn't this kind of perfectly explain systemd fear? Probably what people had in the back of their heads.


(genuine question) What was the historical fear with systemd? That behaviors get hidden behind binaries rather than shell scripts? Has the fear been realized?

All systemd units can be edited easily, they're files in a filesystem. Journald logs are binary but journalctl gives you ways to output it.
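For instance, a minimal unit file (the name and paths here are made up) lives at an ordinary filesystem path and is plain INI-style text:

```ini
# /etc/systemd/system/myapp.service  (hypothetical example)
[Unit]
Description=Example service
After=network.target

[Service]
ExecStart=/usr/local/bin/myapp
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

You can edit it with any text editor, reload with "systemctl daemon-reload", and read its (binary-stored) logs with "journalctl -u myapp.service".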


That's what I'm describing as a reaction at the time even if it didn't turn out to be a big deal.

I'm a long time Linux guy and when I heard about systemd, the first thing it made me think of offhand was the Windows Registry. Now I'm not super-deep on that level of Linux but I could understand the knee-jerk reaction at the time. But you're correct technically, and I think time has proven that there's not much of an issue.


How so? systemd uses text files for its configuration, documented in systemd.syntax(7).


I know. That's why I've been trying to describe it more as a "feeling" than logical?


Therein lies the secret of corporate success. This was not done by an expert - neither by a bearded guru who learned things the hard way nor by an academic who studied databases and data representation, but by a reasonably competent employee. Sort of like a corporate equivalent of the lowest bidder. Thus, the code does only what the immediate requirements were at the time.


And yet the registry is used by billions of people, every day, without issue. So the technical problems must be minor in practice, right? This is all very academic.


Very dismissive take. How many billions of dollars in man-hours have registry errors cost, with the only solution often being to reinstall Windows?


But how can we know if the part that matters is real or not? How many billions of dollars due to registry-specific errors that wouldn't have happened if the app simply used a text file? That's a quantity we can't so easily come up with.

