A reasonable configuration language (ruudvanasseldonk.com)
141 points by todsacerdoti 11 months ago | 109 comments



Reading through this I'm reminded of another article from about a month ago, An app can be a home-cooked meal [0]. Whenever someone starts on a new language or framework, people are quick to ask "why" and bemoan the proliferation of languages and frameworks. I think this article is a good illustration of the "why".

What the author understands that so many don't is that a language can be a home-cooked meal. A programming language is nothing more or less than a tool for use by a programmer. Compilers have a lot of mystique about them because of all the crazy optimizations that production-grade compilers put in, but fundamentally a compiler is just a pipeline that transforms a data structure from a format that the human would prefer to interact with into a format that a specific machine can work with.

rcl is a home-cooked meal kind of configuration language. It was never intended to serve a wide audience, it was intended to solve a specific human's pain points and help that specific human with their task. Its value lies in that it doesn't need to try to be anything else.

[0] An app can be a home-cooked meal (1051 points, 288 comments) https://news.ycombinator.com/item?id=38877423


> a language can be a home-cooked meal [...] a tool for use by a programmer [...] a pipeline that transforms a data structure from a format that the human would prefer to interact with into a format that a specific machine can work with.

You're neglecting the second function that a language serves, which is as a communication device to communicate a program from one human being to another, and in that second function, a language is utterly useless if it lacks critical mass in terms of how many people "speak" it. This runs completely counter to the "home-cooked meal" model.

You run into problems, even if you only have one human being trying to communicate the program to their future self, because there's a tendency to forget a language you don't use regularly, so you might cook up a very brilliant language to solve a problem that you have right now, but that you don't have sufficiently frequently. So you don't practice the language regularly, and then you come back 10 years later and want to change something about the program and you might find yourself in trouble.


I'm working on something that may end up being rcl-like.

My intention is to make it something that can update existing configuration files just as well as -be- a configuration file, such that if somebody finds it useful, they can use it without their colleagues needing to be aware and with the same commit diffs as if they'd done the same work by hand.

How well that will work in practice is still an open question, though a bunch of my experiments in that direction seem to have worked out nicely so far.

But so long as it helps -me- make changes to a repository that are good, so long as I can ensure nobody else working on that repository needs to care that I used it to help, I figure it's worth the attempt.

(also, hey, I'm having fun :)


I know the standard response is "why another one?" but honestly, his diagnosis of the existing options is pretty good. He even uses the best examples of thoughtfully designed config languages: cue, dhall and nix, and ... yeah I pretty much agree with his qualms about those languages.

What is a config language really? It's a language that doesn't allow side effects and evaluates to json.


Sure, it doesn't allow side effects, but what can it depend on? If a file defines a language that downloads a dependency from a URL, does that count? How about if it's automatically cached?

Once you start getting dependencies from others, it seems like a configuration language could drift into package management.


Downloading anything has side effects and can't be allowed by a side-effect-free language. It is pretty rare for a config language to allow downloads.

Downloading dependencies isn't usually done by config languages. Config languages generate a config, JSON or YAML, from source. The config file can then be used by an engine that applies the config, including downloading dependencies.


Thinking inside the program, downloading dependencies certainly doesn’t need to have side effects. It may add new failure modes, that’s all.

If you step outside then sure, it has side effects, but so does simply running the program (any program), in 100% of the cases.


XML isn't rare and it has URLs baked in. But yes, it's a bad idea.


Do you really want GitHub Actions to have no side effects?


The configuration language evaluates to YAML that defines a GitHub action. The configuration language doesn't run GitHub actions.


This is the configuration rabbit hole, and practically any system that becomes widely used or goes through many feature iterations falls into it. Provisioning systems walk the ladder rapidly, because they basically start about five steps down it already:

You start with an INI/yaml/json

Then you might have several overlaid INIs

Eventually: https://docs.spring.io/spring-boot/docs/2.1.13.RELEASE/refer...

Go ahead and make a java crack or something similar. Every line of that hierarchy is bathed in the blood of IT workers in the real world.

Wait! Not even close to being done.

Templating, expansions, "classes", embedded expressions: now we have computation folks!

I don't actually know the computational power of HCL, but eventually you end up with a Turing Complete configuration input that the poster wants.

And I'm actually leaving a lot of things out.

So many systems go through this, picking their own path. Kind of like workflow engines, almost all major systems will end up with a workflow engine in it (and the builds are workflow anyway). But there aren't well-conserved implementations of these overarching meta-patterns, so everyone bespokes it or picks from a mind numbingly vast array of pseudo-matching options.

Good luck!


I defined myself 5 levels:

Level 1 is just values in a file. The Linux kernel uses that.

Level 2 is a list of values, e.g. ini files.

Level 3 allows nesting. JSON, XML, and YAML are here.

Level 4 allows computation but limited. Dhall and Starlark are here.

Level 5 is a Turing-complete language. Python, Javascript, etc.

RCL seems to be level 5, so I'm not sure if there is really an advantage compared to Python.


I think the goal is to have a language that feels like it was designed for level 3/4 but allows you to break out to level 5 as an escape hatch.

Given how many configuration languages eventually end up with a half-assed level 5 (Greenspun comes to mind) it strikes me as, at least, an experiment worth performing.


It's scary to think how many times I've had to evolve configuration options as systems developed and the user-base changed.

As you say, I'd start out with INI files, then move on to JSON; after that would come a hack to allow the JSON to be generated by executing a shell/ruby/perl script (i.e. "--config=!xx" would execute XX and parse the output, instead of reading a file).
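
A minimal sketch of the "--config=!xx" convention described above, assuming the command prints JSON on stdout. The function name `load_config` and the use of `shlex` to split the command are assumptions for illustration, not something the comment specifies.

```python
import json
import shlex
import subprocess


def load_config(spec: str) -> dict:
    """Load JSON config from a file, or from a command's stdout if spec starts with '!'."""
    if spec.startswith("!"):
        # Split the command like a POSIX shell would, but run it without a shell.
        command = shlex.split(spec[1:])
        result = subprocess.run(command, capture_output=True, text=True, check=True)
        return json.loads(result.stdout)
    # Plain path: read and parse the file directly.
    with open(spec, encoding="utf-8") as f:
        return json.load(f)
```

With `check=True` a failing generator script raises instead of silently producing an empty config, which is probably what you want for something this easy to get wrong.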

Later still we started embedding lua, or similar, to allow things to be templated and "dynamic" on a per-host basis.

I was always fond of the Apache-style configuration system, and I guess HCL is close to that, but there aren't any great universal solutions unless you go all-in with scripting, and then you end up with emacs!


I wonder if anyone would take me seriously if I suggested WASM as a configuration language. When run, the WASM would be given imports that provide a small API for manipulating the config DOM, and importing and executing further WASM files. Then you can use whatever high-level config language you want (or, say, a subset of Rust) and compile it to WASM.

Existing JSON/YAML/etc could be compiled to WASM that simply builds the corresponding DOM.

Funny joke?


I think this is what the zellij terminal multiplexer uses for extensions.

It's worth mentioning that in the opposite direction, if you wanted to have an interpreted config option with fewer compile steps, there's also the option to use lua — a language for which the compiler itself is quite small and portable

See,

[1] : https://zellij.dev


using wasm goes into crazy territory imo. It would probably be bigger, and opaque, with no proper errors (unless you compile debug info which would make it much bigger).

I think we already have an almost perfect language, i.e. Lua. Everyone knows it (or can learn it easily). Tiny runtime to embed. Sandboxed by default. Garbage collected. The only feature missing is static typing.


Hum, most systems do not go down that hole at all. Most software stops at hierarchical parameters at most.

For the ones that do go down the hole, you are almost always better off separating them into components, to minimize the ones that need complex configuration, and accelerating the journey of those small pieces. (But yeah, it would be nice to settle on a good workflow description language that isn't a makefile.)

Now, it seems that everybody who is pushing those high-complexity languages wants an infrastructure description language, and just keeps calling it a "configuration language". Since those two problems are completely different, insisting on using the wrong name is quite harmful to the goal.


The article did not advocate for a TC config language. I think Ruud would probably have been happy with a total primitive recursive language.


To me the distinction is not important. It is possible to write programs that take too long to execute in non-Turing complete languages, and when a program hangs, you just terminate it. I think what people really want is for their configuration to remain simple, but not being Turing complete does not guarantee that. For RCL I added a limit on the number of evaluation steps, because without it some programs trap the fuzzer.


We agree. You don’t want universal computation in your configuration. Your “gas” approach is the right one and very weak computationally. I further posit that you don’t even want to offer gas fuelled TC and everyone would be happy with a provably total primitive recursive language fuelled by gas. Why? Accidents happen. I’d rather have my configuration tool tell me statically that some construct isn’t provably terminating than wait to find out later when someone imports my config snippet library and uses it in an unexpected way.
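
To make the "gas" idea concrete, here is a minimal sketch of a step-limited evaluator. It is not how RCL implements its limit; the tiny tuple-based expression format, the `Evaluator` class and the `OutOfFuel` exception are all made up for illustration.

```python
class OutOfFuel(Exception):
    """Raised when evaluation exceeds its step budget."""


class Evaluator:
    def __init__(self, max_steps: int) -> None:
        self.remaining = max_steps

    def eval(self, expr):
        # Every node visited costs one step, so evaluation always terminates.
        if self.remaining <= 0:
            raise OutOfFuel("evaluation step budget exhausted")
        self.remaining -= 1
        if isinstance(expr, int):
            return expr
        op, *args = expr
        if op == "add":
            return sum(self.eval(a) for a in args)
        if op == "repeat":
            # ("repeat", n, body): evaluate body n times and collect the results.
            count = self.eval(args[0])
            return [self.eval(args[1]) for _ in range(count)]
        raise ValueError(f"unknown operation: {op!r}")
```

Something like `Evaluator(50).eval(("repeat", 1000000, ("add", 1, 1)))` fails fast with `OutOfFuel` instead of hanging, but only at run time. The static guarantee asked for above would need a termination check on top of this.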


Configuration languages are an interface to the API presented by the software you are using. There can be a hard boundary between the two — a C codebase for a web server and the XML-like file that configured it — or there can be an imperceptible transition between the two where the configuration is just another module in the code base.

As one of the authors working in a 2000+ module Python codebase, nothing gives me greater pleasure than to drive one part of the codebase by creating another module. Nothing gives me greater sadness than to be forced to interact with something through a YAML config file, command line flags, or launching GitLab pipelines. All three of those boundaries interrupt the powerful electromagnetic force fields that pervade the system: type checking, linting, and symbol finding (IDE integration). In time, I vow to destroy every last one of these non-code boundaries. Another unwanted-boundary demon on the exorcism list is polyrepos.

There once was a time when the world worked like this but instead of Python it was C. It wasn’t as rich as the dynamic world we have today (I’m certain I don’t want to go back to configuring my software by recompiling it), but it did have a lot of the advantages of working in one language environment across all the things.


Awesome description of really fundamental issues. It takes a lot of experience to come to these kinds of conclusions, and people who haven't dug deeply into large enough code bases will never really believe it until they see it, and even then it can take a few rounds. So the config layer complexity keeps coming back while naive but well intentioned folks are striving in earnest for simplicity. It's tricky to argue the point also because either POV can sound like best practice, and decisive action in one direction also doesn't have effects that are immediately visible. It's a pretty long game to see what works best and we never feel the pain that we manage to dodge.


Diff'rent strokes for diff'rent folks (and diff'rent use cases)...

For me, the panacea is to be able to easily do both. I want to be able to use an application as a library from a repl or script by "configuring" it via dataclasses. And I want to be able to run the same application from a cli in different configurations in different production environments by checking in sets of config files that I know are more constrained in what they can do than a generic python runtime.

There's no fundamental reason it is hard to expose both of these interfaces without repeating yourself a bunch!
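
A rough sketch of what exposing both interfaces could look like, assuming a JSON config file and a single dataclass as the source of truth. `ServerConfig`, `config_from_json` and `run` are hypothetical names, not from any real application.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ServerConfig:
    host: str = "127.0.0.1"
    port: int = 8080
    debug: bool = False


def config_from_json(text: str) -> ServerConfig:
    """Build a ServerConfig from JSON text, rejecting unknown keys."""
    data = json.loads(text)
    allowed = {f.name for f in fields(ServerConfig)}
    unknown = set(data) - allowed
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    return ServerConfig(**data)


def run(config: ServerConfig) -> str:
    # Stand-in for the real application entry point.
    return f"serving on {config.host}:{config.port} (debug={config.debug})"
```

From a REPL you call `run(ServerConfig(port=9000))` directly; the CLI path calls `run(config_from_json(text))` on a checked-in file, which can only set the declared fields. Note this sketch checks key names but not value types.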


Interesting, I’m in the middle of a huge project to move us AWAY from compiling a separate binary for each environment. In our case, I don’t see how hardcoding config values can play nice with infrastructure as code tools, or the need to offer an on-premises version of the app.


This, but with Javascript. Perhaps the 'electromagnetic force fields' are less powerful, but the array and object literals are perfect for configuration.


jsonnet[1] and kapitan[2] are the tools I currently use. Their learning curve is not optimal (and I tried to contribute to smoothen it with a jsonnet course[3] and a 'get started with kapitan' blog post[4]), but once used to it it's hard to do without, and their combination makes them even more useful (esp. if you deploy K8s).

In Ruud's case, Jsonnet might have been worth looking at as Hashicorp tools can be configured with json in addition to HCL. But that would have been less fun I guess ;-)

I hope for Ruud it finds its niche, there's quite some competition in this field!

1: https://jsonnet.org/

2: https://kapitan.dev/

3: referral link: https://www.udemy.com/course/jsonnet-from-scratch/?referralC...

4: https://www.yvesdennels.com/posts/starting-with-kapitan/


Kapitan founder here! Thank you for the promotion

Happy to help if you need anything

https://www.linkedin.com/in/alledm


The author mentions Jsonnet in the appendix[^1]:

> I never properly evaluated Jsonnet, but probably I should. Superficially it looks like one of the more mature formats, and in many ways it looks similar to RCL. It has a page comparing itself against other configuration languages.

[^1]: https://ruudvanasseldonk.com/2024/a-reasonable-configuration...


I should have made it clearer, but I wrote my comment in reaction to his mention that he "never properly evaluated Jsonnet".


Kapitan sounds interesting. It definitely looks difficult to figure out. Even your blog post only gives me a hint, leaving me with lots of questions. I am going to dig deeper.


Kapitan founder here! Please reach out on the kapitan channel of the kubernetes slack, or connect with me on LinkedIn https://www.linkedin.com/in/alledm

I have been working on a couple of video tutorials that could be helpful to get started


I have seen many of these posts now and I have a simple question. Why do we need special configuration languages? Why not just use existing languages?

I understand why data formats like JSON and YAML are valuable, and I understand why it would be valuable to use a programming language to automate the generation of such formatted data. But the niche of the configuration language remains mysterious to me.


I think because there are some desirable traits for config languages that don't exist in general purpose languages.

Not exhaustive list but generally:

Usually constrained to reduce complexity and try to eliminate the need for testing config.

Interoperable between many different programming languages.

Readable by programmers working in different languages.

I think this usually makes config languages favour declarative over imperative which usually eliminates most general purpose languages.

Another topic is why do we use configuration at all and what is the difference between code and config.


It also is desirable that you can programmatically process configuration files (example: go through all your docker files to determine which ones use vulnerable images)

That is at odds with using a generic Turing complete language, as figuring out what the configuration is requires running code that may do all kinds of weird things (download data, delete files, etc) (that is similar to the postscript/PDF case. To figure out how many pages a postscript file has, you have to run code)

Now, some will say they trust those writing config files to not do stupid things and only use that power when it is absolutely required, but that’s not an opinion shared by all.
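
As an illustration of why plain data is easy to audit, here is a sketch that scans a JSON config for container images on a known-bad list without executing anything. The `image` key and the image names used to exercise it are assumptions for the example, not a real schema or real advisories.

```python
import json


def find_images(node):
    """Yield every string stored under an 'image' key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "image" and isinstance(value, str):
                yield value
            else:
                yield from find_images(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_images(item)


def uses_vulnerable_image(config_text: str, vulnerable: set[str]) -> bool:
    """Return True if any image referenced in the config is on the vulnerable list."""
    config = json.loads(config_text)
    return any(image in vulnerable for image in find_images(config))
```

The whole check is a tree walk over parsed data. With a Turing-complete config you would have to run the program first to learn which images it ends up referencing.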


I think NixOS/Nix is a good example of where we need one. Nix is used for a massive amount of configuration which needs to be functional and turing-complete. Nix is pretty nice and compact as a configuration language, but the "programming language" aspect of Nix, IMO is ugly and unreadable, and could have been designed better.


People keep hacking up YAML with templating to produce a configuration language which is a bad solution (see for example https://leebriggs.co.uk/blog/2019/02/07/why-are-we-templatin...). The problem is that the people who are using templated YAML don't really want to learn a full-blown language since the vast majority of them never, ever want to write a compiler or a highly concurrent HTTP server. They definitely don't care about functional programming or want to know what a monad is or want to have to learn about first class functions and closures or async/await. They just want a pretty trivial language that you can learn in a weekend or two. The kind of necessary complexity that you might 'add' to such a language before anything else is making it side-effect free so that you can't alter the running state of the system in the language and the purpose is to render a YAML/JSON/TOML config to feed as declarative config into tools like terraform. That focus is much different from a programming language which gives you the ability to execve() arbitrary commands and comes with concurrency primitives that templated-YAML users just don't care about at all.

When you look at it from the perspective of someone who already knows a computer language well it doesn't seem useful, but the point is to address the users who don't know any computer languages particularly well.

And 24 years ago I thought we needed to just use general purpose computer languages and turn system administrators into programmers (or fire the ones that wouldn't learn) but that has clearly never happened, and we're retaining the less-technical-than-SWE roles who manage infrastructure, although it's now DevOps people managing k8s with YAML. If just using a general purpose language met the social needs that we have then it would have happened already. The existence of templated YAML and its overwhelming success proves that there's a need that has to be met. I'm still skeptical that any of these configuration management languages is thinking clearly about what need is driving templated YAML though, but it's nice that there's an explosion of them so that hopefully sooner or later one of them will really stick and become popular.


Don't repeat yourself / single-point-of-truth. Config files can be full of repetition, from values, to sections/objects. Defining a value or object once, dramatically reduces search+replace errors for example.


Is that mutually exclusive with using your normal programming language as a config language?


What do you do when your normal programming language stops being your normal programming language (you add a language, switch languages), but you want to preserve details of the configuration? Translate it all into your new language(s)? How do you keep these things synced up? You extract the details into a configuration language/serialization format and then deserialize in every language you work with.

Embedding your configuration in your application language only works well as long as you have one application language. (Or perhaps you use Lua or TCL which are both almost trivially embedded into every other language.)


You can compile your general purpose language DSL to json and then I think this isn’t too much of a problem.

A bigger issue imo is packaging. General purpose languages aren’t typically well-designed for producing single, understandable standalone files like config languages are. Like I think a simple config DSL in TypeScript could potentially be a perfect way to solve this problem, except that no one wants to lug around a package.json, package-lock.json, and node_modules directory just to write a bit of config. Yet the ability to bring in npm modules, especially those containing relevant types, is where a lot of the attraction of using TypeScript comes from.


So turn the general purpose language into a non-general purpose language by creating a compiler for a subset of it. That might be what we'd call a configuration language then since it's no longer actually the original language (in full). Like JSON and others.


It’s not really a subset that I’m talking about—just a DSL that outputs JSON for easy portability. You could still use all the language’s features.


Is that a real problem to be solved? Do you switch languages in an existing project that often? When you do, do you just copy/paste your config files or do we have to rebuild them in whatever config language the new language uses?


Can you use your normal programming language as a config language?

Config is dynamically parsed at runtime. Do you want to bundle a c++ compiler in your code, for example? Config shouldn't be difficult to write, or to parse and process.

Perhaps zig and rust can be used as configuration languages as they both do some interesting things that other languages don't.


I don’t see why you can bundle a YAML parser but not a Javascript parser

> Config shouldn't be difficult to write

Couldn’t agree more! In fact, it doesn’t matter if it’s hard to parse. Because that’s what computers are good for. What a human does is write and read configs. Whatever we can do to make that easier we should do.

Using a known programming language, rather than a dedicated config language, means the user is already familiar with the language and uses it day to day. That’s easy. What’s hard is learning a totally separate language with its own rules, syntax, tool chain, and capabilities.


There’s no smooth gradient between data format and Turing complete programming language. Let’s say I want environment conditional logic (such as looking up a map of AMIs to AWS regions), but don’t want to allow arbitrary code execution, or I want to embed the parser inside the service that accepts the config. Those are the niches that HCLs fill.

A programming language that allows you to enable language constructs individually might be interesting (imagine being able to easily turn on list comprehensions but not allow imports or to disable mutation).


> There’s no smooth gradient between data format and Turing complete programming language.

S-expressions beg to differ.


Funny you should mention that, I implemented an S-expression based language spelled using JSON for a dynamic configuration use case. It was actually not terrible for a day or two of work, but it was a far cry from a general purpose production grade configuration language due to lack of tooling, intellisense, and a small stdlib. Also, many people found the syntax foreign and confusing; not a deal breaker but just another hurdle.

FWIW I agree with you, but driving adoption would be hard.


I'm guessing here, but maybe you see these configuration languages as templates? However a configuration language is much more than that. For example, jsonnet lets you combine objects and override deeply nested fields [1]. I'm not sure how practical it would be to achieve the same result with a general purpose language.

1: https://jsonnet.org/learning/tutorial.html#oo
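
For comparison, the plain "override a deeply nested field" part is only a few lines in a general-purpose language. This is a sketch of a recursive merge in Python with a hypothetical `deep_merge` helper; it does not attempt Jsonnet's `self`/`super` references or late binding, which is where the real difference lies.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where override's values win, merging nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge them field by field.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, or mismatched types: the override replaces the base.
            merged[key] = value
    return merged
```

What this does not give you is fields computed from other fields that update when an override changes their inputs, so the practicality question above is still a fair one.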


> Why do we need special configuration languages? Why not just use existing languages?

Sure, or you could turn around and ask:

Why do we need languages at all? Surely we can express everything with assembly?

Or, shouldn’t one language work for all cases? Why not just use C everywhere?

And the answer becomes obvious: some languages are better at certain tasks than others! Then, a “configuration language” is just going to be a programming language that is best suited to generating configuration.

The same way that C or Rust have different use cases than C# and Java.


Okay, then in your terms my question would be "why is the best language for configuration not an existing language?"


I think the fact that most configuration languages tend towards being declarative, while many popular languages are imperative would tend towards an answer.

But, like, the answer is because "people had an itch that they couldn't scratch".


> It’s 2024, so RCL has some features that you might expect from a “modern” language: trailing commas

I have the opposite opinion here. Commas should be like semi-colons: only required if you want multiple statements on a line.

Eg

    fruit = [
        "apples"
        "oranges"
        "bananas"
    ]

No commas yet still extremely explicit


That’s not really the opposite opinion… the opposite is not allowing trailing commas (like JSON) so if you remove the last item from a list, you need to adjust the line above as well. And when you add something to a list, you need to know if something will come after it, even though you don’t necessarily know.

Commas as separator is pretty much orthogonal, and… well, better overall.


I got what they were saying. My point is that if you have a line feed then a comma is an entirely redundant piece of syntax. Commas should only be needed if you are placing multiple statements on a single line. Just like how several programming languages treat semi-colons (eg Go, bash, JavaScript, etc). So mandating them is the opposite of what modern languages should be insisting upon.

In Murex (my own language) both commas and semi-colons can be dropped if the statement or expression is terminated by a line feed.

    list = %[
        chair
        table
        "bed-side table"
    ]

YAML, for all of its warts, behaves similarly too.

It’s a much better way to handle lists and objects because the comma only exists for the same reason the semi-colon does in C-like syntax: as a parser hint. But if you’ve got a new line then that hint is completely redundant.


Would your language also accept this?

  list = %[
    chair, table
  ]

And this?

  list = %[
    chair, table,
  ]


Yes it would:

    » %[ chair, table ]
    [
        "chair",
        "table"
    ]

    » %[ chair, table, ]
    [
        "chair",
        "table"
    ]

Here you can see that internally it understands you're writing a JSON object however it relaxes the semantics a little (without going full on YAML) because chasing that erroneous comma is probably the least fun part of programming.


So, accepting the spurious comma at the end of the list and not requiring commas when there's a newline can coexist. I would say they are two orthogonal properties. You can support the latter without opposing the former.
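
To show the two properties coexisting, here is a toy list parser that treats commas and newlines as interchangeable separators and so tolerates a trailing comma. It is not Murex's or RCL's grammar; `parse_list` is a made-up helper that also accepts bare words split by spaces and does not reject doubled commas or unterminated quotes.

```python
import re

# A double-quoted string (group 1) or a bare word without whitespace,
# commas, brackets or quotes (group 2).
_ITEM = re.compile(r'"([^"]*)"|([^\s,\[\]"]+)')


def parse_list(text: str) -> list[str]:
    """Parse '[ a, b ]'-style text where commas and newlines both separate items."""
    body = text.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("expected a list wrapped in [ ]")
    items = []
    for match in _ITEM.finditer(body[1:-1]):
        quoted, bare = match.groups()
        items.append(quoted if quoted is not None else bare)
    return items
```

The newline-separated form, `[ chair, table ]` and `[ chair, table, ]` all parse to the same list here, because the separator carries no meaning beyond "an item ended".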


Sure, but that’s not what the linked article discusses. It specifically states that commas are mandatory on each line. Not that the rule for commas is relaxed. And what I’m arguing is that requiring commas on each line is pushing the syntax in the wrong direction.


Clearly, whether the language considers commas necessary or optional should be in a config file.


It's only redundant in languages that make it redundant, like you mentioned C at the end - it's the new line that's redundant there (or only for readability), it's the semicolon with the 'end' semantics.


> It's only redundant in languages that make it redundant

Yeah, I did say "several" not "all".

There is very little benefit in new languages making terminator tokens necessary when followed by LF. We are long past the point of needing to optimise language syntax for parser efficiency.


In this specific case it's not redundant in C - without the commas, the strings would be merged together.


Exactly, that's what I said? 'In this specific case' though? It would be news to me (not that I've used it since university!) that they're ever non-semantic?


Why do you need a comma on a single line in 2024, you already have 2+1 extra symbols/value to separate: quotes and space


Well in my own language commas are entirely optional. They actually only exist for JSON compatibility (ie pasting JSON blobs into the language rather than transcribing the data). But I will concede that having a comma separated list on a single line does at least improve readability for some people. In that particular scenario we are drifting into the laws of diminishing returns and thus things become a lot more subjective.

However having a comma terminator when followed by a LF doesn’t. The whitespace is larger and more visible than the comma. In that scenario the comma doesn’t add enough visual distinction to add value to the syntax.


I don't mind some people thinking 1 is more readable than 2

    1: fruit = ["apples","oranges","bananas"]

    2: fruit = [apples       oranges         bananas]

I was only talking about commas being required like semicolons: multiple statements use spaces within them, so they need something else as a separator, unlike lists, so it's not comparable


commas are only really required if there are binary operators: [1, 2 + 3, 4]

otherwise you need parentheses around expressions with operators: [1 (2 + 3) 4]

unary operators don't create a need for commas, nor the f(x) function call syntax with a single parameter: [1 f(x) ~2 ~f(x)]

now, mild redundancy in syntax is a good thing. it enables detecting whether some code is mistaken (perhaps because a botched copy and paste, for example), giving syntax errors


it's needed if you use space within, otherwise

    [1    2+3     4]

doesn't need a comma, and the rule "if you value space in values, quote/bracket those" is a fine simple rule

Redundancy is also a bad thing; it's a tradeoff, so the proper approach is to allow enabling it for those people and in those cases where the value of the good part is higher


That example syntax has so many potential footguns though.

I’m not someone that subscribes to the notion that saving keystrokes increases productivity (eg function key words being “fn”). I only think it’s worthwhile removing tokens if they add to the cognitive overhead. In the case of trailing semicolons and commas, it’s something you need to remember to add. So why not make it optional?

However using whitespace as a delimiter and having expressions require zero spaces between tokens is adding to a developer’s cognitive overhead. As well as inviting hard to find bugs.


In the case of non-trailing commas it's exactly the same - something you need to remember to add. And remember to (re)move when you move arguments around, that's also buggy cognitive overhead.

Expressions don't require zero spaces, use (), this was just an illustration that there is no need for a comma.


> In the case of non-trailing commas it's exactly the same - something you need to remember to add. And remember to (re)move when you move arguments around, that's also buggy cognitive overhead.

That’s exactly why I’m advocating that trailing commas should be optional ;)

> Expressions don't require zero spaces, use (), this was just an illustration that there is no need for a comma.

Your example specifically stated using zero spaces for expressions though. So that’s what I’m replying to.

I have no issues with parentheses being used to denote an expression. In fact that largely mirrors how my own language works.


> I was struggling with that day was to define six cloud storage buckets in Terraform...The kind of thing you’d do with a two-line nested loop in any general-purpose language

I understand this is just an example, but FYI the modern solution is to use CDKTF rather than HCL for Terraform.

That allows you to choose your favorite general purpose lang: Python, TypeScript, Go, Java, C#.


I use Terraform because of HCL, the absolute best thing about it is the declarative config; if someone insists on using their favourite general purpose language: 1) they're wrong; 2) there's plenty of other options for that and I'm not interested in fighting in Terraform's corner knowing they won't use it for the most fundamental reason it's good.


You do know all that CDKTF does is compile programming language code to JSON (1:1 with Terraform's HCL)?

CDKTF is not accessing the Terraform state directly, it's allowing you to express the configuration that you would normally write in Terraform in a programmatic way.
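To illustrate the "compile code to JSON" idea without depending on CDKTF itself, here is a minimal Python sketch that uses a two-line nested loop to emit Terraform's JSON configuration syntax. The environments, regions, bucket names, and the choice of `google_storage_bucket` are all hypothetical; the thread doesn't say what the six buckets actually were.

```python
import json

# Hypothetical inputs: 3 environments x 2 regions = 6 buckets.
ENVIRONMENTS = ["dev", "staging", "prod"]
REGIONS = ["europe-west4", "us-central1"]


def bucket_resources(environments, regions):
    """Build a Terraform JSON "resource" block with one bucket per combination."""
    buckets = {}
    for env in environments:
        for region in regions:
            # Resource labels are kept to letters, digits and underscores.
            label = f"{env}_{region}".replace("-", "_")
            buckets[label] = {
                "name": f"example-{env}-{region}",  # hypothetical naming scheme
                "location": region,
            }
    return {"resource": {"google_storage_bucket": buckets}}


config = bucket_resources(ENVIRONMENTS, REGIONS)

if __name__ == "__main__":
    # Terraform reads files ending in .tf.json as JSON-syntax configuration.
    print(json.dumps(config, indent=2))
```

The output is plain declarative data, so Terraform still diffs desired state against stored state as usual; only the authoring step gains loops.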


So, your reply to Ruud is "you're holding it wrong" or "it's actually fine, you just think it isn't"?


No, principally because although he writes:

> Aside from Nix, Python, and HCL, which I’ve already discussed extensively,

He doesn't seem to have much more to say about the latter than that.

'My reply', bluntly, is something like 'maybe it has shortcomings, you haven't addressed them; its most fundamental key advantage is not addressed at all, about it or any other language, and is not a feature of your new one'.


HCL has its quirks - but I'm not entirely sure they are without merit in this case - HCL describes resources - and it's important that when your iteration count goes down, resources are deleted. Some discussion in sections 6, 7 and 8 (linked):

https://blog.boltops.com/2020/10/08/terraform-hcl-for-in-and...


> it's important that when your iteration count goes down, resources are deleted

Correct, but that has nothing to do with HCL and everything to do with stored state.


Well, ultimately it comes down to "current"(stored) state vs "desired" state, as mentioned in part 6:

https://blog.boltops.com/2020/10/06/terraform-hcl-nested-loo...

Quote:

Consideration: Updates with Removal

There’s a subtle but important consideration with the current code. It happens when the code gets updated, particularly when previously added elements are removed.

For example, let’s say we first use the code above and run a terraform apply. That creates security groups with rules. Then we delete the rules from the code. Running terraform apply again will not remove the rules.

This is because when there’s an empty List, the for_each loop never iterates. If you wish for the security group rules to maintain its current state set outside of Terraform, you may want this behavior. However, this is probably unexpected and undesirable behavior.

If you want to have Terraform remove all the security group rules, then ingress needs to be assigned directly with a List. We’ll cover how to do that shortly.


You are likely already aware of this, but for other readers, you actually have to use Pulumi if you want native language feature support.

I agree CDK for TF can be better than HCL, but it is still more like a template preprocessing utility for Terraform and thus still carries the limitations of Terraform.


> but FYI the modern solution is to use CDKTF rather than HCL for Terraform.

That's an odd take. Are you saying that because it's newer? I would push something like Crossplane as more "modern" in that it solves the critical issue of Terraform not having any sort of reconciliation loop.


Because it is a (recent) solution to the problem ("how to easily loop, etc in Terraform").

There are Terraform alternatives...Cloudformation, Pulumi, Crossplane...but that's a separate matter.


Gotcha. I guess I see it more as an alternative rather than a modern evolution. I don't particularly like the direction of these SDKs, but I see an increasing demand for them.


A reasonable configuration language blog post. Explains the motivation, provides some meaningful examples and a happy accident, and concludes with an overview of prominent alternatives.

Worth a read, and worth trying out sometime, if nothing else because I too don't use jq often enough to remember more complex usages.


Although I'm late to the party, I'm surprised by only 4 mentions of "Lua" in all of the comments.

At its basis, a valid Lua file can consist of one table. Then the data description will be very similar to how people operate with JSON files. At any point the user is able to write code or use other standard library functions... which is pretty limited, but at the same time it's easy to sandbox plain Lua away from "dangerous" functions such as os.exec or the debug library.

There are only a few caveats:

- Lua uses a 64-bit IEEE float format with some truncated precision

- The standard interpreter doesn't handle very huge tables for parsing

- Treating .lua as code, not data, you'd likely prepend "return " to your table to just receive the table object from the config file

- Unkeyed arrays start from 1, though storing at index 0 is possible

- Cascading of data definitions is possible, like fall-through from one semi-filled table to the default table. Use the __index metamethod

Personally I find Lua's table syntax vastly more readable than JSON. Although the language naturally has comments, the parser ignores them, if that's something you want. But the same workaround works: just store the comment in some keyname_comment value.

Apart from sandboxing you'd only need to limit the script execution time to stop attempts like "while true; end"

PS: Lua started as a configuration language. Wiki:

> Lua's predecessors were the data-description/configuration languages SOL (Simple Object Language) and DEL (data-entry language).


For templating/generating YAML it's worth a look at https://yglu.io/ and ingy's latest piece of madness, https://yamlscript.org/

I make no claim that anybody who does look will like either of them, but I do claim they're worth a look even if it turns out that you don't.


I feel like just reading about these took years off my life.


Great start! I'd like this even more if you dropped the python-style f-strings for a JavaScript or HCL flavor instead.

You noted at the bottom that RCL's type system may take the path of TypeScript. This is a brilliant idea because structural type systems match perfectly with a configuration language.


IMO Nickel looks better in this regard as it allows you to use typing only when it helps. Plus, it has contracts that complement the type system.


I want to showcase one way to do the example jq/non-jq query using yamlpath:

  yaml-get -p '.tags[has_child(amd)][parent()].name' machines.json
That can replace

  rcl query --output=raw machines.json '[
    for m in input:
    if m.get("tags", []).contains("amd"):
    m.name
  ]'
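For comparison, the same query is a one-line list comprehension in plain Python. This is only a sketch: it assumes `machines.json` holds a top-level list of objects with a `name` field and an optional `tags` list, which is what the rcl query above implies, and the sample data is made up.

```python
import json


def names_with_tag(machines, tag):
    """Return the name of every machine whose optional "tags" list contains tag."""
    return [m["name"] for m in machines if tag in m.get("tags", [])]


# Hypothetical sample data standing in for machines.json.
sample = json.loads("""
[
  {"name": "alpha", "tags": ["amd", "rack1"]},
  {"name": "beta", "tags": ["intel"]},
  {"name": "gamma"},
  {"name": "delta", "tags": ["amd"]}
]
""")

if __name__ == "__main__":
    print(names_with_tag(sample, "amd"))
```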


I quite like what has been built here. In particular, I understand exactly how HCL would drive someone down this path, as it is infuriating to try to get that language to compute what you want computed.

I think at the end of the day, k8s YAML is focused on the datatype, i.e., what would be the result of your `rcl evaluate`. I do think this would be a much saner path than what Helm provides, though, and Helm's textual templating, as opposed to understanding values and building the actual datastructures, and then converting that to a format like JSON or YAML … is the wrong path. (I.e., I can see RCL being a possible replacement for Helm.)

> The language is a superset of json.

I'm going to introduce what I think is pretty much a universal law: languages claiming to be supersets of other languages are not supersets.

In the case of JSON, there's basically one counter-example that breaks all alleged supersets:

  "\ud83d\udca9"
And if we try it:

  » printf '"\\ud83d\\udca9"' | cargo r -- evaluate
      Finished dev [unoptimized + debuginfo] target(s) in 0.01s
       Running `target/debug/rcl evaluate`
  stdin:1:2
    ╷
  1 │ "\ud83d\udca9"
    ╵  ^~~~~~
  Error: Invalid escape sequence: not a Unicode scalar value.

  Help: For code points beyond U+FFFF, use '\u{...}' instead of a surrogate pair.
Now! This to me is not a bug in RCL: this particular facet of JSON is utter crazy town, and I would strongly encourage you to not adopt it. (Since down this road lies madness, like unpaired surrogates. JSON's grammar & standard is sloppy here, and JSON/JS's syntax of using the UTF-16 encoding, and not just the scalar value … it's the part of JavaScript that is just what JS is, but is the part that we shouldn't be copying.)
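To make the surrogate issue concrete, here is a small sketch using Python's standard `json` module as the "actual JSON parser". It shows the escaped pair decoding to one scalar value, and an unpaired surrogate being accepted by the parser but producing a string that cannot be encoded as UTF-8. The variable names are just for illustration.

```python
import json

# The JSON text from the comment above: a UTF-16 surrogate pair, escaped.
paired = json.loads('"\\ud83d\\udca9"')

# Python's decoder joins the pair into the single scalar value U+1F4A9.
print(len(paired), hex(ord(paired)))

# An unpaired surrogate is also accepted by this parser...
lone = json.loads('"\\ud83d"')

# ...but the result is not a sequence of Unicode scalar values,
# so encoding it as UTF-8 fails.
try:
    lone.encode("utf-8")
    encodable = True
except UnicodeEncodeError:
    encodable = False

print(encodable)
```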

It is much saner to just have a flag/way to indicate "my input is JSON" and then to just parse it via an actual JSON parser. Then let RCL evolve on its own merits. If there's a lot of happy overlap, and most JSON documents are blissfully polyglots with RCL, that's fine too / a happy little accident. (But the option is important if you want to apply it somewhere programmatically, where the inputs are JSON.)

YAML breaks the same way, too: it too is a "superset" of JSON that isn't.


I've seen people criticizing Go for not being correct enough, e.g. for its way of handling dates among other things. But I think Go has a saner balance of correctness vs. simplicity. I don't like being forced to do stuff correctly all the time.

I somewhat agree with your point on having an explicit way to indicate JSON input. OTOH, in practice I'd rather prefer having a section about limitations in the docs.


Have you ever run into a case where this distinction matters in practice? If it's a superset of all JSON that actually exists in the wild, that feels functionally the same as being a superset of JSON, even if the spec technically allows edge cases it can't handle.


I’m curious to know what this guy would think about pkl.


how is this different from the other one posted yesterday https://news.ycombinator.com/item?id=39232976


This looks very restrained. Well done!


> The kind of thing you’d do with a two-line nested loop ...

> that it was far simpler to just copy-paste the config six times.

And that's the right way for infrastructure and build configs. You do want over-verbose, no-logic code in the configs. Unlike app code, you rarely need to change it, and readability is even more important. 6 duplicates is not too many yet for infra code.

In terms of copy-paste tolerance, I'd rank from least to most: app code, test code, infra configs, and build pipeline configs.


And yet, sometimes you do want to reuse code in infra configs and build pipelines. I have definitely wanted to reuse code in hcl, to the point that I wrote a Ruby script to generate it because that was easier than trying to do it in hcl.

Clearly I'm not the only one. Gitlab even sees the use enough to support syntax that is invalid yaml: https://docs.gitlab.com/ee/ci/yaml/yaml_optimization.html#re...

This, of course, causes yaml-language-server to complain. What I wouldn't give to have a language where I could, say "goto definition" for things like this rather than get LSP errors.


I agree with you in a lot of situations but I think there are situations where the intention you want to communicate is "these 6 situations are the same, except for this variation, because X". If you duplicate, it can be hard to capture in a way that is as robust. And then you risk logical errors in your configuration, which can be very damaging.


The problem isn't "configuration"

Yaml, JSON, xml, text files... all work great for configuration.

But the assumption is that you're configuring a piece of software on an already existing system.

A config file as a means of setting up a system, installing software, and establishing how it's going to run is exceedingly stupid.

Write your app so it can run on bare metal, install with apt, yum or your tool(s) of choice for your org. Build it so it scales diagonally, and works against spot instances where you can (because sometimes more or less cores are cheaper). Don't even get me started on the nonsense of everyone and their ideas for "secrets management", it makes me miss LDAP.


> Write your app so it can run on bare metal, install with apt, yum or your tool(s) of choice for your org.

If it's running on bare metal, why are you using apt or yum? Those are parts of an OS you wouldn't have if you're running your program without an OS.


Are you suggesting that all configuration should be dynamic and multi-tenant (for the lack of a better word), not happen at startup?


Scripting a config file. Templates for config. We did this already, xml, XSD, XSLT, DOM...

It sucked.

Config should be easy and flat and simple. If it isn't that, source it from code, or from a service... Most of the nonsense of config madness is a byproduct of, or hidden in, containers. If we were writing installable software much of that would go away...


What does this magical config service do? Take data in from a request and return transformed data back in response to configure the caller? What is the essential difference between this config service and existing config languages?


Wow! Another configuration language post. 2nd day in a row. And that too on the front page. It's 2024 and the software industry has not solved the problem of configuration yet. Umpteen solutions every year - languages, formats, etc - all for just doing configuration. Just wow! It amazes me. It would be so much better if all the great minds focused on real issues than keep spinning on the hamster wheel of config file formats and languages.


I spent my PhD working on one of the most pressing problems facing big cloud companies at the time: downtime. The big problem being faced was… configuration.


What in your humble opinion is a real issue that should be addressed instead?



