A type system for RCL, part 2: The type system

kccqzy · 2024-07-18T20:21:41 1721334101

The author's generalized subtype check is interesting: what the author defines as "inconclusive" is clearly a type error in more traditional statically typed languages. (Traditionally the type checker is supposed to guarantee progress and preservation, so leaving something inconclusive to the runtime isn't acceptable.) And I guess the kind of differentiation the author makes between "inconclusive" and "static error" is so unusual that the normal classification of covariance and contravariance breaks. This reminds me of Haskell, where type variables may have user-provided annotations of whether they are representational, nominal, or phantom. Such classification affects the ability of values to be treated at runtime as a different type (using Data.Coerce). Maybe the author can come up with a similar system for classifying type variables for things like List.

However, for a configuration language that is type checked and immediately run, there's not much difference between runtime type checks and pre-run static type checks. Given this escape hatch, I think the author is right to simply defer everything that cannot be checked statically into runtime checks.
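To make the three outcomes concrete, here is a minimal Python sketch of a three-way subtype check. It is not RCL's actual implementation: the `Check` enum, the `check_subtype` function, and the toy universe of just "Int", "String", and "Any" are all assumptions made up for illustration.

```python
from enum import Enum


class Check(Enum):
    """Hypothetical result of a generalized subtype check."""
    OK = "ok"                      # statically known to be fine
    INCONCLUSIVE = "inconclusive"  # only a runtime check can decide
    STATIC_ERROR = "static error"  # can never succeed


def check_subtype(actual: str, expected: str) -> Check:
    """Toy three-way check over the types "Int", "String" and "Any"."""
    if expected == "Any" or actual == expected:
        return Check.OK
    if actual == "Any":
        # The value might or might not fit; defer to a runtime check.
        return Check.INCONCLUSIVE
    # Two distinct concrete types: no value can satisfy both.
    return Check.STATIC_ERROR
```

Under these assumptions, `check_subtype("Any", "Int")` is inconclusive and gets deferred to runtime, while `check_subtype("String", "Int")` is rejected up front. A traditional checker would reject both.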

cogman10 · 2024-07-18T20:30:18 1721334618

Can't say I really dig using runtime type checks as a get-out-of-jail system for harder-to-verify problems. The problem I have with that methodology is that it creates runtime errors that could otherwise have been caught statically.

Consider, for example, the given example

    let a: Int = 1
    let b: Any = a
    let c: Int = b

In this simple case, it seems like everything works fine. But what if instead of `b = a` we had `b = foo()`? Now, all of a sudden, discovering whether or not `b` can be safely assigned to an `Int` depends entirely on what `foo` returns. If `foo` returns `Any`, then you can have a mess where some refactoring ends up introducing a hard-to-detect bug.

Consider, for example, if `foo` is changed from

    foo() { return 1 }

to

    foo() { if(bar) return "a" else return 1 }

That could be fine, given `foo` is already sending back `Any`, but now when `bar` is true, the returned string could cause a runtime exception far from where `foo` was defined.
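Here is the same scenario as a small Python sketch. The names `check_int` and `use_foo` are hypothetical, and `check_int` stands in for whatever runtime check the language inserts where an `Any` flows into an `Int`; this is an illustration of the failure mode, not how RCL implements it.

```python
from typing import Any


def check_int(value: Any) -> int:
    """Hypothetical runtime check inserted where an Any flows into an Int."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"expected Int, got {type(value).__name__}")
    return value


def foo(bar: bool) -> Any:
    # After the refactoring: one branch now returns a string.
    return "a" if bar else 1


def use_foo(bar: bool) -> int:
    b: Any = foo(bar)
    c: int = check_int(b)  # the error surfaces here, not inside foo
    return c
```

Nothing about `foo`'s signature changed, so nothing flags the refactoring. The failure only shows up in `use_foo`, and only on the code path where `bar` is true.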

To me, this seems like giving up a large portion of the benefits of having typing in the first place.

Dart 1.0 had a similar system with "optional typing", which was ultimately axed because of issues like these [1]. In my view, `Any` turns this sort of type system into optional typing.

[1] https://news.dartlang.org/2017/06/a-stronger-dart-for-everyo...

adamwk · 2024-07-18T23:45:44 1721346344

I agree. It seems like you're back to the type systems of C or Java 1.6. The very type systems that drove people to Ruby and friends in the previous decade.

lmm · 2024-07-19T00:18:47 1721348327

WTF? No. Those have the almost opposite problem: they're famously inflexible, meaning you end up with lots of repetition, or else people resort to reflection or casting.

zupa-hu · 2024-07-19T06:34:00 1721370840

I enjoyed the thoughtful article.

It's interesting that type checking happens at runtime so there is no compile time, only runtime. It seems this would make it acceptable. But I wonder if the assumption will always hold.

The assumption:

> The program doesn’t even have any inputs: all parameters are “hard-coded” into it. [1]

Will that always hold? What about environment variables? If the language ever needs to handle parameters, this assumption breaks and thus compile time and runtime will be separated.

Other languages like Go would force you to insert a type assertion, which is the same thing, just explicit. It forces you to "approve" what's happening, and that's a good thing.
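For comparison, this is roughly what the explicit version looks like, sketched in Python rather than Go. The `narrow` helper and the `port_from` function are made-up names for illustration; `narrow` plays the role that a type assertion such as `x.(int)` plays in Go.

```python
from typing import Any, Dict, Type, TypeVar

T = TypeVar("T")


def narrow(value: Any, expected: Type[T]) -> T:
    """Explicit, programmer-written narrowing from Any to a concrete type."""
    if not isinstance(value, expected):
        raise TypeError(
            f"expected {expected.__name__}, got {type(value).__name__}"
        )
    return value


def port_from(config: Dict[str, Any]) -> int:
    # The call to narrow() is visible at the use site: the programmer
    # has to write down, and thereby approve, the conversion.
    return narrow(config["port"], int)
```

The runtime behavior is the same as an implicit check. The difference is only that every place where it can fail is spelled out in the source.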

There is the saying that every configuration language will eventually evolve into a full blown programming language. I guess it's all about the real world proving the underlying assumptions to be incorrect.

[1] - https://ruudvanasseldonk.com/2024/a-type-system-for-rcl-part...

ruuda · 2024-07-19T20:49:21 1721422161

> Will that always hold?

It’s hard to be sure about whether anything will always be true, but I intend to not expose environment variables, CLI arguments, or other impurities to RCL.

> What about environment variables? If the language ever needs to handle parameters, this assumption breaks and thus compile time and runtime will be separated.

This is more of an argument about the definition of “input”. In RCL you can import other RCL/json files (as structured data) or any UTF-8 file (as a string). If the evaluation of the program depended on environment variables (or if you could pass input values on the CLI), that would be very annoying for reproducibility. You’d have to specify out of band how to invoke `rcl`, and how to pass the right values. Instead, we can simply write those values to a file and import them. For my definition of “program”, by doing that, they have become part of the program, so we again have a program with no _external_ parameters.

Regardless of whether you call them inputs, the person who sets these values also immediately evaluates the program, so if the combination of parameters causes an error somewhere, the programmer can fix the input or the program. That’s different from a daemon, or program deployed to users, where the programmer does not yet know at development time which values can occur at runtime.

I can think of one case where this does not hold. What if you want to develop a _library_ of RCL code, with functions where you don’t know how they will be called? I don’t think that RCL programs should grow so complex that this is needed, but maybe I am too naive or nearsighted about that, and it will happen anyway. (Sharing schemas may be useful, but they only define types, not functions.) Even in that case, a configuration language is forgiving, because when the program is evaluated there is a programmer running it, even if it is not the person who wrote the function where the error originates.

(What if you build some system that autonomously generates json files and calls an RCL program that imports them, without a human watching? Sure you can do that, but then if it fails at runtime, I think that’s a case of the Baton Roue meme.)

I guess the summary of this is: for some languages, yes the type system should be very strict, and you want to know that when you write a function, it can’t fail at runtime. But for a configuration language, that is not a problem that the type system needs to solve. I wrote more about this in part 1: https://ruudvanasseldonk.com/2024/a-type-system-for-rcl-part....

kccqzy · 2024-07-19T21:25:01 1721424301

> What if you want to develop a _library_ of RCL code, with functions where you don’t know how they will be called? I don’t think that RCL programs should grow so complex that this is needed, but maybe I am too naive or nearsighted about that, and it will happen anyway.

This is definitely happening. A module system is essential for configuration languages. I worked with multiple in-house configuration languages and all of them ended up needing that.

zupa-hu · 2024-07-19T22:56:52 1721429812

Thanks for the long answer. Okay, it's good you have thought it through. Will be interesting to see how it evolves.