More

akiarie · 2024-06-24T09:02:17 1719219737

Not even remotely.

  if (x > 0) {
    /* code */
  } else {
    /* more code */
  }

Branch coverage just means that all the lines above are executed in some test. That's fundamentally different from dealing with all the different values that `x` could take on at run time.

ffsm8 · 2024-06-24T09:37:36 1719221856

You're missing the point.

Given your example no software can ever handle this equation given infinite numbers, as all CPUs have a maximum number they can compare.

Just like you cannot prove that a bridge will be able to handle a given load. Engineering fundamentally can only make assurances within expectations. Your tests define these expectations, and by covering 100% branching coverage you've exhaustively explored all valid states your software can have.

Do note I'm talking about actual 100% branching coverage here, not just 100% coverage of your own code, ignoring libraries and runtimes you've imported. That's way harder then you seem to think and the reason why sqlite (the linked page) has so much more testing code then application code

akiarie · 2024-05-25T13:43:18 1716644598

0db is an interface for Xr0, so it's the same thing essentially.

Alifatisk · 2024-05-25T15:35:45 1716651345

Yeah but I think the difference is that Xr0 is the checker you run your code through and it spits out the result.

akiarie · 2024-04-06T04:13:09 1712376789

It cannot in general deduce the abstract. Doing so would require solving the halting problem.

muldvarp · 2024-04-07T10:46:04 1712486764

It also can't in general deduce whether the abstract and the implementation match, as doing so would require solving the equivalence problem (which is undecidable for anything as or more powerful than pushdown automata).

akiarie · 2024-04-08T12:09:23 1712578163

I'm not very familiar with the equivalence problem, but at least for Xr0 abstracts (what we call the annotations attached to functions & loops) there is an extremely strong structural similarity between the implementation and the annotation. So it's not at all clear that we'll run into this.

muldvarp · 2024-04-08T12:33:36 1712579616

> there is an extremely strong structural similarity between the implementation and the annotation.

If the annotation is basically just a copy of the function body why have any annotation at all?

> So it's not at all clear that we'll run into this.

It might not be clear to you, but if you want to allow any annotation with the same semantics as the body of the function you will run into this.

akiarie · 2024-04-09T08:57:05 1712653025

> If the annotation is basically just a copy of the function body why have any annotation at all?

This is a profound question. Xr0 annotations seem to be copies of the function bodies, but they aren't actually. What they are is a description of the function's safety semantics, the side effects and circumstances under which the function may be relied upon to exhibit well-defined behaviour (under the C Standard).

They only seem to be copies of the bodies because we're dealing with simple examples in the post here and in the tutorial, but we have a much more significant program ready (documenting) that we'll be sharing soon that shows the difference more clearly.

Xr0 is built in explicit reliance upon the powers of programmers to structure code in such a way that the annotations become simpler and simpler as you move up layers of function abstraction. As a simple example, think about what happens when you allocate memory and free it within a function, e.g.

        #include <stdlib.h>

        void
        foo() ~ [ return malloc(1); ]
        {
                return malloc(1);
        }

        void
        bar()
        {
                void *p = foo();
                free(p);
        }

        void
        baz()
        {
                bar(); /* completely safe */
        }

Note that `foo`'s annotation is like a copy of its body, but `bar`'s annotation is very simple, even though it interacts with `foo`, which is only safe if the pointer it returns is handled. So from `baz`'s perspective there are no safety semantics it has to worry about when it interacts with `bar`. The complexity that could lead to safety bugs in `foo` has been completely handled in `bar`.

We call this simplifying phenomenon denouement, and well-designed programs exhibit it very strongly. Our belief that programmers can design constructs that tend towards denouement as one moves down the call stack towards `main` is one of the basic reasons we're working on Xr0.

> It might not be clear to you, but if you want to allow any annotation with the same semantics as the body of the function you will run into this.

Yes, but we don't. Only well-defined behaviour according to the C Standard. This is the goal for Xr0 v1.0.0. After that the sky is the limit.

muldvarp · 2024-04-09T09:54:20 1712656460

> Xr0 annotations seem to be copies of the function bodies, but they aren't actually.

You're constantly dodging the question. If annotations aren't copies of function bodies (which is really the only sensible choice), you need to deduce whether a function body matches the annotations. "Structural similarity" between C and your annotation syntax won't make this easier. In fact, it is impossible to deduce whether two C functions are semantically equivalent even though C is extremely structurally similar to C (you could even argue that C and C are structurally identical).

> Our belief that programmers can design constructs that tend towards denouement as one moves down the call stack towards `main` is one of the basic reasons we're working on Xr0.

So your main criticisms of Rust will mostly also apply to Xr0: You need to rewrite existing C code to make annotations practical (and at that point you might as well reimplement it in Rust) and Xr0 will limit the constructs you will use in your code, because annotations will be impractical for code that isn't written for "denouement".

akiarie · 2024-04-09T10:16:42 1712657802

> If annotations aren't copies of function bodies (which is really the only sensible choice), you need to deduce whether a function body matches the annotations. "Structural similarity" between C and your annotation syntax won't make this easier. In fact, it is impossible to deduce whether two C functions are semantically equivalent even though C is extremely structurally similar to C (you could even argue that C and C are structurally identical).

Maybe I don't understand your point. I am not denying that we're deducing that the body matches the annotations. I'm simply saying that for the restricted case of safety semantics alone, this deduction can be done. Yes, it is impossible to deduce semantic equivalence of arbitrary functions. The question is whether it is possible to deduce that annotations capture the safety semantics of the body. If you have a concrete example it would be helpful here – but for the restricted concerns of safety, not for arbitrary constructs.

> So your main criticisms of Rust will mostly also apply to Xr0: You need to rewrite existing C code to make annotations practical (and at that point you might as well reimplement it in Rust) and Xr0 will limit the constructs you will use in your code, because annotations will be impractical for code that isn't written for "denouement".

Your quote omits the first sentence of the paragraph, which states that "well-designed programs exhibit [denouement] very strongly". Denouement in the sense we're referring to is an absolute theoretical necessity for any safe program, because at some level (of functional abstraction) the safety concerns must be handled, otherwise the program would have a safety vulnerability.

Xr0 definitely limits constructs, but our claim is that the limitation we're imposing is one that reflects the structure of all safe programs. The same cannot be said about Rust's ownership semantics, which limit an enormous number of simple, safe constructs. So the C programs to which one would be adding Xr0 annotations wouldn't need to be rewritten unless a bug has been discovered.

muldvarp · 2024-04-09T11:14:42 1712661282

> I'm simply saying that for the restricted case of safety semantics alone, this deduction can be done.

And I'm saying it can't be done, at least not in a fundamentally less tedious and hard way than Frama-C.

> Yes, it is impossible to deduce semantic equivalence of arbitrary functions. The question is whether it is possible to deduce that annotations capture the safety semantics of the body.

It isn't in general.

> If you have a concrete example it would be helpful here – but for the restricted concerns of safety, not for arbitrary constructs.

Safety is not a "restricted concern": You can for every property P easily construct a function that is safe if and only if property P holds. I'll be using Python because this example (which I like) requires arbitrary size integers. You could obviously also implement this in C, you'd just need to implement arbitrary size integers (or use a library that implements them):

    def collatz(x):
        if x % 2 == 0:
            return x // 2
        return x * 3 + 1

    def cycle(f, x):
        tortoise = f(x)
        hare = f(f(x))

        while tortoise != hare:
            tortoise = f(tortoise)
            hare = f(f(hare))

        return hare

    buffer = [0, 0, 0, 0, 0]
    x = int(input())
    if x >= 1:
        collatz_cycle_element = cycle(collatz, x)
        print(buffer[collatz_cycle_element]) # this is safe (or is it?)

This takes as an input an arbitrary positive integer x, searches for a cycle in the Collatz sequence beginning with that number and returns an arbitrary element of that cycle. It is conjectured that for every positive integer this cycle will be 4 -> 2 -> 1 -> 4 -> ... and thus this program is safe if and only if the Collatz conjecture holds.

Another example would be a C program that generates random planar graphs, computes their chromatic numbers and then collects statistics about them in an `int statistics[5]`:

    #include <stdio.h>

    void main() {
        int statistics[5] = {0};

        for(int i = 0; i < 1000; i++) {
            struct graph g = generate_random_planar_graph();
            int chromatic_number = compute_chromatic_number(g);
            statistics[chromatic_number] += 1;
        }
        
        printf("%i, %i, %i, %i", statistics[1], statistics[2], statistics[3], statistics[4]);
    }

The safety of this program requires you to prove that `generate_random_planar_graph` always returns a planar graph, that `compute_chromatic_number` correctly identifies the chromatic number and that the chromatic number of every planar graph is less than 5.

akiarie · 2024-04-09T11:55:05 1712663705

Thought provoking stuff! I really appreciate how much effort you've put into this.

However, the main reason why this argument is flawed is it omits the heart of the matter: the annotations. Xr0 empowers programmers to propagate safety semantics. A program that is only safe if the Collatz conjecture holds is (surprise) only safe if the Collatz conjecture holds. So in Xr0 the only requirement we would impose is that this program be augmented with an annotation that communicates that it is only safe if the Collatz conjecture is true.

So the flaw in the reasoning is we haven't claimed that Xr0 can prove arbitrary programs are safe. We've claimed that Xr0 can prove the correspondence between the safety semantics denoted in an annotation and a function body. Above there are no annotations given which would specify "this program is safe only if the Collatz conjecture is true". It shouldn't be hard to prove the correspondence between such an annotation and the program you've written, e.g.:

    def main(): ~ [
        buffer = [0, 0, 0, 0, 0]
        x = int(input())
        if x >= 1:
            collatz_cycle_element = cycle(collatz, x)
            print(buffer[collatz_cycle_element])
    ]

        buffer = [0, 0, 0, 0, 0]
        x = int(input())
        if x >= 1:
            collatz_cycle_element = cycle(collatz, x)
            print(buffer[collatz_cycle_element])

It's the principle of propagating the safety-determining factors of the function that we're stressing, not some kind of almighty power to judge that arbitrary constructs are safe or not.

muldvarp · 2024-04-09T12:03:59 1712664239

> It's the principle of propagating the safety-determining factors of the function that we're stressing

That's the main function. It's the function that gets called when the program starts. Where do you want to propagate the "safety-determining factors" to?

If I'm just supposed to read the annotations on the `main` function (and those annotations can basically just be a copy of it's body), then why do I need the annotations at all? I could also read the program to determine whether it's safe.

akiarie · 2024-04-09T12:20:32 1712665232

It's because we're dealing with an extreme example in which a program's safety depends on the resolution of an open problem.

If the program's safety depends on the semantics of C and how they've been used in a function, it will be possible to deal with all the conditions upon which a program could be unsafe (in the sense of the standard safety vulnerabilities – like those listed here [0]). Doing so would lead to denouement, so that the annotation to the main function would be empty (meaning none of those bugs can occur).

Programs that might be unsafe should not be verifiable.

[0]: https://alexgaynor.net/2020/may/27/science-on-memory-unsafet...

muldvarp · 2024-04-09T12:45:24 1712666724

> It's because we're dealing with an extreme example in which a program's safety depends on the resolution of an open problem.

It really isn't. The second example is what a reasonable C programmer would produce if you asked them to collect statistics about the chromatic number of random planar graphs. It also isn't based on any open problem. It's overall a very reasonable C program and a small one at that. It's safety also "depends on the semantics of C and how they've been used in a function", specifically on the fact that accessing an array is safe if and only if the index is in bounds. Out of bounds accesses to memory are probably the most common critical vulnerability in C programs, so it is essential that Xr0 prevents them.

I would love to see how you would write this reasonable program in a way that leads to "denouement".

> Programs that might be unsafe should not be verifiable.

Funnily enough, that's exactly what you criticize Rust for. It's not at all clear that writing programs with "denouement" is any less limiting than Rust. In fact, Frama-C (where you also try to write your code in such a way that the annotations become simpler) feels more limiting than Rust.

akiarie · 2024-04-09T13:46:15 1712670375

> It really isn't. The second example is ...

So will you admit that the first example has been sufficiently addressed? Because I was commenting on the problem involving the Collatz conjecture.

> It's safety also "depends on the semantics of C and how they've been used in a function", specifically on the fact that accessing an array is safe if and only if the index is in bounds. Out of bounds accesses to memory are probably the most common critical vulnerability in C programs, so it is essential that Xr0 prevents them.

True. As you said earlier:

> The safety of this program requires you to prove that `generate_random_planar_graph` always returns a planar graph, that `compute_chromatic_number` correctly identifies the chromatic number and that the chromatic number of every planar graph is less than 5.

This can obviously only be proven if we have the bodies of `generate_random_planar_graph` and `compute_chromatic_number`. Please provide these, and then I can attempt to answer. Because our whole point in Xr0 is that safety comes down to formalising interfaces – without interfaces for these functions we cannot investigate the safety of `main`.

muldvarp · 2024-04-09T14:05:20 1712671520

> So will you admit that the first example has been sufficiently addressed? Because I was commenting on the problem involving the Collatz conjecture.

In that this specific program is obviously designed to be extremely hard to prove correct? Yeah. In that most real programs will be easier to prove correct automatically? No.

> This can obviously only be proven if we have the bodies of `generate_random_planar_graph` and `compute_chromatic_number`. Please provide these, and then I can attempt to answer. Because our whole point in Xr0 is that safety comes down to formalising interfaces – without interfaces for these functions we cannot investigate the safety of `main`.

What would be the point of that exactly? We both know that those functions would contain loops and as such Xr0 would be completely unable to verify them. We also know that Xr0 isn't actually able to state that `generate_random_planar_graph` generates a planar graph (other than completely repeating the body in the annotations, I guess). We also both know that Xr0 would not be able to deduce that the chromatic number of planar graphs is less than five.

I already provided two examples of programs the safety of which Xr0 will not be able to verify. I think it's your turn to produce a non-trivial C program that Xr0 will be able to handle.

This discussion really isn't all that enlightening, sadly. Xr0 can't currently handle loops and you haven't produced a single technical argument as to why I should expect Xr0 to be able to handle non-trivial loops in the future.

You also haven't produced a single technical argument as to why Xr0 should succeed where Frama-C failed. The argument that Frama-C is more general while Xr0 is specialized on safety is refuted easily as it is easy to come up with C programs the safety of which depends on arbitrary properties (and I would argue that many if not most real world C programs fall into this category).

akiarie · 2024-04-09T10:20:38 1712658038

One final point – I cannot emphasise strongly enough that we aren't making this for Rust programmers. There's nothing wrong with loving a programming language that has changed the world and what is possible in systems programming. Rust is an advance for systems programming, and it's made incredible strides.

But some programmers (like myself) love working in C, and do not like Rust's restrictions. For such programmers even if the question was one of re-implementation (which we don't think it is) Rust would remain undesirable.

akiarie · 2024-04-06T04:11:55 1712376715

"Simpler" is the word that we care about. Although Rust is an advance on C in many significant ways, what it loses is the simplicity. One of our goals is to offer three same advances (and more) but without compromising on simplicity.

akiarie · 2024-04-06T04:09:04 1712376544

Interesting points. Because it's easy to do the type checking at compile time for function prototypes (though we haven't implemented it yet) we haven't felt under pressure to use the `func(void)` notation, which has always felt somewhat uncouth to me.

The verbosity of the annotations is the most jarring thing at the beginning. However, do you find any of them unclear? Because we're optimising for similarity to C before terseness.

moody__ · 2024-04-06T15:54:29 1712418869

I don't find them unclear at all, after reading the introduction I felt pretty comfortable with everything that was going on in the examples. I think this type of approach you're going for can work depending on how it grows to cover more complex cases like loops. I'll be keeping tabs on how your work progresses.

akiarie · 2024-04-06T04:04:37 1712376277

Since we are not academics, I don't see why we're under obligation to do a comprehensive literature review, especially early on. We aren't claiming to have invented these techniques. It should be obvious that Xr0 involves very little in the way of new ideas. Our primary reference has been the work of Dijkstra.

What is significant about Xr0 is how it applies these old ideas. Xr0 is designed for programmers rather than mathematicians, so to us syntax matters a great, great deal.

muldvarp · 2024-04-06T21:30:48 1712439048

> Since we are not academics, I don't see why we're under obligation to do a comprehensive literature review, especially early on.

I don't think you're under any obligation, I just think that not discussing prior art is a bad look.

> What is significant about Xr0 is how it applies these old ideas.

I don't see anything new about the application of these well-known ideas either. Especially nothing that could explain why Xr0 would be widely used, while Frama-C is extremely niche.

> Xr0 is designed for programmers rather than mathematicians, so to us syntax matters a great, great deal.

Proving (the correctness of code) is an inherently mathematical task. So if by "mathematician" you mean someone who does mathematics, then all of your users will be mathematicians.

akiarie · 2024-04-08T10:33:46 1712572426

> I don't think you're under any obligation, I just think that not discussing prior art is a bad look.

See [0] where our reliance on Dijkstra is transparent.

> I don't see anything new about the application of these well-known ideas either. Especially nothing that could explain why Xr0 would be widely used, while Frama-C is extremely niche.

Begin with the fact that Xr0 is written in C and feels like C. We're building it so that C programmers will be able to use it without explicitly learning about formal verification.

> Proving (the correctness of code) is an inherently mathematical task. So if by "mathematician" you mean someone who does mathematics, then all of your users will be mathematicians.

Absolutely true. But what I mean by "mathematician" is a self-conscious mathematician, or at least someone with a background in formal mathematics.

(Edit: forgot to add link.)

[0]: https://xr0.dev/dijkstra-paradox

muldvarp · 2024-04-08T13:10:59 1712581859

> See [0] where our reliance on Dijkstra is transparent.

Mentioning that you're vaguely inspired by Dijkstra is not the same as discussing prior art. And it's not like there isn't enough prior art about "safe C" out there [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].

> Begin with the fact that Xr0 is written in C and feels like C. We're building it so that C programmers will be able to use it without explicitly learning about formal verification.

Most of the people interested in formally verifying their C code will already know about formal verification and absolutely don't care about whether Xr0 is written in C or not.

> But what I mean by "mathematician" is a self-conscious mathematician, or at least someone with a background in formal mathematics.

I'm doubtful Xr0 will be useful for someone without a background in formal verification.

To me, the whole project seems like vapourware. It currently works on an extremely limited subset of C. In fact, this subset is so limited that from a computation theory standpoint, the current capabilities of Xr0 are trivial. Supporting while loops (or general recursion) on the other hand is impossible (from a computation theory standpoint).

Even on this small subset of C it took me five minutes to get an out of memory error and a further five minutes to craft a C program with a double free that passes verification [10].

All this combined with a website that makes great promises about how awesome Xr0 is (or rather going to be) and handwavy explanations about how easy adding the missing features will be that don't hold up to any scrutiny. Dozens of similar projects exist, few of them work at all on non-trivial C programs and all of them require herculean effort (and are thus only used for safety critical software) to correctly annotate C code (often this includes rewriting parts of the code to make the annotations simpler or unnecessary). There is no discussion at all how Xr0 is going to solve the problems that killed these projects.

[0]: https://link.springer.com/chapter/10.1007/978-3-642-03359-9_...

[1]: https://link.springer.com/chapter/10.1007/978-3-540-71316-6_...

[2]: https://link.springer.com/chapter/10.1007/978-3-642-33826-7_...

[3]: https://ieeexplore.ieee.org/document/8543387

[4]: https://dl.acm.org/doi/10.1145/2594291.2594296

[5]: https://link.springer.com/chapter/10.1007/978-3-642-32347-8_...

[6]: https://link.springer.com/chapter/10.1007/978-3-642-20398-5_...

[7]: https://dl.acm.org/doi/10.1145/503272.503286

[8]: https://www.sciencedirect.com/science/article/pii/S157106611...

[9]: https://dl.acm.org/doi/abs/10.1145/3453483.3454036

[10]: https://pastebin.com/raw/JbzMj2Bg

akiarie · 2024-04-09T09:20:45 1712654445

> Even on this small subset of C it took me five minutes to get an out of memory error and a further five minutes to craft a C program with a double free that passes verification [10].

Nice work with the program. However, the only reason it verifies is we haven't yet implemented `||`, so in the parser the phrase `if (x || !y)` is being interpreted as `if (x)` (see [0] for the relevant production). This may seem like another frustrating indiction of how "extremely limited" the subset of C that Xr0 works on is, and indeed it is. But the most important point is there is no logical flaw in Xr0's design here, and no reason whatever that an error of this kind wouldn't be detected. [1] is an equivalent program that shows the logical cogency of our approach (I know it's ugly!).

> I'm doubtful Xr0 will be useful for someone without a background in formal verification.

> Most of the people interested in formally verifying their C code will already know about formal verification and absolutely don't care about whether Xr0 is written in C or not.

We aren't making Xr0 for people (self-consciously) interested in formal verification, but for C programmers interested in safety of the kind that Rust provides. And I can tell you that C programmers prefer their tools in C.

> All this combined with a website that makes great promises about how awesome Xr0 is (or rather going to be) and handwavy explanations about how easy adding the missing features will be that don't hold up to any scrutiny. Dozens of similar projects exist, few of them work at all on non-trivial C programs and all of them require herculean effort (and are thus only used for safety critical software) to correctly annotate C code (often this includes rewriting parts of the code to make the annotations simpler or unnecessary). There is no discussion at all how Xr0 is going to solve the problems that killed these projects.

Fair enough. The proof is in the pudding. Give us a few months. But understand that we have limited resources (this is an open source project and we're working on it part-time). We could either stop and make long-form arguments with full bibliographies or focus on building Xr0 into what we say it's going to be.

[0]: https://git.sr.ht/~lbnz/xr0/tree/master/item/src/ast/gram.y#...

[1]: https://pastebin.com/raw/7EpcUVzv

muldvarp · 2024-04-09T09:45:53 1712655953

> However, the only reason it verifies is we haven't yet implemented `||`

You should error on constructs you don't yet support. Not doing so makes it very difficult to ascertain how well Xr0 works.

> We could either stop and make long-form arguments with full bibliographies or focus on building Xr0 into what we say it's going to be.

You didn't do either though. You wrote long-form arguments about how awesome Xr0 is (or going to be) and how groundbreaking the idea of "interface formality" is. If instead you actually produced a working prototype (or even any technical argument why such a prototype is feasible) I'd be way less doubtful.

akiarie · 2024-04-09T10:32:39 1712658759

> You should error on constructs you don't yet support. Not doing so makes it very difficult to ascertain how well Xr0 works.

You're right. We should. (We will be adding this as we are able.)

> You didn't do either though. You wrote long-form arguments about how awesome Xr0 is (or going to be) and how groundbreaking the idea of "interface formality" is. If instead you actually produced a working prototype (or even any technical argument why such a prototype is feasible) I'd be way less doubtful.

A 7-point blog post is hardly "long-form" :)

But more to the point, Rust has already achieved this "interface formality", as we state in the second paragraph of this "long-form" piece. That achievement on the part of Rust is indeed groundbreaking. Is it so crazy to claim that this can be replicated for C, without encumbering it with ownership considerations that have no relationship to interface formality?

Fair enough. We have a much more generous definition of "working prototype" than you do. Hopefully one day soon we'll have something that clears yours.

muldvarp · 2024-04-09T11:24:37 1712661877

> Is it so crazy to claim that this can be replicated for C, without encumbering it with ownership considerations that have no relationship to interface formality?

You're not replicating what Rust does, you're replicating what Frama-C does.

Then again, even just replicating what Rust does would be extremely difficult (if not impossible) given the pervasiveness of undefined behavior in C.

> Fair enough. We have a much more generous definition of "working prototype" than you do.

How many useful C programs do you know that require no loops and no binary operators?

akiarie · 2024-04-09T11:38:33 1712662713

> Xr0 1.0.0 will enable programming in C with no undefined behaviour, but for now it's useful for verifying sections of programs.

Literally on the website. The purpose of the prototype is to show the feasibility of the approach we've taken, not to work on whole programs.

muldvarp · 2024-04-09T11:48:32 1712663312

> The purpose of the prototype is to show the feasibility of the approach we've taken

But it doesn't do that. You haven't shown that this approach is able to deal with loops and loops are pretty fundamental to every C program.

Heck, you haven't technically shown that this approach can deal with binary operators. I would love to turn Xr0 into an improptu SAT solver, but without `&&` and `||` that sadly isn't possible right now.

akiarie · 2024-04-09T13:48:51 1712670531

You missed the `!=`, `==` (in many instances), bitwise operators, recursion and `goto`. (And much, much more.)

Xr0 is a work in progress, and we'll get there step-by-step.

muldvarp · 2024-04-09T15:09:54 1712675394

Yes, the subset of C it currently supports is so small I'm really puzzled what you think this prototype is proving. That you can parse C?

akiarie · 2024-04-09T15:42:07 1712677327

It's even less impressive than that. That we can parse a subset of C. That's all there is to see here.

muldvarp · 2024-04-09T15:53:06 1712677986

Then why not write a blog post "We can parse C" instead of "Xr0 makes C safe (it doesn't actually and we are unable or unwilling to go into details and also please don't ask us about Frama-C)"?

akiarie · 2024-04-11T12:28:35 1712838515

Shouldn't it be called "We can't parse C" since we can only parse a trivial subset of it?

akiarie · 2024-04-05T09:15:03 1712308503

Good spot. We hadn't enabled `else` last week when we posted, but we've just fixed this [0].

[0] https://github.com/xr0-org/xr0/releases/tag/v0.15.0

gpderetta · 2024-04-05T09:38:08 1712309888

Very nice. I was worried that conditionals would confuse it. But this passes:

void x(int f) { int * p; p = malloc(1); if(f) { free(p); } if(!f) { free(p); } }

And this doesn't, as expected:

void x(int f) { int * p; p = malloc(1); if(f) { free(p); } if(f) { free(p); } }

edit: another assertion failure:

  void x(int f) {
   int * p;
   p = malloc(1);
   if(f) {
      free(p);
   }
   f = !f;
   if(f) {
      free(p);
   }
  }

akiarie · 2024-04-05T11:11:42 1712315502

This is another good spot. Xr0's logical/arithmetic analysis is currently limited because we've been more focused on getting the structures of the language working as a whole. So you'll get more of these asserts if you try using `&&`, ternaries, and so on.

We've put a fix in for this on a dev branch for now though: https://github.com/xr0-org/xr0/issues/43

gpderetta · 2024-04-05T12:05:29 1712318729

This seems very hard in general to fix:

  void x(int a, int b) {
   int * p;
   p = malloc(1);
   if(a) {
      free(p);
   }
   if(b) {
      free(p);
   }
  }

It is possible that the program always guarantees that a!=b, but it would be hard to prove even with whole program analysis. Are you planning additional annotations or this is just not expected to work?

mst · 2024-04-05T14:34:25 1712327665

I can't speak for the authors but I think I'd file that one under "and if you write this, ideally a small gnome exits the back of the computer and hits the developer with a mallet."

I'd be perfectly happy refactoring that so I could have an 'a or b' test (I'm assuming $something means that x(0, 0) not free-ing is correct in real code because nitpicking an illustrative example would be silly).

Though note that I'm sufficiently bad at tracking things right when writing C that I tend to spend most of my debugging time figuring out what stupid mistake I made to cause the current segfault - so I'm probably more willing than average to write code differently if it helps me avoid that ;)

akiarie · 2024-04-05T18:22:47 1712341367

This is another good find. We haven't yet implemented the `!=` operator (or most of the binary operators).

Incidentally, you will notice that Xr0 rejects the program as is, which is correct (because of the double free). It can, however, accept code like the following:

  void
  foo(int a, int b)
  {
          int *p;
          p = malloc(1);
          if (a) {
                  if (!b) {
                          free(p);
                  }
          }
          if (b) {
                  free(p);
          }
          if (!a) {
                  if (!b) {
                          free(p);
                  }
          }
  }

I say "can" because we only have this working on a development branch [0], and there are still some issues we're resolving on this, hopefully by next week.

[0]: https://github.com/xr0-org/xr0/tree/fix/nested-branching

gpderetta · 2024-04-05T20:27:15 1712348835

Ideally you would express the precondition that a and b are mutually exclusive like this:

void x(int a, int b) { if(! (a != b)) abort();

   int * p;
   p = malloc(1);

   
   if(a) {
      free(p);
   }
   if(b) {
      free(p);
   }
  }

And the program would be accepted. I'm on my phone and can't verify it though.

Great stuff anyway!

Edit: is the algorithm documented anyway? And there is any reason why you guys are not implementing it as a clang/GCC plugin and using the new C and C++ annotation syntax?

akiarie · 2024-04-05T20:49:49 1712350189

Yes, the code example you've given is exactly in the direction we're thinking. (Edit: thought it can't be verified yet.)

We haven't documented the algorithm because it's not the unique part of what we're doing. It works in a very simple-minded way, analysing possible states.

The interesting part is the insight that function interfaces are the boundary along which verification should take place. Applying this idea consistently we escape the combinatorial explosion which tends to plague efforts in this area, because every function can be verified by itself. Notably, Dafny recognised this point before us, and the earliest clear statement we can find regarding it is in Dijkstra's "Notes on Structured Programming", in his chapter "On a Program Model". You may find that illuminating read.

I'm no expert on Clang/GCC and the new annotation syntax for C and C++, but from what I've seen it's not general enough for what we're trying to achieve in Xr0. The safety semantics which are expressed in annotations in Xr0 (see [0] for an example) get very detailed.

[0]: https://xr0.dev/learn#initialisation-of-memory-and-the-clump

gpderetta · 2024-04-05T21:40:08 1712353208

Will take a look at your reference. Stroustup and Sutter have tried annotating every function, but their declarative function wasn't expressive enough. Adding arbitrarily expressions might help though.

Re annotations, I meant the attribute syntax. You can have arbitrary custom syntax in it:

  void foo(int *p) [[vx0::setup(p = .clump(sizeof(int))), vx0::body(*p = 0)]] {...}

akiarie · 2024-04-05T08:13:43 1712304823

A lot of truth to this. (Speaking as one of the creators of Xr0).

akiarie · 2024-03-30T05:36:49 1711777009

> Already from the meat of the article, it seems that this solution is setting you up for using a complex language to specify the interface, which immediately begs the question of how you can be sure the interface is correctly specified.

This is insightful. That is exactly what is happening, but our sense is that this language will be no more complicated than C itself, and will to a large extent parallel it.

The question is really what is more effective, working under one set of constraints imposed universally (like the ownership semantics in Rust), or taking upon yourself the responsibility of defining such constraints within each program. In this way what we're doing is an application of the "trust the programmer" design philosophy; trust him, not only to know what he's doing, but to be able to express it (in the interface specification).

> Sure, in a three-line function, it's kind of obvious (if you know to be aware of this possibility!), but in a function where the pointer and the use are separated by hundreds of lines of code, it's easy to miss that there is a requirement that the hash table not be resized between the two calls. With Rust, the fact that &mut for V means that I can't touch the hash table until I'm finished with V makes it a compiler error.

It can seem as though the example we provide is contrived (because of course it is), but it is not mere contrivance. The point we stress the most is not that the bug is obvious if there are only three lines, but that the presence or absence of a bug is impossible to see even with a single function call whose interface is informal. So what is remarkable is not that we can write three lines of code that have or do not have an obvious error, but that by the mere fact of the interfaces being informal, a three-lined function can be impossible to analyse by itself.

> The ability to mutate an object necessarily implies the ability to mutate the state of an object to break guarantees. While there are definitely patterns where multiple mutable references to the same object can be safe, it is not inherently safe, and proving safety is far more challenging because you can't as easily rely on pre/postconditions for lemmas.

This doesn't answer the question. "it is not inherently safe" does not mean "it is inherently unsafe", it only means "it may be unsafe". So our point is that Rust precludes many kinds of programming that can be safe in relevant contexts, and this is what is so limiting.

muldvarp · 2024-03-30T17:01:15 1711818075

> That is exactly what is happening, but our sense is that this language will be no more complicated than C itself, and will to a large extent parallel it.

The point of interfaces is usually to hide underlying complexity. What's the point of interfaces that are just as complicated as their implementation?

> The question is really what is more effective, working under one set of constraints imposed universally (like the ownership semantics in Rust), or taking upon yourself the responsibility of defining such constraints within each program.

One of the things I personally like about Rust is that its rules are very simple and still capture 95% of all situations I find myself in. Adding the complexity of having to write your own formal specifications for the remaining 5% doesn't seem like it's worth the effort.

> So our point is that Rust precludes many kinds of programming that can be safe in relevant contexts, and this is what is so limiting.

Xr0 will also either 1) reject programs that are safe or 2) accept programs that are unsafe. An (computable) algorithm that decides for every C program whether it is safe or not is mathematically impossible. Start implementing loops and you'll quickly see why the idea of provable correctness without any restrictions is doomed to fail.

Not that similar projects don't exist: CBMC, Frama-C, ... How does Xr0 differ from those existing projects?

akiarie · 2024-04-08T10:41:39 1712572899

> The point of interfaces is usually to hide underlying complexity. What's the point of interfaces that are just as complicated as their implementation?

This is a pivotal question and our answer to it exposes one of our basic assumptions. Interfaces do hide underlying complexity, but that is not all that they do. They define our interaction with abstractions, which not only hide complexity, but reduce it by "abstracting". So what we find in practice is that the interfaces are not as complicated as the code, because all they define is what the higher levels of abstraction need to know to interact abstractly with the lower levels.

> One of the things I personally like about Rust is that its rules are very simple and still capture 95% of all situations I find myself in. Adding the complexity of having to write your own formal specifications for the remaining 5% doesn't seem like it's worth the effort.

Well this is great. If you like Rust, there's nothing wrong with that. But many C programmers don't feel this way.

> Xr0 will also either 1) reject programs that are safe or 2) accept programs that are unsafe. An (computable) algorithm that decides for every C program whether it is safe or not is mathematically impossible. Start implementing loops and you'll quickly see why the idea of provable correctness without any restrictions is doomed to fail.

Insightful again! This is all true. Note, however, that we aren't claiming not to impose restrictions. We're claiming that the restrictions should concentrate at a different place than where they concentrate in Rust. They should orbit around the interfaces. Xr0 is certainly imposing restrictions there.

Moreover, we are in no way suggesting/claiming that we can deduce the safety of arbitrary C programs, which is the whole reason the annotations are required. We're claiming that we can help programmers who already understand why their code is safe express these reasons in relatively-simple annotations, and use these to verify their reasoning.

Edit: typo.

muldvarp · 2024-04-08T16:42:07 1712594527

> We're claiming that the restrictions should concentrate at a different place than where they concentrate in Rust. They should orbit around the interfaces.

Rust's restrictions can also be interpreted as restrictions on interfaces. A function taking (a non-`Copy` type) by value is a function that Xr0 would annotate as `free`ing that variable, for example.

> Moreover, we are in no way suggesting/claiming that we can deduce the safety of arbitrary C programs, which is the whole reason the annotations are required.

You are claiming that Xr0 will deduce that a function matches its annotations. If those annotations are "this function has property P", you'll have to deduce whether the function actually has property P. This can be augmented by inline annotations (i.e. annotations for code, not interfaces) and those will likely become necessary once you implement loops. But at that point you'll have reinvented Frama-C and you'll realize why Frama-C isn't used outside of safety-critical software: It's both hard and exhausting to use.

akiarie · 2024-04-09T08:33:45 1712651625

> Rust's restrictions can also be interpreted as restrictions on interfaces. A function taking (a non-`Copy` type) by value is a function that Xr0 would annotate as `free`ing that variable, for example.

Yes they can, and we said this in the article, "The remarkable safety guarantees that Rust provides are consequent to something much deeper than its ownership rules: Rust empowers programmers to formalise and verify the semantics of pointer usage at the interface boundary between functions. If interface formality is the fundamental determinant of safety, systems language design should focus directly on it rather than on any other criterion, including ownership."

The point we're stressing, though, is that the restrictions in Rust do not focus on the interfaces, but on the ownership principle. So we're advocating for language design to ignore ownership considerations and deal with the interfaces as the important point.

> You are claiming that Xr0 will deduce that a function matches its annotations. If those annotations are "this function has property P", you'll have to deduce whether the function actually has property P. This can be augmented by inline annotations (i.e. annotations for code, not interfaces) and those will likely become necessary once you implement loops. But at that point you'll have reinvented Frama-C and you'll realize why Frama-C isn't used outside of safety-critical software: It's both hard and exhausting to use.

Again, I agree with most of this (except the cynicism). We have done some serious experiments with implementing loops and immediately saw that we would need to annotate the loops also (with at least some kind of invariants).

The comparison to Rust is actually a whole lot more helpful than that to Frama-C. Why is Rust able to deliver safety guarantees without becoming "hard and exhausting"? Our argument is that it is the interface formality issue that really determines safety. If this is so, why should it be "hard and exhausting" to be empowered with the ability to be formal at the interfaces concerning the things that influence safety?

The fact that we're using annotations does not mean that Xr0 is equivalent to Frama-C. To point to one important difference, we're focused on securing safety, whereas Frama-C is a general purpose tool for arbitrary correctness considerations. If Xr0 does grow to deal with such arbitrary correctness concerns, it is possible that it will become "hard and exhausting" like Frama-C, because of the inherent difficulty of complete verification. But while we're focused on safety alone, there's no reason why we can't offer something that is at most of the same difficulty level of working in Rust. Xr0's annotations are not "this function has property P" where P is arbitrary. They are "this function has well-defined behaviour according to the C standard under these circumstances". That is a much more limited kind of condition to validate.

muldvarp · 2024-04-09T09:29:34 1712654974

> Our argument is that it is the interface formality issue that really determines safety.

Frama-C also does "interface formality". Chapter 3 (i.e. the first chapter after the one telling you how to install Frama-C) of the Frama-C WP tutorial is literally called "function contracts" and explains how to use ACSL to annotate pre- and post-conditions on functions.

> The comparison to Rust is actually a whole lot more helpful than that to Frama-C.

Your approach is way more similar to Frama-C than to Rust. The difference between Rust and Frama-C is that Frama-C allows you to specify arbitrary constraints, while Rust only allows you to specify very few and specific constraints. That may feel limiting (as you have pointed out) but that's also what makes it possible to deduce whether the function actually satisfies these constraints without having to annotate every loop (which is tedious for simple loops and hard for complex ones).

> If this is so, why should it be "hard and exhausting" to be empowered with the ability to be formal at the interfaces concerning the things that influence safety?

Annotating functions can be hard if the pre- or post-conditions of that function are hard to formalize (e.g. "this function returns the first element of a loop in the given linked list or null if no loop exists"), but having to annotate interfaces isn't what makes Frama-C hard to use. Frama-C is hard to use because it is hard to get it to accept that the function actually satisfies the function contract and this is a problem you will also face.

> To point to one important difference, we're focused on securing safety, whereas Frama-C is a general purpose tool for arbitrary correctness considerations. [...] They are "this function has well-defined behaviour according to the C standard under these circumstances". That is a much more limited kind of condition to validate.

Undefined behavior is extremely pervasive in C. If one of Xr0's goals is to verify that a C program does not exhibit undefined behavior (and it has to, given that undefined behavior is anything but safe) then you'll have to verify properties such as "this addition doesn't overflow", "this value is never INT_MIN", "this value is never negative", "this value is never zero", "a is less than b", ...

There will barely be any difference between verifying that a function has "well-defined behavior under these circumstances" and full verification.

naasking · 2024-04-04T23:50:30 1712274630

> The point of interfaces is usually to hide underlying complexity

Specifications are typically less complex than implementations, so it won't be just as complex. One purpose of specifications is to encapsulate, but another purpose can be to specify a contract/protocol.

akiarie · 2024-03-29T19:32:14 1711740734

The answer is simple: C isn't going anywhere. Some programmers are satisfied with Rust, and let them enjoy themselves, for as we said, Rust is a great advance for systems programming. But many C programmers, in particular, are unsatisfied with its complexity, as are we, so we're building Xr0 for them (and us).