> I think this is a great way of presenting type features that interact! props to the author for cooking it up
It's a great way to present it, but like Scala's infamous "Periodic Table of Dispatch Operators", stuff like this is rather off-putting. Why use crude sigils when you could just as well use easily understood keywords (like ref, out, unsafe, etc. in C#)?
It's nice to have very common features be syntactically lightweight, and sigils are incredibly lightweight compared to two- or three-character keywords. Rust code uses these pointers in all but the most trivial code, so they'll be showing up a lot; in contrast, I suspect as a non-C# programmer that ref and out are not as widely used in most C# codebases as ~ and & are in Rust. There are also not that many sigils, all told—three sigils for pointers plus a few other operators that appear in this table.
Additionally, this table is different from the tables of Scala dispatch operators, Perl operators, &c because of its regularity—it has two axes and a good deal of the information presented is a straightforward function of a cell's position in the table[1]. For example, the entry in the Owned column and the String row is going to be an... owned string, which is going to be prefixed with ~ (like every cell in the Owned column) and have a base type of str (like every cell in the String row). This is a far cry from the table of Scala's dispatch operators, in which there's no consistent indication of what a given operator will necessarily do aside from a general grouping of like operators.
[1] (edit): That isn't to say that ALL the information is a (predictable) function of a cell's position in the table. Some of that irregularity comes from arguably expected consequences of Rust semantics (e.g. of course you can't have a bare string, because you can only have bare values whose sizes are known in advance, and you don't know how long a string will be) while some is genuinely arbitrary, like the syntactic sugar for functions.
Re: the dispatch periodic table. This is from a library in Scala (a web client called dispatch) that was someone's first project in Scala and went very overboard on the use of operators. It is generally considered bad design.
Just to avoid any misunderstandings: The linked page is a (very nicely done) joke. These are not Scala operators, but (by now retired) operators of a particular third party library for Http dispatch.
That's a shame, I happen to like the use of sigils to denote pointer types, to my eyes it makes code a lot easier to read.
Out of curiosity, why are they making the change? And will the `let x = box Thing` syntax be an optional addition to the language, or will it fully replace the old `let x = ~Thing` syntax?
I think the idea is to make heap allocation taxing to type. Basically, the sigil is too easy to type, so it's easy for an application to just allocate a whup-ass amount of memory. This way it might force you to think about each allocation.
There are some trends. Types to the right are "subsets" of types to the left; e.g. a &mut T can coerce to a &T, similarly, you can box a T into a ~T, or take references to get an &mut T or &T.
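These coercions can be sketched in today's Rust, where the thread's `~T` is spelled `Box<T>` (the names `read`, `m`, and `b` here are just illustrative):

```rust
// Only needs a shared reference to its argument.
fn read(t: &i32) -> i32 {
    *t
}

fn main() {
    let mut x = 5;
    let m: &mut i32 = &mut x; // exclusive, mutable reference
    assert_eq!(read(m), 5);   // &mut i32 coerces to &i32 at the call site
    let b: Box<i32> = Box::new(x); // "box a T into a ~T": heap-allocate
    assert_eq!(read(&b), 5);       // Box<i32> dereferences to &i32
}
```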
Periodicity originally referred to the trends being cyclical with respect to atomic mass; we now know it's with respect to atomic number, and that the period length increases (2, 8, 18, much more), but nonetheless across multiple rows there's an increasing valence. Hence there is a repeating aspect, a "period", driven by a connection between row n and row n+1 (row n+1 immediately follows row n in atomic number). There's no such connection between row n and row n+1 in this table of Rust types. The vertical organization is completely arbitrary, there is no ascending value across the table, and therefore it's not periodic.
That was my direct inspiration, though I suck at graphic design. Fortunately the periodic table of Rust types is much more concise than that of Perl 6 operators...
None of these are competing in the same space as Rust: fine-grained memory control (read: perf on par w/ C++) with zero-cost abstractions for safety.
You can certainly argue that all of those languages are as safe as Rust, with the lack of nulls and explicit mutability (taken to a new level in Haskell), but you can't say they expose a memory model that actually reflects the underlying system (and is as tunable) to the extent that Rust's does.
Rust implements a kind of region typing or linear logic, too, right? That's significantly beyond anything you'll see in garden variety Haskell/ML (though you can embed linear logic in the Haskell type-class machinery with a final encoding [0]).
Yes to both. Lifetimes are basically regions, and unique types are basically affine types. (We usually stick to the C++ terminology, though, for familiarity's sake.)
This is true. However, Rust is intended to be a systems language, which among other things means no, or at most optional, garbage collection and types for unboxed values.
Like Java generics and subtyping, any given part may be simple on its own, but the combination is not.
Rust tracks lifetimes for stack- and dynamically-allocated values as part of the type system; hence "owned" pointers. Which are horrible by themselves; hence "borrowed" pointers and the resulting lifetime complications. Rust includes traits, which are similar to Haskell type classes and are very nice. However, they come with a heapin' helping of their own complexity.
And so forth. Rust is aiming for a sweet spot somewhere between a relatively-simple Hindley-Milner[-Damas] parametric polymorphism and full-on dependent typing. So far, I think it hits a pretty nice local optimum.
I'd love it if a magical syntax fairy would make Rust as easy to read as Python, but I think the complexity fits the domain. Safe manual memory handling is probably gonna look weird because of the guarantees it must make.
I'm a massive Nimrod fan, but those safety guarantees are void once you use manual memory management. What Rust gives you is safety with manual memory management. Nimrod's GC and type system are both really powerful, but Araq and I were chatting about Rust's type system on irc for a reason...
I'm probably in the minority here, but I think the answer is "so use a GC". The number of people on earth that I would trust with my life to correctly manage memory in a complex application is very small.
You don't need to trust people using Rust, you just trust the type system.
Admittedly, the memory safety features of the type system haven't been formally verified, but this is a goal, and there is a rather large piece of in-source documentation: http://static.rust-lang.org/doc/master/rustc/middle/borrowck... (I haven't read it, so I have no idea if it will make any sense to someone who doesn't know Rust.)
Rust's type system statically guarantees that memory will never be accessed after it is freed, so no segfaults, no use-after-frees, no double-frees, no iterator invalidation, etc. It also statically eliminates data races between concurrent tasks.
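The data-race half can be sketched with today's `std::thread` (the thread predates that API, so take the exact names as a modern stand-in): ownership moves into the spawned task, so the parent simply cannot race on the data.

```rust
use std::thread;

// Illustrative helper: sum a vector on another thread. The `move`
// closure takes ownership of `v`, so the caller can no longer touch it;
// any use of `v` after the spawn is a compile error, not a data race.
fn sum_in_task(v: Vec<i32>) -> i32 {
    thread::spawn(move || v.iter().sum()).join().unwrap()
}

fn main() {
    assert_eq!(sum_in_task(vec![1, 2, 3]), 6);
}
```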
In the past Rust has attempted to use the type system to prevent memory leaks in certain cases, but the features that attempted to do so were deemed overly restrictive to use for practical purposes. Nowadays I'm sure it's possible to leak memory if you try. Honestly I've never heard of a Turing-complete language whose type system can provide such a guarantee.
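For instance, in today's Rust you can leak quite deliberately from safe code (`Box::leak` is a modern API, used here only to illustrate that leaks are permitted):

```rust
// Safe, on-purpose leak: the destructor never runs, and the allocation
// stays alive until the process exits.
fn leak_vec() -> &'static mut Vec<i32> {
    Box::leak(Box::new(vec![1, 2, 3]))
}

fn main() {
    assert_eq!(leak_vec().len(), 3);
}
```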
Doesn't it make a stronger guarantee, that you cannot cause an invalid dereference? In addition to what you mentioned, this would also cover bounds-checking, trying to dereference a pointer that was never allocated, etc.
Also does it enforce that memory is consistently used as a single type? Can you allocate a byte array and then cast it to an appropriately sized array of integers?
> Doesn't it make a stronger guarantee, that you cannot cause an invalid dereference?
I'm not knowledgeable enough to answer that question precisely.
However, I can tell you that Rust's type system is not strong enough to obviate bounds checking. I hear you'd need something like Idris' dependent types for that. Rust bounds checks arrays dynamically (there are `unsafe` functions available to index an array without bounds checks), and avoids bounds checking on arithmetic by guaranteeing that fixed-sized integers will wrap on overflow (which is gross, but might be changed to fail-on-overflow if it doesn't hurt performance too much).
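A small sketch of both points; note that today's Rust spells wrapping explicitly and checks overflow in debug builds, so treat these method names as modern equivalents of the behavior described above:

```rust
fn main() {
    let a = [10, 20, 30];
    // Dynamic bounds check: `a[9]` would panic at runtime, while `get`
    // is the checked, non-panicking lookup.
    assert_eq!(a.get(2), Some(&30));
    assert_eq!(a.get(9), None);
    // Wrap-on-overflow: 200 + 100 = 300, which wraps to 44 in a u8.
    assert_eq!(200u8.wrapping_add(100), 44);
    // The fail-on-overflow alternative, as a checked operation:
    assert_eq!(200u8.checked_add(100), None);
}
```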
> Can you allocate a byte array and then cast it to an appropriately sized array of integers?
You can't do this in safe code, but you can in `unsafe` code via the `std::cast::transmute` function, which does still enforce that both types are the same size.
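A sketch using the modern path `std::mem::transmute` (the thread's `std::cast::transmute` was later moved there); because of the size check, transmuting `[u8; 4]` to, say, `u64` would not even compile:

```rust
// Reinterpret four bytes as a u32. `transmute` is a compile-time error
// if source and destination differ in size, but it says nothing about
// the meaning of the bits, hence `unsafe`.
fn bytes_to_u32(b: [u8; 4]) -> u32 {
    unsafe { std::mem::transmute(b) }
}

fn main() {
    let n = bytes_to_u32([1, 0, 0, 0]);
    // The value depends on endianness: 1 on little-endian machines,
    // 0x01000000 on big-endian ones.
    assert!(n == 1 || n == 0x0100_0000);
}
```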
> However, I can tell you that Rust's type system is not strong enough to obviate bounds checking.
That's a bummer. It seems doable, but maybe it is too complex.
> avoids bounds checking on arithmetic by guaranteeing that fixed-sized integers will wrap on overflow (which is gross, but might be changed to fail-on-overflow if it doesn't hurt performance too much).
That would be nice as a default, but I'd be afraid it would hurt performance too much for numerical code. You'd definitely want a way to express that arithmetic should be allowed to overflow (i.e., one that omits the check).
On a related note, one thing that is sorely missing in C and C++ is a way to test whether a value will overflow when converted to a different type (I wrote a blog article about this point: http://blog.reverberate.org/2012/12/testing-for-integer-over...)
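For contrast, later Rust grew exactly this kind of checked conversion in its standard library (`TryFrom` postdates this thread, so it's shown as a modern aside):

```rust
use std::convert::TryFrom;

// Does `x` survive a narrowing conversion to i32 without overflow?
// This is the test that C and C++ force you to hand-roll.
fn fits_in_i32(x: i64) -> bool {
    i32::try_from(x).is_ok()
}

fn main() {
    assert!(!fits_in_i32(3_000_000_000)); // > i32::MAX, would overflow
    assert!(fits_in_i32(42));
}
```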
> That's a bummer. It seems doable, but maybe it is too complex.
In general, eliminating runtime bounds checking is solving the halting problem.
let v = [1, 2, 3];
if halts(some_program) { v[1000] } else { v[0] }
Of course, this doesn't mean that it's impossible in a subset of cases, e.g. people are doing work on fast range analysis for LLVM, which would be able to remove some of the bounds checks sometimes: http://code.google.com/p/range-analysis/ (that analysis also applies to the arithmetic, and apparently only makes things single-digit percent slower; 3% iirc).
> In general, eliminating runtime bounds checking is solving the halting problem.
Your example does not convince me that this follows. In cases where the compiler cannot statically prove which branch will be taken, I would expect it to require that either path can be executed without error (so in your example compilation would fail). But you could use static range propagation to allow code like this (given in C++ syntax since I'm a Rust novice):
    void f(int x[], unsigned int n) {
        // `len(x)` is hypothetical pseudocode: a decayed array carries no
        // length in C++, so imagine it passed alongside `x`.
        n = min(len(x), n);
        for (unsigned int i = 0; i < n; i++) {
            x[i];  // provably in bounds: i < n <= len(x)
        }
    }
Maybe not the greatest example since I would hope that Rust+LLVM would hoist its internal bounds-check to emit something very much like the above anyway. I guess intuitively I just hope that there's more that can be done when it comes to static bounds-checking.
Well, if you throw the halting problem at it then all of your static analysis goes away since you can use general recursion to write arbitrarily typed expressions. That's why things like Idris have termination checkers.
You can't even say that about languages _with_ a GC. At least not as long as you use the practical definition of "leaks memory", which is that the memory remains alive until the application shuts down. Here's a simple example of a memory leak in JS in that sense (modulo nontrivial manual cleanup, obviously):
window[Math.random()] = new Array(100000);
A much more interesting question, in some ways, is what guarantees you have about not accessing no-longer-alive objects. That's where Rust has some serious advantages over C++, say.
There are many problems where a GC is too intrusive (one notable one is: writing a GC). Until you acknowledge this, Rust will make little sense to you (nor will C or C++, for that matter).
The whole problem Rust is trying to solve is that a programmer can do manual memory management without anyone needing to trust that they have gotten it right. The compiler can automatically check correctness (except for unsafe blocks, which are kept to a minimum).
Symbols have been systematically stripped from the language. There really aren't a significantly large number of them anymore, at least in comparison to other curly-brace languages.
You can make a similar table with C++, complete with pointer, pointer-to-member, function pointer, member function pointer, reference, r-value reference, array, std::auto_ptr, std::unique_ptr, std::shared_ptr, const and volatile. :)
Err, I'm a big fan of Rust and all, but I'm pretty sure c++ does have const and so on. What c++ lacks is a difference between owned and borrowed pointers, except by convention and so on.
Ah, but C++ "const" doesn't do what it says on the tin! What "const" means is not "constant", but "read-only". Something that's const to you might not be const to something else, so you can never depend on it staying the same.
I may be wrong, but my understanding is that Rust's constants are actually constant, and proved as such by the compiler, which is a major difference over C++.
That's right. If you have an `&` reference to something, the language enforces that it will never be mutated as long as that reference is alive. (Also, if you have an `&mut` reference to something, the language enforces that you're the only one who can mutate it while that reference is alive; that's how iterator invalidation is prevented.)
I've often wished for this in C and C++, particularly from an optimization perspective. I always hated that the optimizer cannot assume that the contents of a struct to which I have a const pointer won't change out from under me when I call another function. It means that it cannot cache members of this struct in callee-save registers; it has to reload them every time.
I don't know how much of a speed difference it would actually make in practice, but it bothers me that I cannot express this in C++.
Rust's "immutable" references ensure that it cannot be changed from either the reference holder or other safe code that has a reference to the same memory region. Note that (as the periodic table suggests) "mutable" references can be downgraded to the immutable references, but the mutable references are locked while the immutable references are active. Once the immutable references are gone (this is checked by the compiler, see the lifetime guide for details) the mutable references can be used again.
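That freeze-and-thaw behavior can be sketched like this (modern syntax; the compiler, not any runtime machinery, enforces every step):

```rust
fn demo() -> String {
    let mut s = String::from("hi");
    let m = &mut s;       // exclusive mutable reference
    m.push('!');
    let r: &String = &*m; // downgrade: reborrow immutably
    // While `r` is alive, `m` is frozen; `m.push('x')` here would not compile.
    assert_eq!(r, "hi!");
    // `r` is no longer used, so the mutable reference thaws:
    m.push('?');
    s
}

fn main() {
    assert_eq!(demo(), "hi!?");
}
```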
C++ gives us escape hatches all over the place. I think that the modern approach of compartmentalizing all escape hatches into explicit regions of "unsafe" code, and providing no escape hatches outside of such regions, is much better.
C++ const doesn't actually provide constness for purposes of concurrent access unless you follow a bunch of other rules (no const_cast, no use of "mutable", no use of mutable class statics, no use of mutable file-level statics, no use of mutable globals) that in practice people violate all the time even with const objects.
The main purpose of Rust is to be a practical, safe alternative to C++. Go and D are NOT such languages, because of their mandatory garbage collection. From time to time there have been efforts to replace C++ with safer garbage-collected languages such as Java or C#, and every time they have basically failed. Direct control over memory has been proven pretty much essential in high-performance software. Thus, we're still stuck with C++, and we're still suffering for it, having nightmarish bugs and myriads of security holes and whatnot.
Rust at least tries to make an honest effort to change that. If and when it gets really fleshed out it'd be a boon to all those poor developers. And please forget your awfully subjective qualms about syntax (seriously, go and program in Haskell if you're all for pretty syntax and cohesiveness - though I can't guarantee you won't find it "disgusting"). Memory management in the C++ way is bloody hard, and it's a dirty affair. It makes no sense to be repulsed by all the complexity there, because it's unavoidable, and it's a job that has to be done. Of course there's going to be a smattering of syntactical constructs to make memory management less tedious, just like there is extra syntactic baggage in bigger Haskell programs to accommodate the fine control over side effects.
Having a memory safe high-performance language with C++-like memory management is a HUGE thing. A while ago I would have dismissed such a thing as a pipe dream, and when I became convinced that Rust could work it was a moment of great rejoicing. Even if Rust ends up on the dumps, there has been at least an effort, and maybe someone will make a better Rust some day.
So please try to understand the purpose and rationale behind Rust before dismissing it in a superficial way.
"When you enter SafeD, you leave your pointers, unchecked casts and unions at the door. Memory management is provided to you courtesy of Garbage Collection."
I don't see how you make the leap from a syntax you don't like, to the language lacking cohesion. In my experience Rust code looks perfectly cohesive, and once you have the sigils in your head you simply parse them as what they mean.
To give an example, I don't believe that this code:
If you don't need a systems language then for christ's sake don't use or learn one. Go use your Go. Shoo. Also, you care about syntax; others care about safety and correctness. It's not your cup of tea, but no need for all this "this is a horrible language" crap. I dare you to have a face-to-face discussion with anyone on the development team and see if you'll talk so big then.
> Go is also a systems programming language so I don't know what you mean. I would like to use Go, yes, and I do when I have a chance.
Go is a fully garbage collected language that doesn't offer a lot of support for manual memory management (and that isn't intended to attack Go—it just wasn't one of its goals). Rust is a low-level systems language that allows full control via zero-cost abstractions, but is safe. Achieving that goal requires some new machinery and syntax. Systems programming often entails writing for environments in which you can't use a GC (or even any runtime at all) for performance or other reasons.
> Also your point about syntax vs safety and correctness makes no sense, since it implies that those are mutually exclusive so I'm just going to ignore that one.
Rust has a unique feature—manual memory management with safety—so it needs syntax for it. It's not just syntax for its own sake—it's what gives the language a niche all its own.