> We aim to make Meilisearch updates seamless. Our vision includes avoiding dumps for non-major updates and reserving them only for significant version changes.
We will implement a system to version internal databases and structures. With this, Meilisearch can read and convert older database versions to the latest format on the fly. This transition will make the whole engine resource-based, and @dureuill is driving this initiative.
Seamless upgrades have been my dream for Meili for a while; I'm still hoping I can smuggle it in with the indexing refactor itself :-)
I'm not sure I understand the use case here. Are you asking if you can depend on two versions of the same crate, for a crate that exports a `#[no_mangle]` or `#[export_name]` function?
I guess you could slap a `#[used]` attribute on your exported functions and use their mangled names to call them with dlopen, but that would be unwieldy, and guessing the disambiguator used by the compiler ranges from error-prone to impossible.
Other than that, you cannot. What you can do is define the `#[no_mangle]` or `#[export_name]` function at the top-level of your shared library. It makes sense to have a single crate bear the responsibility of exporting the interface of your shared library.
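For illustration, a minimal sketch of that "single exporting crate" idea; the crate and function names are made up, and in a real build this would be a `crate-type = ["cdylib"]` crate with no `main` (the `main` here only keeps the sketch runnable):

```rust
// The one top-level crate that owns the exported C interface.
// Everything deeper in the dependency graph keeps its mangled names,
// so duplicate crate versions in the tree cannot collide here.
#[no_mangle]
pub extern "C" fn mylib_add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // Callable from Rust as usual; from C the symbol is visible
    // under the unmangled name `mylib_add`.
    assert_eq!(mylib_add(2, 3), 5);
}
```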
I wish Rust would enforce that, but the shared library story in Rust is subpar.
Fortunately it never actually comes into play, as the ecosystem relies on static linking.
> I'm not sure I understand the use case here. Are you asking if you can depend on two versions of the same crate, for a crate that exports a `#[no_mangle]` or `#[export_name]` function?
Yes, exactly.
> Other than that, you cannot.
So, to the question "Can a Rust binary use incompatible versions of the same library?", the answer is definitely "no". It's not "yes" if it cannot cover one of the most basic use cases in making OS-native software.
To be clear: no language targeting OS-native dynamic libraries can solve this, the problem is in how PE and ELF work.
I agree the answer is no in the abstract, but it is not very useful in practice.
Nobody writing Rust needs to cover this "basic use case" you're referring to, so it is the same as people saying "unsafe exists so Rust is no safer than C++". In theory that's true, in practice in 18 months, 604 commits and 60,008 LoC I wrote `unsafe` exactly twice. Once for memory mapping something, once for skipping UTF-8 validation that I'd just done before (I guess I should have benchmarked that one as it is probably premature).
In practice, when developing Rust software at a certain scale, you will mix and match incompatible library versions in your project, and it will not be an issue. Our project has 44 dependencies with conflicting versions, one of which appears in 4 incompatible versions, and it compiles and runs perfectly fine. In other languages I have used (C++, Python), this exact same thing has been a problem, and it is not in Rust. This is what the article is referring to.
Starts with an interesting claim, "don't optimize for typing", but then completely fails to prove it, and confuses itself by thinking that `auto` is an optimization for typing.
`auto` is:
- A way to express types that are impossible or truly difficult to express, such as iterators, lambdas, etc
- A way to optimize reading, by limiting the redundancy
- A way to optimize maintenance, by limiting the amount of change brought by a refactor
The insistence on notepad or "dumb editors" is also difficult to grasp. I expect people reviewing my code to be professionally equipped.
Lastly the example mostly fails to demonstrate the point.
- There's a point made on naming (distinct from `auto`): absent a wrapping type, `dataSizeInBytes` is better than `dataSize`. The best way though is to have `dataSize` be a `Bytes` type that supports conversion at its boundaries (can be initialized from bytes, MB, etc)
- What's the gain of spelling out the explicit type over:
auto dataSet = pDatabase->readData(queryResult.getValue());
The `dataSet` part can be inferred from the naming of the variable; it is useless to repeat it. The `Database` part is also clear from the fact that we read data from a db. And knowing the variable has this specific type brings me absolutely nothing.
- Their point about mutability of the db data confused me, as it is not clear to me whether I can modify a "shadow copy" (I suppose not?). I suggest they use a programming language where mutating something you should not is a compile-time error; that is much more fail-safe than naming (which is hard)
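The `Bytes` wrapping-type idea from the first bullet can be sketched quickly. Here it is in Rust rather than C++ (that's where I spend my time), and all the names are illustrative:

```rust
/// A unit-bearing size type: the unit lives in the type, not in the
/// variable name, so `dataSizeInBytes`-style naming becomes unnecessary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Bytes(u64);

impl Bytes {
    fn from_bytes(n: u64) -> Self { Bytes(n) }
    /// Conversion happens at the boundary, once.
    fn from_mb(n: u64) -> Self { Bytes(n * 1_000_000) }
    fn get(self) -> u64 { self.0 }
}

fn main() {
    let data_size = Bytes::from_mb(2);
    assert_eq!(data_size.get(), 2_000_000);
    assert_eq!(Bytes::from_bytes(512).get(), 512);
}
```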
I'm sad, because indeed one shouldn't blindly optimize for typing, and I frequently find myself wondering when people tell me C++ is faster to write than Rust, when I (and others) have empirically measured that completing a task, which is the interesting measure IMO, is twice as fast in the latter as in the former.
So I would have loved a defence of why more typing does not equate to higher productivity. But this ain't it.
> empirically measured that completing a task... is twice as fast in [Rust] than in [C++]
I have not read up on which tasks you're referring to that were empirically measured, apologies. The reason I'm curious about what the tasks are is that, depending on the task, navigability may not matter.
For example, if the task is "build a tool that does X", then navigability of the code does not matter. Once built, the tool does X, and there's no reason to revisit the code, and thus no reason to navigate the code.
But if the task is "Given a tool that already does W, X, Y, make the tool also do X', Y', and Z", then navigability of the code matters. This is because the coder must understand what the tool already does, and where the changes need to be made.
For most of my professional life (and I'm willing to bet, for most other coders here as well), I have found myself facing the second kind of task more often than the first.
But I'm not interested in Rust vs C++. I'd be more interested in the results of "given a version that makes heavy use of type inference vs one that doesn't, how quickly can someone new to the project add X', Y', and Z". That would be a more appropriate test for what the author describes here. And I'd imagine that those using sufficiently advanced IDEs would beat out those without, regardless of whether type inference is used, and would probably be slightly faster when given the highly type-inferenced version.
Some good advice, some bad advice in here. This is necessarily going to be opinionated.
> Provide a development container
Generally unneeded. It is expected that a Rust project builds with `cargo build`; don't deviate from that. People can `git clone` and `code .`.
Now, a Docker image might be needed for deployment. As much as I personally dislike Docker, at Meilisearch we provide a Docker image, because our users use it.
This is hard for me to understand as a Rust dev, when we provide a single executable binary, but I'm not in devops and I guess they have good reasons to prefer Docker images.
> Use workspaces
Yes, definitely.
> Declare your dependencies at the workspace level
Maybe, when it makes sense. Some deps have distinct versions by design.
> Don't use cargo's default folder structure
*Do* use cargo's default folder structure, because it is the default. Please don't be a special snowflake that decides to do things differently, even if you have a good reason. The described hierarchy would be super confusing for me as an outsider discovering the codebase. Meanwhile, VS Code pretty much doesn't care that there's an intermediate `src` directory. Not naming the root of the crate `lib.rs` also makes it hard to actually find the root component of a crate. Please don't do this.
> Don't put any code in your mod.rs and lib.rs files
Not very useful. Modern IDEs like VS Code will let you define custom patterns so that you can match `<crate-name>/src/lib.rs` to `crate <crate-name>`. Even without doing this, a lot of the time your first interaction with a crate will be through docs.rs or a manual `cargo doc`, or even just the autocomplete of your IDE. Then, finding the definition of an item is just a matter of asking the IDE (or grepping for the definition, which is easy to do in Rust since all definitions have a prefix keyword such as `struct`, `enum`, `trait` or `fn`).
> Provide a Makefile
Please don't do this! In my experience, Makefiles are brittle, push people towards non-portable scripts (since the Makefile uses a non-portable shell by default), `make` is absent by default in certain systems, ...
Strongly prefer just working with `cargo` where possible. If not possible, Rust has a design pattern called `cargo xtask`[1] that allows adding cargo subcommands that are specific to your project, by compiling a Rust executable that has a much higher probability to be portable and better documented. If you must, use `cargo xtask`.
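A minimal sketch of the xtask idea; the task names are invented, and the pattern also needs an `xtask = "run --package xtask --"` alias in `.cargo/config.toml` (not shown):

```rust
use std::env;

// xtask/src/main.rs: project-specific tasks as plain Rust,
// instead of a Makefile with a non-portable shell.
fn run_task(task: Option<&str>) -> String {
    match task {
        Some("dist") => "building release artifacts...".to_string(),
        Some("ci") => "running fmt, clippy and tests...".to_string(),
        _ => "available tasks: dist, ci".to_string(),
    }
}

fn main() {
    // `cargo xtask dist` ends up invoking this binary with "dist".
    let task = env::args().nth(1);
    println!("{}", run_task(task.as_deref()));
}
```

Because the task runner is itself a Rust program compiled by cargo, it is as portable as the rest of the project.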
> Closing words
I'm surprised not to find a word about CI workflows, which are in my opinion key to sanely growing a codebase (well, in Rust there's no reason not to have them even on smaller repos, but they quickly become a necessity as more code gets added).
They will ensure that the project:
- has no warning on `main` (allowed locally, error in CI)
- is correctly formatted (check format in CI)
- has passing tests (check tests in CI, + miri if you have unsafe code, +fuzzer tests)
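Concretely, those three checks boil down to a handful of cargo invocations. A sketch, assuming the rustfmt and clippy components are installed on the CI runner:

```shell
#!/bin/sh
set -e  # fail the CI job on the first failing step

# No warnings on main: promote clippy lints to errors in CI only
cargo clippy --workspace --all-targets -- -D warnings

# Correct formatting: check, never rewrite, in CI
cargo fmt --all -- --check

# Passing tests (add `cargo miri test` here if you have unsafe code)
cargo test --workspace
```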
> This is hard for me to understand as a Rust dev, when we provide a single executable binary, but I'm not in devops and I guess they have good reasons to prefer Docker images.
It's mainly about isolation, orchestration and to prevent supply chain attacks.
+1 to everything here. TL;DR: predictable > most things
The development container idea however is useful when you're dealing with any type of distributed system, as it allows you to develop against a known setup of those things (e.g. your database, website, API service(s), integration with external non-Rust software, etc.)
"Easy to learn" and "easy to hire for" are an advantage in the first few weeks. Besides, we now have data indicating that ramp up time in Rust is not longer than in other languages.
On the other hand, serving millions of users with a language that isn't even v1 doesn't seem very reasonable. The advantages of a language that is memory safe in practice and also heavily favors correctness in general boosts productivity tenfold in the long term.
I'm speaking from experience: I switched from C++ to Rust professionally and I'm still not over how much more productive and "lovable" the language in general is. A language like Zig isn't bringing much to the table in comparison (in particular with the user-hurting decisions around "all warnings are errors, period")
> The advantages of a language that is memory safe in practice and also heavily favors correctness in general boosts productivity tenfold in the long term
As someone who has some large projects in C++ and contributes to OSS C++ projects, I find this isn't true. The big "productivity" boost I saw when using Rust for some projects was that there were good Rust wrappers for common libraries, or they were rewritten in Rust.
In C++ using the C API directly is "good enough", but because there is no nice wrapper, development is slower than it should be, and writing the wrapper yourself would be significantly slower unless you expect it to be a decades-long project.
When I'm not needing to build my own abstractions above a C library in C++ I find it just as productive as rust and the moment I need to touch a C library in Rust I feel even less productive than C++.
There is definitely an argument to be made about correctness in large teams being beneficial in the long run, though clearly very large projects are able to keep some sanity around keeping developers in check. This is the one metric where Rust has a leg up on C++, and big backers of C++ agree that it's the place C++ is sorely lacking. Every other metric isn't worth discussing unless this is fixed.
> As someone who has some large projects in C++ and contributes to OSS C++ projects, I find this isn't true.
Well, that goes contrary to my personal experience (professional dev in C++11 and up for a decade), and also to the data recently shared by Google[1] ("Rust teams are twice as productive as C++ teams"). Either your Rust is slower than average, or your C++ is faster than average. Perhaps both.
The reasons for being more productive are easy to understand. The build system and the ease of tapping into the ecosystem are good reasons, but they tend to diminish as the project matures. However, the comparative lack of boilerplate (compare specializing std to add a hash implementation in C++ with deriving it in Rust; the rule of five; header maintenance; and so on), proper sum types (let's not talk about std::variant :(), exhaustive pattern matching, and exhaustive destructuring and restructuring make for much easier maintenance, so much so that I think it tends towards an order of magnitude more productivity as the project matures. On the ecosystem side, the easy access to an ecosystem-wide serialization framework is also very useful. The absence of UB makes for simpler reviews.
May also be the type of projects I would reach for C++ in as well.
I generally use golang for anything "high" level. I reach for C++ when I actually need that level of control and to be honest most of the ergonomic features in Rust are much higher level than what is needed for proper systems development.
I think my big productivity issue with rust has always been the very weird hoops I need to jump through to make it do stuff I can confirm is correct but the borrow checker prevents me from doing. I can imagine many use cases where Rust would be significantly more productive than C++ but those are places I wouldn't use C++ for in the first place.
Regarding serde: yes, it's amazing, but it also blows up compile times. I know that's rich coming from the C++ camp, but realistically it's not great. I also find rust-analyzer painfully slow, though that's equally true of clangd; there the issue is not so much speed as the fact that clangd still doesn't support modules 4 years after they were standardised...
There are many issues with C++, but the reality is there are many issues with any given language and the tradeoffs I need to make with C++ feel better to me than the rust tradeoffs.
And regarding the google report, was that not self-reported productivity? Also on a much smaller codebase? I did say that for extremely large codebases Rust has some very clear advantages, and even strong supporters of C++ will agree there (see any modern Herb Sutter talk, Microsoft, or the reports from Google), but I'm pretty sure we have learnt the lesson that what works for Google, Microsoft or Meta may not work for everyone.
Just make an informed decision is my point; you have tradeoffs for each language, and for me easy C interop is extremely important for the places I actually need C++. For the rest I use golang.
> The absence of UB makes for simpler reviews
Rust also has UB, and you should still be running fuzzers and sanitizers on your Rust code, as is true for C++. Yes, Rust reviews are easier, but there are tools available that should be run in CI that can catch those issues; likewise with coding standards. It's not the perfect solution, but it's the one we have.
> Rust also has UB, and you should still be running fuzzers and sanitizers on your Rust code, as is true for C++.
Safe Rust doesn't have UB[1], and safe Rust is what I review 99% of the time. For unsafe modules, you should indeed be running sanitizers. Fuzzers are always good, they are also interesting for other properties than UB.
> tools available that should be run on CI that can catch those issues
Available tools have both false positives and false negatives. Careful review is unfortunately the best tool we had in C++ to nip UB in the bud, IME.
> I think my big productivity issue with rust has always been the very weird hoops I need to jump through to make it do stuff I can confirm is correct but the borrow checker prevents me from doing
Interesting, I remember having to adapt some idioms around caching and storing iterators in particular, but very quickly I felt like there weren't that many hoops and they weren't that weird. There's a sore point for "view types" (think parsed data) that are hard to bundle with the owning data (I have my own crate to do so[2]), but other than that I can't really think of anything. Do you mind sharing some of the patterns you find difficult in Rust but that should work, in your opinion?
> [rust-analyzer and clangd]
I find there have been tons of regressions in rust-analyzer recently, but IME it blows clangd out of the water. The fact that Rust has a much saner compilation model is a large contributing factor, as well as the de facto standard build system with nice properties for analysis.
clangd never properly worked on our project due to our use of ExternalProject for dependencies.
> And regarding the google report, was that not self-reported productivity?
No, the recent report (presented by Lars at some Rust conf) is distinct from the blog article and is not self-reported productivity. They measured the time taken to perform "similar tasks", which Google is uniquely positioned to do because it is such a large organization.
> Just make an informed decision is my point; you have tradeoffs for each language, and for me easy C interop is extremely important for the places I actually need C++. For the rest I use golang.
That's fair. I would say the tradeoff goes very far in the Rust direction, but I have strong bias against golang (I find it verbose and inexpressive, I don't like that it allows data races[3])
[1]: to be precise, if safe Rust has UB it is a compiler bug or a bug in underlying unsafe code. By safe Rust, I mean modules that don't have `unsafe` in them.
> Interesting, I remember having to adapt some idioms around caching and storing iterators in particular, but very quickly I felt like there weren't that many hoops and they weren't that weird.
I generally didn't feel that way. I didn't need to change much because I barely ran into the borrow checker. The time I did, I was recursively accessing different parts of a pretty central struct, but the borrow checker considered the entire struct a borrowed object. Effectively, I couldn't access multiple different fields of the struct if I already had the larger struct as a mutable borrow, even though I only accessed one part of it. It was pretty trivial to confirm that the different functions in fact do not touch the same parts of the struct, but the borrow checker simply could not do that. Maybe it's changed recently, but I was forced into a significant change to the code, with several abstractions added, when it really wasn't necessary.
I did actually agree with you regarding reviewing code to get rid of UB; I was just a little too verbose maybe. Regarding false positives and negatives with sanitizers and static analysis, I feel it's worth the pain for a language which I find more expressive for my use case. I'm not against using Rust; I'm against saying it's the universal solution to every problem when in my experience I haven't been able to reproduce that view. There are problems it solves well; I don't run into those often.
I'm also happy more data is coming out on productivity, I feel like it can only help light a fire under people on the standards committee who are indifferent to the issues of C++ to actually push to fix some of the issues which are currently solveable.
Ironically, I see with your last point regarding golang that we are very different people, and that's fine. For me, I would much rather lean back towards C, if I can guarantee safety, than the more abstract and high-level Rust. Honestly, I am extremely intrigued by Zig, but until it's stable I'm not going near it.
We want different things from languages and that is fine.
> Ironically, I see with your last point regarding golang that we are very different people, and that's fine. For me, I would much rather lean back towards C, if I can guarantee safety, than the more abstract and high-level Rust. Honestly, I am extremely intrigued by Zig, but until it's stable I'm not going near it.
>
> We want different things from languages and that is fine.
I just wanted to tell you that I agree. A lot of what makes people like or dislike a language seems to be down to aesthetics in its nobler meaning.
> The time I did, I was recursively accessing different parts of a pretty central struct, but the borrow checker considered the entire struct a borrowed object.
Ah OK. It helps to model a borrow of a struct as a capability. If your struct is made of multiple "capabilities" that can be borrowed separately, then you had better express that with a function that borrows the struct and returns "view objects" representing the capabilities.
For instance, if you can `foo` and `bar` your struct at the same time, you can have a method:
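A sketch of what such a method could look like (all names invented for illustration): `split()` hands out two view objects that borrow disjoint fields, so both capabilities can be used at the same time.

```rust
struct Central {
    foo_data: Vec<u32>,
    bar_data: Vec<u32>,
}

// Each view object represents one "capability" of the struct.
struct FooView<'a> { data: &'a mut Vec<u32> }
struct BarView<'a> { data: &'a mut Vec<u32> }

impl Central {
    // The borrow of `self` is split here: the compiler can see that
    // the two views touch disjoint fields, so both may coexist.
    fn split(&mut self) -> (FooView<'_>, BarView<'_>) {
        (
            FooView { data: &mut self.foo_data },
            BarView { data: &mut self.bar_data },
        )
    }
}

fn main() {
    let mut c = Central { foo_data: vec![1], bar_data: vec![2] };
    let (foo, bar) = c.split();
    // Two simultaneous mutable accesses to the same struct, no borrow error.
    foo.data.push(3);
    bar.data.push(4);
    assert_eq!(c.foo_data, vec![1, 3]);
    assert_eq!(c.bar_data, vec![2, 4]);
}
```

The cost is one extra method, but the disjointness is now part of the struct's interface rather than something each caller has to re-verify.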
> The advantages of a language that is memory safe in practice and also heavily favors correctness in general boosts productivity tenfold in the long term.
Does it though? There are many languages that fit this description, you would choose Rust if for some reason you also need good performance. However, if you heavily interop with C/C++ safety goes out the window anyway, and it probably never mattered much in the first place.
> There are many languages that fit this description
I find that in practice there are not, especially if you further limit that to imperative languages. Note that I mentioned memory safe AND heavily favors correctness. In that regard, Rust is uniquely placed due to its shared-XOR-mutable paradigm. One has to look at functional languages that completely disable mutation to find comparable tools for correctness. Allegedly, they're more niche.
> However, if you heavily interop with C/C++ safety goes out the window anyway
I find this to be incorrect. The way you would do this is by (re)writing modules of the application in Rust. Firefox did that for its parallel CSS rendering engine. I did it for the software at my previous job. The software at my current job relies on a C database, we didn't have a memory safety issue in years (never had one since I joined, actually). We have abstracted talking to the DB with a mostly safe wrapper (there are some unsafe functions, but most of them are safe), the very big majority of our code is safe Rust.
> it probably never mattered much in the first place
It does matter. First, for security reasons. Second, because debugging memory issues is not fun and a waste of time when alternatives that fix this class of errors exist.
Memory safety isn't a defining feature of Rust in any way, and with very very simple rules in C++ (no raw loops and no raw pointer access) you can actually just get rid of memory issues in C++.
There are also pretty easy solutions to indexing issues (reading past the end of an array, for instance), which you can use at compile time, or just by enabling a compiler flag so that it at least hard-crashes instead of corrupting memory. That turns a lot of RCE vulns into simple DoS vulns, which is a significant increase in "security".
Memory safety isn't the reason to use Rust. That's already available in well-written C++, and in many other languages. Doing it easily with a large group of people, I would say, is an argument for using Rust. But then you need to argue why not Swift or Java or Kotlin or golang or any of the crop of languages coming out that also offer easy memory safety.
- Under the "no raw pointer" rule, how do you express view objects? For instance, is `std::string_view` forbidden under your rules? If not, then you cannot get rid of memory issues in C++. If so, then that's a fair bit more than "no raw pointer access", and then how do you take a slice of a string? A deep copy? A shared_ptr? Both of these solutions are bad for performance: they mean a lot of copies, or having all objects reference-counted (which invites atomic increment/decrement overhead, cycles, etc.). Compare to the `&str` that Rust affords you.
- What about multithreading? Is that forbidden as well? If it is allowed, what are the simple rules to avoid memory issues such as data races?
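To make the `&str` comparison concrete: a Rust string slice is a zero-copy view whose lifetime the borrow checker ties to the owner, which is exactly what `std::string_view` cannot enforce. A toy sketch:

```rust
fn main() {
    let owned = String::from("hello world");
    // A view into `owned`: no deep copy, no reference counting.
    let view: &str = &owned[0..5];
    assert_eq!(view, "hello");
    // drop(owned); // uncommenting this is a compile error,
    //              // because `view` is still live below
    println!("{view}");
}
```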
> That's already available in well written C++
Where are the projects in well-written C++ that don't have memory-safety CVEs?
You are actually just arguing for the sake of arguing here.
Rust bases all its data structures on pointers just like C++ does; just because you cannot look behind the curtain doesn't mean they aren't there, with the same issues. Use the abstractions within the rules and you won't get issues; use compiler flags and analyzers in CI and you don't even need to remember the rules.
And of the billions of lines of code, are you really going to argue that you won't find a single project without a memory-safety CVE? You will likely find more of them than there are Rust projects in total. Or are we going to shift the goalposts and say they have to be popular? Then define popular, and prove you won't have a memory-safety issue in a similarly sized Rust project. Shift the goalposts again and say "in safe Rust"? Then why can I not say "in safe C++" and define safe C++ in whatever way I want, since the "safe" subset of Rust is defined by the Rust compiler, not by a standard or specification, and can change from version to version.
I've agreed already that Rust has decent use cases, and if you fall into them and want to use Rust, then use Rust. That doesn't mean Rust is the only option, or even the best one by some measure of best.
> You are actually just arguing for the sake of arguing here.
I'm very much not doing that.
I'm just really tired of reading claims that "C++ is actually safe if you follow these very very simple rules", and then the "simple rules" are either terrible for performance, not actually leading to memory safe code (often by ignoring facts of life like the practices of the standard library, iterator and reference invalidation, or the existence of multithreaded programming), or impossible to reliably follow in an actual codebase. Often all three of these, too.
I mean, the most complete version of "just follow rules" is embodied by the C++ Core Guidelines[1], a 20k-line document of about 116k words, so I think we can drop the "very very simple" qualifier at this point. Many of these rules, including some of the most important, are not currently machine-enforceable, for instance the rules around thread safety.
Meanwhile, the rules for Rust are:
1. Don't use `unsafe`
2. there is no rule #2
*That* is a very very simple rule. If you don't use unsafe, any memory safety issue you would have is not your responsibility, it is the compiler's or your dependency's. It is a radical departure from C++'s "blame the users" stance.
That stance is imposed by the fact that C++ simply doesn't have the tools, at the language level, to provide memory safety. It lacks:
- a borrow checker
- lifetime annotations
- Mutable XOR shared semantics
- the `Send`/`Sync` markers for thread-safety.
Without each one of these ingredients, we're not going to see zero-overhead, parallel, memory-safe C++. Adding them is pretty much as big a change to existing code as switching to Rust, at this point.
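The thread-safety markers deserve a small illustration. The sketch below (plain std, nothing project-specific) shares a mutable counter across threads; it compiles only because `Mutex<T>` is `Sync`, whereas swapping in a non-thread-safe `RefCell<T>` would be rejected at compile time:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Arc for shared ownership, Mutex for synchronized mutation.
    let counter = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            // `thread::spawn` requires its closure to be Send;
            // Arc<Mutex<i32>> is, so this type-checks.
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(*counter.lock().unwrap(), 4);
}
```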
> Use the abstractions within the rules and you won't get issues, use compiler flags and analyzers on CI and you don't even need to remember the rules.
I want to see the abstractions, compiler flags and analyzers that will *reliably* find:
- use-after-free issues
- rarely occurring data races in multithreaded code
Use C++ if you want to, but please don't pretend that it is memory safe if following a set of simple rules. That is plain incorrect.
A significant portion of the Core Guidelines you don't need to read, as they are already dealt with by static analysers. As I said, you don't need to know the rules: enable an analyser and you get the exact same experience as Rust, your code won't pass the build step.
And am I going to tell the family of the dead person at a funeral that my code failed but it's not my fault, the compiler has a bug and I couldn't confirm that myself? And don't pull the "the code is open source, you can look at it" crap, I invite you to go and reason about any part of the Rust compiler's codebase.
In C++ I have multiple compilers all targeting the exact same standard, and if my code is broken in one of them I need to explain that; I also have multiple analysers for the same thing.
Also, I haven't had a single segfault, memory corruption, or index error, because I run sanitizers and analysers in CI, so all those things get caught at build time or in testing, just like in Rust.
Is it as simple as Rust? No, and I've already said that, but I get the same safety guarantees and I am in control of the code. There isn't some subset of things I cannot do because the language's built-in analysis deems it too expensive to statically analyse and doesn't trust me as a developer to actually write tests and validate all the preconditions I have documented.
It's a bug postmortem about tokio and rayon. Would be interested to know if anybody else encountered similar issues with these libraries or mixing other libraries that "own" threads
The absence of that safeguard in Go is a feature. It's used when the error isn't that critical and the program can merrily continue with the default value.
Global type inference is not a positive in my book. In my experience it becomes very hard to understand the types involved if they are not explicit at systems boundaries.
I can also imagine that it must be hard to maintain; sometimes the types must accidentally change.
Being hard to maintain and having no static types at all did not stop Python rising to conquer the world. Type inference allows us to at least give those users the succinctness they are used to, if not the semantics. Those who like explicit types can add as many annotations as they need in OCaml.
> Those who like explicit types can add as many annotations as they need in OCaml.
They cannot add it in other people's libraries.
> did not stop Python rising to conquer the world
I wasn't talking popularity, I was talking maintainability. Python is not a stellar example of maintainability (source: maintained the Python API for a timeless debugger for 5 years).
Python's ubiquity is unfortunate, thankfully there seems to be a movement away from typeless signatures, both with Python's gradual typing (an underwhelming implementation of gradual typing, unfortunately) and Typescript.
Does it matter that much how the internals of someone else's library are implemented? The tooling will tell you the types anyway and module interfaces will have type signatures.
There's a trade-off: was the mistake here or there? The type checker cannot know. But for those few cases you can add an annotation. Then the situation is, in the worst case, as good as when types are mandatory.
> But for those few cases you can add an annotation.
Not in other people's code. My main concern is that gradual typing makes understanding other people's code more difficult.
Idiomatic Haskell warns against missing signatures[1], and Rust makes them mandatory. Rather than global inference, local inference stopping at function boundaries is the future, if you ask me.
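To illustrate what "local inference stopping at function boundaries" means, a toy Rust example: the signature is fully annotated, while everything inside the body is inferred.

```rust
// The boundary types are explicit and checked; no caller can
// accidentally change them by editing the body.
fn double_all(xs: &[i32]) -> Vec<i32> {
    // The iterator's type is inferred and never written out.
    let doubled = xs.iter().map(|x| x * 2);
    // collect's target type flows in from the declared return type.
    doubled.collect()
}

fn main() {
    assert_eq!(double_all(&[1, 2, 3]), vec![2, 4, 6]);
}
```

Any inference mistake in the body is reported against the declared signature, so the error shows up at the declaration site rather than at some distant use site.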
> the situation is, in the worst case, as good as when types are mandatory
The worst case is actually worse than when types are mandatory, since you can get an error in the wrong place. For example, if a function has the wrong type inferred then you get an error when you use it even though the actual location of the error is at the declaration site. Type inference is good but there should be some places (ex. function declarations) where annotations are required.