The related post on performance optimization is extremely interesting, in particular the considerations drawn while moving from the unsafe ported code to safe Rust¹:
> The first performance issue we hit was dynamic dispatch to assembly, as these calls are very hot. We then began adding inner mutability when necessary but had to carefully avoid contention. We found as we removed pointers and transitioned to safe Rust types that bounds checks increasingly became a larger factor. Buffer and structure initialization was also an issue as we migrated to safe, owned Rust types.
Based on their conclusions², each of those issues amounts to a few percentage points (total: 11%).
Based on the article, it seems that with highly complex logic, safety does come at the cost of raw performance, and it can be very hard to compensate (within the safety requirements).
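To make the last point in that quote concrete, here is a minimal, hypothetical sketch (not rav1d code) of what "buffer initialization" can cost with safe, owned types: a safe buffer is zeroed on allocation even when every byte is about to be overwritten, which is exactly the kind of work C code typically skips.

```rust
// Hypothetical illustration, not taken from rav1d: a scratch buffer for a
// decode step.
fn scratch_buffer(len: usize) -> Vec<u8> {
    // `vec![0; len]` hands back `len` zeroed bytes, even if the decoder
    // immediately overwrites all of them; the equivalent C code would often
    // leave the buffer uninitialized and rely on the writer to fill it.
    vec![0u8; len]
}
```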
> Based on the article, it seems that with highly complex logic, safety does come at the cost of raw performance, and it can be very hard to compensate (within the safety requirements).
In Rust. These are Rust issues, not issues with safety in general.
The issue with bounds checks, for example, is entirely avoidable if you prove that all your calls are within bounds before compiling; same thing for partial initialization.
The core issue is that the strategies Rust adopts to ensure memory safety are neither a panacea nor necessarily the right solution in every case. That being said, I think it's a very nice idea to try to write a decoder in Rust and have a bounty for optimization. Rust is popular so work on producing fast and safe Rust is good.
> The issue with bounds checks, for example, is entirely avoidable if you prove that all your calls are within bounds before compiling; same thing for partial initialization.
The situation is more nuanced. The article dedicates a section to it:
> The general idea in eliding unnecessary bounds checks was that we needed to expose as much information about indices and slice bounds to the compiler as possible. We found many cases where we knew, from global context, that indices were guaranteed to be in range, but the compiler could not infer this only from local information (even with inlining). Most of our effort to elide bounds checks went into exposing additional context to buffer accesses.
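As a rough illustration of what "exposing additional context" can look like (a made-up example, not code from rav1d), taking one sub-slice up front gives the compiler the length information it needs, instead of it re-checking every individual access:

```rust
// Illustrative only: summing a fixed-size window of a buffer.

// Indexing directly gives the compiler only local information, so each
// access may carry its own bounds check.
fn sum_window_indexed(buf: &[u8], offset: usize) -> u32 {
    let mut acc = 0u32;
    for i in 0..64 {
        acc += buf[offset + i] as u32; // potential check per iteration
    }
    acc
}

// Reslicing once exposes the bounds up front; the loop body can then
// usually run without further checks.
fn sum_window_resliced(buf: &[u8], offset: usize) -> u32 {
    let window = &buf[offset..offset + 64]; // single check here
    window.iter().map(|&b| u32::from(b)).sum()
}
```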
I think the comment you replied to is about a different approach: writing the algorithm in assembly or C, and then proving it doesn't access anything out of bounds.
This is the WUFFS approach, for example. Formal verification, while very cool, is extremely tedious, which is why more practical languages try to find compromise rules that are simpler to implement while not being too limiting.
> The issue with bounds checks, for example, is entirely avoidable if you prove that all your calls are within bounds before compiling; same thing for partial initialization.
You can't always know the bounds at compile time, though.
If the compiler is smart enough and the code is written in the right way, the compiler should be able to omit a lot of bounds checks. This is much easier to achieve safely in Rust because of its no-aliasing-by-default references.
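For instance (a hypothetical sketch, not rav1d code), iterating instead of indexing removes the checks altogether, and the exclusive `&mut` borrow means the compiler doesn't have to assume the two buffers alias:

```rust
// Hypothetical sketch: adding one row of residuals into a destination row.
fn add_row(dst: &mut [i32], src: &[i32]) {
    // zip() stops at the shorter slice, so no index and no bounds check is
    // needed; the &mut borrow guarantees dst and src don't overlap.
    for (d, s) in dst.iter_mut().zip(src.iter()) {
        *d += *s;
    }
}
```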
Bounds checks are entirely avoidable if you can prove that all your calls are within bounds - this being the key part of my comment.
If you can't, you just rewrite your code so you can, because your code is not safe otherwise, and safe code is what we are discussing here.
Note that all calls can trivially be made provably within bounds by explicitly bounds-checking yourself and providing an alternate path for out-of-bounds calls. That obviously doesn't improve performance compared to the bounds checks inserted by the compiler, but it at least saves you a panic in the worst-case scenario.
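A minimal sketch of that last idea (the names and the fallback value are invented for illustration): do the check yourself with `get` and supply a fallback instead of letting slice indexing panic.

```rust
// Illustrative only: read a sample, falling back to a neutral value if the
// index is somehow out of bounds instead of panicking.
fn sample_or_default(plane: &[u8], idx: usize) -> u8 {
    match plane.get(idx) {
        Some(&px) => px,
        None => 128, // alternate path for the out-of-bounds case
    }
}
```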
> Based on the article, it seems that with highly complex logic, safety does come at the cost of raw performance
You cannot draw general conclusions from the experience of one project at one time. There are too many confounding factors. There are also projects that have experienced the opposite.
$20K sounds very low for the effort and expertise that are demanded here, in my opinion. It would be quite a steal to bring this to the same level as the state of the art (which, correct me if I'm wrong, I believe is dav1d) for only that sum.
My reading is this isn't for random engineers but experienced performance engineers with optimized toolchains who can tackle this efficiently. For someone with the right setup, this is likely straightforward work.
It's similar to job descriptions written for specific candidates they plan to hire. The question shouldn't be "why is this bounty so low?" but "what toolchain makes $20K reasonable for someone?"
Performance work, in my experience, has largely been organizational friction: running projects in various conditions, collecting evidence maintainers accept, maximizing the limited slice of attention people give my CLs, getting compiler improvements merged. These coordination tasks have become much easier to automate with LLMs (e.g., analyzing maintainer comments to understand cryptic one-line feedback: what are they actually looking for, what do they mean).
My guess is there's an engineer who's either optimized their feedback cycle to under an hour through specialized tooling (more arrows) or is much better at finding the right signal the first time (more wood). I'd like to understand what tools enable that efficiency.
Absolutely. If you don't know dav1d, it's easy to overlook the complexity here.
There is a reason for this sentence:
> « The dav1d and rav1d decoders share the exact same low-level assembly code optimizations—you cannot modify this assembly ».
So it kind of makes the work easier, but it's still very complex, I think.
The reason for that disclaimer should be obvious: any improvements to the assembly code would also benefit dav1d in C, so the Rust code would still remain worse off by comparison.
I’m sure both the dav1d and rav1d communities would appreciate improvements to the assembly code, but the goal of this contest is to improve the Rust implementation only.
This is the sort of bounty + interview question I would ask when I'm really looking for people who will actually use such optimal algorithms in practice, where it directly applies.
Which is exactly the case for video decoders.
In fact, a bounty will show us who is really serious and who isn't. Even some interviewers don't have a clue why they ask for 'optimal algorithms' in the first place - maybe just to show off that they Googled or used ChatGPT for the answer before the interview.
Well, you can't do that for this one. The one who improves this and can explain their changes will have an impressive show of expertise that puts lots of so-called 'SWEs' to shame.
> The one who improves this and can explain their changes will have an impressive show of expertise that puts lots of so-called 'SWEs' to shame.
Software is a means of automation that's used in every industry of man. Someone not knowing a particular, very specific, problem space isn't something to be ashamed of.
I maintain an open-source project that could use some new contributors. We have bounties ranging from $50 to $100. Would that be attractive on this platform?
> The contest is open to individuals or teams of individuals who are legal residents or citizens of the United States, United Kingdom, European Union, Canada, New Zealand, or Australia.
So most of the countries where putting in the effort would actually be worth the bounty offered are a no-go...
For the purpose of an experiment, I would love to see $20k also offered to eke out more performance from the dav1d decoder; otherwise this is just a measure of how much money people are willing to pour into optimisations.
Your comment reads like the schoolroom dilemma: if your mom gives you cookies to share with your friends at lunch, she'd better bring enough for everyone.
This is being done to prove a point: that Rust is just as efficient as C while being memory safe.
Your idea is misaligned with the goals of the contest and is more of a moral complaint about "fairness". Where do you get this notion that any entity offering their own funds in a contest needs to distribute them evenly?
Code bounties are unethical. I will absolutely die on this hill. You benefit from everyone's work, yet only one person (or a select few) maybe gets paid (in this case a pitiful amount, considering the required expertise).
Almost universally, people attempting a bounty are in a dire financial situation. You’re just taking advantage of them.
I’m sitting at my desk doing 133t code challenges for free right now. The idea of optimizing something that both matters and could result in a payout of any amount sounds appealing given my context.
I think you underestimate the amount of people willing to do things for free or for the sake of knowledge itself.
And before you suggest that must mean I’m in a dire financial situation, I’m not.
It's not that simple to run a bounty program; my uninformed guess is that they are almost definitely targeting a number of jurisdictions whose laws they are familiar with and/or where they have some kind of representative.
This is the correct answer (we run this bounty). Contests can be legally complex; there are only so many places we feel comfortable running it from a legal POV.
You can use Algora.io (it's open source) to cover 120+ countries for the bounty payout - it would be a fantastic showcase on our website (founder here).
It says nothing about "Asian people". Verbatim quote, in full:
> The contest is open to individuals or teams of individuals who are legal residents or citizens of the United States, United Kingdom, European Union, Canada, New Zealand, or Australia.
Interestingly, if you follow through to the full T&Cs [1], they add exclusions:
> ...not located in the following jurisdictions: Cuba, Iran, North Korea, Russia, Syria, and the following areas of Ukraine: Donetsk, Luhansk, and Crimea.
Showing that the only explicit exclusions are aimed at the usual gang of comprehensively sanctioned states.
Still doesn't explain why the rest of the world isn't in the inclusions list. Maybe they don't want to deal with a language barrier by sticking to the Anglosphere... plus EU?
It'll likely be to do with financial responsibility, given where the funding comes from. They have an obligation to check that they are not sending funds to a terrorist group to solve code bounties, etc.
Actually, I'm more shocked that Québec residents are somehow eligible. Knowing how contest rules work, Québec is usually excluded because of its onerous rules. For one, the rules are not (also) written in French, which is a requirement for contests there.
As a resident of Japan, I thought the exact same thing. (I'm also a citizen of an EU country, which would permit me to participate, but most of my colleagues couldn't.)
> Our Rust-based rav1d decoder is currently about 5% slower than the C-based dav1d decoder (the exact amount differs a bit depending on the benchmark, input, and platform). This is enough of a difference to be a problem for potential adopters
I'm really surprised that a 5% performance degradation would lead people to choose C over Rust, especially for something like a video codec. I wonder if they really care or if this is one of those "we don't want to use Rust because of silly reasons, and here's a reasonable-sounding but actually irrelevant technical justification"...
Developers fight tooth and nail to get every bit of performance out of video codecs because this goes directly to battery life and heat on a scale of billions of devices. You can't handwave away a 5% performance drop as if this were some recipe app. People pore over microamp power analyzers and high-resolution thermographs because they "really care."
> I'm really surprised that a 5% performance degradation would lead people to choose C over Rust
I'm really surprised that because something is in Rust and not in C, it would lead people to ignore a 5% performance degradation.
Seriously... when you get something that's 5% faster especially in the video codec space, why would you dismiss it just because it's not in your favorite language... That does sound like a silly reason to dismiss a faster implementation.
> just because it's not in your favorite language.
Kind of a strawman argument, though. The question is: is the 5% difference (today) worth the memory safety guarantees? I.e., would you be OK with your browser using 5% more power displaying video if it meant you couldn't be hacked via a memory safety bug?
I can agree on the strawman, but the parent I responded to mentioned "silly reasons" for not choosing a Rust implementation over a C one. A 5% performance difference in that space is anything but a silly reason.
Also, glancing over the implementation of rav1d, it seems to have some C dependencies, but also unsafe code in some places. This, to me, makes banging the drum of memory safety - as is often done whenever a Rust option is discussed, for obvious reasons, since it's one of the main selling points of the language - a bit moot here.
You're saying pushing the memory safety improvements is moot because they have only reduced the unsafe code of the whole library to ten or so cases, each with the reason documented next to the block? (There are open PRs for reducing that further.) Not worth banging the drum of memory safety until they reach 100%? That's literally letting perfection get in the way of huge improvements.
I would take the hit. It's irrelevant. I personally am forced to work with security-placebo software that causes a 20x slowdown for basic things. Something that should take seconds takes minutes, and nobody is arguing about making it even 1% faster.
I may be wrong, but if you're one of the "big guys" doing video, then a 5% performance difference probably translates into millions of dollars in the CPU/GPU bill.
1. Desktop - If both implementations run the same but one is faster, you run the faster one to stop the decode spluttering on those borderline cases.
2. Embedded - Where resources are limited, you still go for the faster one, even if it might one day lead to a zero-day, because you've weighed up the risk: reducing the BOM is an instant win, and trying to factor in some unknown code element isn't.
3. Server - You accept media from unknown sources, so you are sandboxed anyway. Losing 5% of computing resources adds up to big $ over a year and at enough scale. At Youtube for example it could be millions of dollars a year of compute doing a decode and then re-encode.
Some other sources of resistance:
1. Energy - If you have software being used in many places over the world, that cost saving is significant in terms of energy usage.
2. Already used - If the C implementation is working without issue, there would be high resistance to spending engineering time putting a slower implementation in.
3. Already C/C++ - If you already have a codebase using the same language, why would you now include Rust into your codebase?
4. Bindings - Commonly used libraries use the C version and are slow to change. The default may remain the C version in the likes of ffmpeg.
> 3. Server - You accept media from unknown sources, so you are sandboxed anyway. Losing 5% of computing resources adds up to big $ over a year and at enough scale. At Youtube for example it could be millions of dollars a year of compute doing a decode and then re-encode.
I wish big tech had to pay for all the electron garbage they produce.
> I wonder if they really care or if this is one of those "we don't want to use Rust because of silly reasons, and here's a reasonable-sounding but actually irrelevant technical justification"...
I would have thought video decoders are specifically one of the few cases where performance really is important enough to trump language guaranteed security. They're widely deployed, and need to work in a variety of environments; everything from low power mobile devices to high-throughput cloud infrastructure. They also need to be low latency for live broadcast/streaming.
That's not to say security isn't a concern. It absolutely is, especially given the wide variety of deployment targets. However, video decoders aren't something that necessarily needs to continually evolve over time. If you prioritize secure coding practices and pair that with some formal/static analysis, then you ought to be able to squeeze out more performance than Rust. For example, Rust may be inserting bounds checks on repeated access, whereas a C program could potentially validate this sort of information just once up front and pass the "pre-validated" data structure around (maybe even across threads), "knowing" that it's valid data. Yes, there's a security risk involved, but it may be worth it.
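In Rust terms, the closest analogue I can think of (purely illustrative; the type, sizes, and method names are made up) is to validate an invariant once at construction and then lean on it for later accesses, optionally via `unsafe` on the hot path:

```rust
// Purely illustrative sketch of "validate once, then trust it".
struct Plane<'a> {
    data: &'a [u8],
    width: usize,
    height: usize,
}

impl<'a> Plane<'a> {
    /// One up-front check that the buffer really holds width * height bytes.
    fn new(data: &'a [u8], width: usize, height: usize) -> Option<Self> {
        if data.len() == width.checked_mul(height)? {
            Some(Self { data, width, height })
        } else {
            None
        }
    }

    /// Safe accessor: still carries a bounds check on every call.
    fn row(&self, y: usize) -> &[u8] {
        &self.data[y * self.width..(y + 1) * self.width]
    }

    /// "Pre-validated" accessor in the spirit of the C approach: the caller
    /// promises y < height, and the constructor's invariant covers the rest.
    unsafe fn row_unchecked(&self, y: usize) -> &[u8] {
        debug_assert!(y < self.height);
        // SAFETY: data.len() == width * height (checked in `new`) and
        // y < height (caller's contract), so the range is in bounds.
        unsafe { self.data.get_unchecked(y * self.width..(y + 1) * self.width) }
    }
}
```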
Not your parent, but video codecs are basically handling untrusted input from a user, and are therefore the sorts of programs that have a good justification for increased safety.
You're also right that performance is paramount. That's why it's non-trivial.
I think latency sensitive applications will usually prefer better performance and deal with safety issues, as opposed to better safety and deal with performance issues.
So I doubt it's any religious thing between c and Rust.
It also depends a lot on your environment. A desktop with spare power, decoding a video from an untrusted source? You probably want the safe version.
A farm of isolated machines doing batch transcoding jobs? Give me every single % improvement you can. They can get completely owned and still won't be able to access anything useful or even reach out to the network. A crash/violation will be registered and a new batch will get a clean slate anyway.
I think you're taking the term "safety" in Rust a bit too literally. It's got bounds checking in it, man. That's all. You can also write totally safe programs in C, or if you really want to be serious about it, write the program in F* and use formal verification like this crypto library does: https://github.com/hacl-star/hacl-star
It's got bounds checking, lifetimes, shared access checks, enforced synchronisation, serious typed enums (not enforced but still helpful), explicit overflow behaviour control, memory allocation management, etc. etc. to help with safety. Far from "that's all".
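To make a couple of those concrete (a toy example, nothing to do with rav1d's actual types): overflow behaviour can be chosen explicitly per operation, and a match over an enum must handle every variant or it won't compile.

```rust
// Toy example of explicit overflow control and exhaustive enum matching.
enum BlockSize {
    Bs8x8,
    Bs16x16,
}

fn block_width(bs: BlockSize) -> u32 {
    // Adding a new variant to BlockSize makes this a compile error until
    // the new case is handled here.
    match bs {
        BlockSize::Bs8x8 => 8,
        BlockSize::Bs16x16 => 16,
    }
}

fn clamp_sum(a: u8, b: u8) -> u8 {
    // Overflow behaviour is spelled out (saturate) rather than left implicit.
    a.saturating_add(b)
}
```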
> You can also write totally safe programs in C,
No, you can't, beyond trivial levels in practice. Even superhuman exceptions like DJB made mistakes, and statistically nobody here is even close to DJB.
> use formal verification like this crypto library does
"Beware of bugs in the above code; I have only proved it correct, not tried it." -D.Knuth
(That is, you can make mistakes in the spec too - see how many issues the verified seL4 had: https://github.com/seL4/seL4/blob/master/CHANGES.md)
This pays for at most a week of work. I doubt it is worth anyone's time unless they would do it for free anyway. Between the risk that someone else does it first and gets the reward, and the fact that if you are trying to make a living you need to spend time finding the next thing, it just isn't much.
If you can fund someone for at least 6 months of work, it becomes reasonable to work on these.
Edit: Looks like many people are not understanding how overhead works. Your take-home pay over a year is just over $100,000, since you end up with so much unpaid time looking for the next gig.
You didn't account for overhead. Your take-home pay from projects like this is around $120,000 - if you are any good, you can do much better elsewhere just by getting a full-time developer job in the Midwest. (The Bay or senior-level positions pay more.)
Sure, when you work you make a lot of money, but you end up needing to spend the vast majority of your time looking for the next gig, and that is all unpaid.
My first-hand Eastern European experience tells me that you should refresh your expectations. €50-60k is barely within the acceptable range for a mid-senior Rust developer. You'd have to throw in quite a few perks, like 100% remote work, to lure someone to work for this money.
[¹]: https://www.memorysafety.org/blog/rav1d-performance-optimiza...
[²]: https://www.memorysafety.org/blog/rav1d-performance-optimiza...