Well, we all have to start somewhere, I suppose. It looks like he's off to a good start, and perhaps we've got another coding horror in the making. One can hope, right?
Zero-length arrays are one of those things that C is renowned and notorious for: something that is useful at times but unpredictable, because it flirts with undefined behavior.
I usually compare C with a Formula 1 race car. It works well as long as you don't cut too many corners, and when you do, you end up plastered all over the road.
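For anyone who hasn't bumped into them, this is roughly the pattern in question; a quick sketch (struct names made up for illustration) contrasting the GNU zero-length-array trick with the C99 flexible array member that later standardized the same idea:

    #include <stdlib.h>
    #include <string.h>

    /* GNU extension: zero-length array as a variable-length trailer */
    struct packet_gnu {
        size_t len;
        unsigned char data[0];   /* not standard C */
    };

    /* The C99 way: flexible array member */
    struct packet_c99 {
        size_t len;
        unsigned char data[];    /* standard since C99 */
    };

    struct packet_c99 *make_packet(const unsigned char *src, size_t len)
    {
        /* one allocation holds both the header and the trailing bytes */
        struct packet_c99 *p = malloc(sizeof *p + len);
        if (!p)
            return NULL;
        p->len = len;
        memcpy(p->data, src, len);
        return p;
    }

The moment you index past what you actually allocated, you're over the cliff, which is the "flirting with undefined behavior" part.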
Yes, they should be on. In fact, GCC should not do this: "extensions" to a language that invite undefined behavior are not in line with solid software development practices. Neither is Brainfuck...
I feel obliged to point out that some of those extensions a) are pretty important for large-scale projects, and b) end up standardized later on anyway (e.g. variadic macros being added in C99).
Definitely a case to be made that the default behaviour should be what the standard says, with extensions as an opt-in. But experimental extensions also move the language forward.
Variadic macros are a special form of footgun. It is a footgun disguised as a massage device to trick you into pointing it at your foot. It also comes with an auto-trigger that will keep on shooting until the magazine is empty.
I've used them with great care and still got burned. But they are useful. The bigger issue is that they are work-arounds for things that really should have been in the language proper.
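The classic way they bite, from memory (a sketch, not from any particular codebase): the empty-argument case, which plain C99 doesn't handle and which GCC papers over with yet another extension.

    #include <stdio.h>

    /* C99 variadic macro: fine as long as something follows fmt */
    #define LOG(fmt, ...) fprintf(stderr, "[log] " fmt "\n", __VA_ARGS__)

    /* GNU extension: ## swallows the trailing comma when the list is empty */
    #define LOG_GNU(fmt, ...) fprintf(stderr, "[log] " fmt "\n", ##__VA_ARGS__)

    int main(void)
    {
        LOG("value = %d", 42);       /* fine */
        /* LOG("plain message"); */  /* leaves a trailing comma: not valid C99 */
        LOG_GNU("plain message");    /* works, but only thanks to the extension */
        return 0;
    }

And, continuing the pattern of extensions eventually getting blessed, I believe newer standards (C23's __VA_OPT__) finally address exactly this case.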
I guess what jacquesm was getting at was that instead of extending C, the world should have left C behind a long time ago and transitioned to other, safer languages.
These days there are excellent alternatives to C, but of course we also have vast codebases written in C that would be too big of an undertaking and too much of a risk to rewrite in a safer language.
In the past there may not have been any alternative that was sufficiently widespread in terms of developer mindshare, or sufficiently performant given how limited the hardware was in speed, memory, and storage. There is also the issue of portability and of embedded platforms; that remains problematic to this day in some cases, but in other cases the possibilities for using safer languages are already really good.
Even in cases where an alternative was known and suitable, people in the past were already working with legacy codebases of their own that they didn't want to risk rewriting in another language. So they just kept adding to the code that they had.
The legacy code that we are sitting on now is so many lines of code that one might wonder: was it even justified not to take on the risk of rewriting it back then, while there was still a chance to do it all in one go? Maybe, maybe not.
Either way, we can do nothing about the past, but if we keep growing the legacy codebases, the problem only gets bigger in the future.
We should not strive to rewrite it all at once, but we should use safer languages when we extend our codebases, and we should use safer languages when we start new projects.
Every line of code that we write today adds to the amount of code that will make up the legacy code of tomorrow.
Let's strive to make the legacy code of the future safer, so that our children may run their societies on safer software!
I'm somewhere in the middle. I love my C compiler. At the same time, I recognize the limitations of a language that is now well into its fifth decade.
The problem is that we have so much tooling, and of such high quality, that it is hard to switch to anything else. For me it's mostly habit, existing libraries, and speed of compilation; the length of the edit-compile-test cycle is a big factor in my productivity. Go would score high on that list, other languages not so much. The stuff I do is mostly for myself, so I don't particularly care about security, but in an environment where security is important with my present day knowledge of the pitfalls of C it would likely be the last language I would pick, and that includes recognizing that there are far more ways to create security holes than just memory safety, something proponents of other languages sometimes overlook.
C is a very sharp knife: it allows you to cut your fingers off in a very smooth and painless way, and you'll likely only realize it when you faint from blood loss. At the same time, in spite of all those shortcomings, it is still my usual tool of choice. Simple, no huge superstructure, no whole ecosystem and mountains of dependencies rammed down my throat when I don't need them.
> in an environment where security is important with my present day knowledge of the pitfalls of C it would likely be the last language I would pick
Note however that even those languages promoted as "safer" and "better" in the end typically use the C implementations of cryptographic routines (or avoid their own "safe" rules exactly when producing such code, which then ends up being "just C in disguise"). I see some "wrappers", but in the end... it's still C (or assembly) that does the job. As far as I'm aware, nobody has managed to deliver both the "safety" and everything else necessary to the point of being the "best" solution for real-life use cases.
> or avoid their own "safe" rules exactly when producing such code, which then ends up being "just C in disguise"
The point of being able to do this, though, is that sometimes you really do need to use unsafe code, but you still get to isolate the unsafe parts of your codebase from the safe parts, and you do so in a way that is defined by the language itself rather than in an ad-hoc way.
The language and the compiler enforce safety for the rest of your codebase, which in most cases makes up the vast majority of it. For the unsafe parts where they don't, you have a much more limited and clearly defined surface of code that you and everyone else looking at the code will know needs to be handled with extra care, and which can and should be audited extra thoroughly.
And in the specific case of cryptographically secure code, you may well need to be paying attention to extremely low level details like the number of instructions being executed on different branches, the states of various buffers and caches, etc. It may well be that it doesn't make sense to expose control over these things to the high level layer where it's irrelevant 99% of the time.
I bet that's almost entirely based on preferring the devil you know. Trying to write C without side channels is a ridiculously fragile affair. Many high level languages aren't suited to it, but that doesn't mean C is suited to it. Nobody should be using C for that kind of code either. Assembly maybe. The good answer is a language that can handle arrays safely and also make guarantees about what's actually executed.
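To make that concrete, here's the textbook shape of the problem (a sketch, not taken from any real crypto library):

    #include <stdio.h>
    #include <string.h>

    /* Early-exit comparison: how long it runs leaks how many leading
       bytes matched, which is exactly the kind of side channel at issue. */
    int compare_leaky(const unsigned char *a, const unsigned char *b, size_t n)
    {
        return memcmp(a, b, n) == 0;
    }

    /* Intended to be constant time: always touches every byte and only
       accumulates differences.  Even this relies on the compiler not
       "helpfully" turning it back into an early exit, which is exactly
       the fragility I mean. */
    int compare_ct(const unsigned char *a, const unsigned char *b, size_t n)
    {
        unsigned char diff = 0;
        for (size_t i = 0; i < n; i++)
            diff |= a[i] ^ b[i];
        return diff == 0;
    }

    int main(void)
    {
        unsigned char x[4] = {1, 2, 3, 4}, y[4] = {1, 2, 3, 5};
        printf("leaky: %d  constant-time: %d\n",
               compare_leaky(x, y, 4), compare_ct(x, y, 4));
        return 0;
    }

Both return the same answer; only one of them is safe to use on secrets, and nothing in the language tells you which.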
> one of those things that C is renowned and notorious for: something that is useful at times but unpredictable, because it flirts with undefined behavior.
I wonder if Brainfuck wins by its hundredth of a second because he had already run a program with words.txt, so the file was already loaded into memory, saving just a slice of time as it would already be in the cache.
To avoid the caching bias, I think a fairer test would be to use multiple files and run the comparison a number of times. The lowest average time would be the winner.
The article only listed user and sys time usage, so technically that doesn't matter.
That being said, in my experience those timings are not very accurate at this time scale; timing a program which takes just a few tenths of a second to execute may report the user time halving or doubling between subsequent runs (and vice versa for sys time).
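If I remember the mechanics right, those user/sys figures come out of the kernel's per-process CPU accounting (the same data getrusage exposes), and that accounting is coarse enough that sub-second runs bounce around. A trivial way to poke at it yourself (just a sketch):

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        /* burn a bit of CPU so there is something to account for */
        volatile unsigned long x = 0;
        for (unsigned long i = 0; i < 100000000UL; i++)
            x += i;

        struct rusage ru;
        if (getrusage(RUSAGE_SELF, &ru) == 0)
            printf("user: %ld.%06lds  sys: %ld.%06lds\n",
                   (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec,
                   (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec);
        return 0;
    }

Run it a few times back to back, especially with a smaller loop, and you can watch the reported numbers wobble.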
Is the 85MB test file available? I wasn't able to find it. I'd like to test this result against nerve [0], a brainfuck to x86_64 asm compiler I wrote (like funkicrab, it is also written in Rust).
Found words.txt with 466551 words, so replicating it 28 times gets it to the OP's test size. Bff is ~5x slower than 'wc -w' ... which is not unexpected, but still slightly disappointing.
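For reference, `wc -w` isn't doing anything magical; the baseline Bff is chasing is essentially this (a simplified sketch, not the coreutils source):

    #include <ctype.h>
    #include <stdio.h>

    /* count transitions from whitespace to non-whitespace on stdin,
       which is roughly what `wc -w` calls a word */
    int main(void)
    {
        int c, in_word = 0;
        long words = 0;
        while ((c = getchar()) != EOF) {
            if (isspace(c)) {
                in_word = 0;
            } else if (!in_word) {
                in_word = 1;
                words++;
            }
        }
        printf("%ld\n", words);
        return 0;
    }

So being within ~5x of that while going through Brainfuck is honestly not bad.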
Given the humorous setup and experiment report, I would bet one second of my life that the guy ran the test on a few random files he had lying around and mentioned only the one that happened to be slightly faster in Brainfuck.
I did several estimations of How-Many-of-Us-are-Here and got ~1M users by various methods (e.g. the number of upvotes combined with Reddit engagement numbers), so I'll take your number with a happy confirmation bias.
> Following on the recent “faster than wc” blogposts, I decided to end this fad once and for all, using the best language ever created : Brainfuck.
This "Go faster than C" post was a complete joke, one could very easily write a "C faster than C" the exact same way to prove the absurdity of an optimized implementation that does not do what the original program does.
Oh good. Though it's a shame that the test suite is a `test.py` file instead of `cargo test`; it would have been extremely satisfying to know that the Rust project would run the test suite for an INTERCAL compiler on a regular basis as part of Crater.
> Since cells need to be 32-bit for counting more than 255 words, you’ll also need to replace the few occurrences of char to int.
Well, part of the fun of Brainfuck is working around the limit of 8-bit cell size. How about creating a version which implements 32-bit numbers by storing them in four cells?
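Something like this, minus the actual Brainfuck, is the carry logic you'd end up spelling out cell by cell (a rough sketch in C; the function name and the little-endian layout are just my choices for illustration):

    #include <stdint.h>
    #include <stdio.h>

    /* increment a 32-bit counter stored as four 8-bit cells,
       least significant cell first */
    void inc32(uint8_t cell[4])
    {
        for (int i = 0; i < 4; i++) {
            cell[i]++;
            if (cell[i] != 0)   /* no wrap-around, so no carry */
                break;
            /* cell wrapped to 0: carry into the next cell */
        }
    }

    int main(void)
    {
        uint8_t counter[4] = {0, 0, 0, 0};
        for (long i = 0; i < 300; i++)
            inc32(counter);
        /* 300 = 0x012C, so the cells read 0x2C 0x01 0x00 0x00 */
        printf("%02x %02x %02x %02x\n",
               counter[0], counter[1], counter[2], counter[3]);
        return 0;
    }

In Brainfuck itself the "did it wrap to zero?" test becomes its own little dance of loops and temporary cells, which is exactly the fun part.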
It is a great parody of the latest trend of "my favorite XY language is faster than C/C++" posts; each time I see one of those, it sends shivers down my spine.
Your language, that was made in C/C++, can hardly be faster than the language it was written in. Whatever optimizations it has, you can still make them in C/C++ with enough knowledge, and you can probably optimize it some more (staring at C++ template metaprogramming) or use __asm or execute opcodes directly (cheating :D). The only question here is how good the "compiler" can be at turning "your" language into optimal CPU instructions.
Stop these evangelist wars. Your language can be great due to other features (ease of use, the knowledge needed to be proficient in it, forgiveness of mistakes, ...); you don't need to compare it to C/C++. It just doesn't make any sense.
What's funny is that C was initially written in B, and C smoked the living hell out of that language. As it turns out, the speed of the compiled code and the running time of the compilation are two drastically different things.
If C++ were a runtime language, it would be slower than a checkout line at Walmart filled with grandmas trying to get in the last bit of Christmas shopping after all of them had just found out their kids are finally going to come out and bring their grandbabies to visit for the first time after all these years.
In other words, the way you measure speed and the way I measure speed are two different things. If, when counting how long it takes you to make dinner, you had to account for the time it took to grow the food, the way people did in the old days, you could say that standing behind a line of grandmas is way quicker.
But those two realities are mutually exclusive, and if you were to ask me, one of them is gratifying and the other is absolutely aggravating.
Waiting on C++ to compile feels like waiting on a line of grandmas at Walmart just so I can get something to eat.
This would be "equivalent", but since a JIT has to process it first (overhead) and the code here just executes the binary code directly, it will be faster. I am not talking here about readability, how much knowledge is behind it, etc.
    /* 'code' is a buffer of raw machine code defined elsewhere;
       on modern systems it would also need to live in executable memory */
    extern unsigned char code[];

    int main(int argc, char *argv[])
    {
        int (*func)();
        func = (int (*)()) code;
        (int)(*func)();
        return 0;
    }
This whole charade about how fast a language is, is nonsense. There are other metrics that are more important TODAY: the industry needs languages that are maintainable and can be used by a cheap workforce that is easy to find anywhere and doesn't need to know much about computers, memory, ... but rather about the problem. Why picking on C/C++, which doesn't solve any of those problems, is such a trendy topic today is beyond my understanding.
I have written the (bogus) code in assembler and executed it using C; there is nothing compiled about it, and no dynamic information, because there is nothing dynamic about it. There is nothing to outperform or optimize: it will run at max speed on the specific architecture. The compiler is only involved to do a "call". Same as with a JIT, but hand-optimized.
Let me repeat it: stop these stupid evangelist wars. They are nonsense. Who cares about C's speed when lasagna-with-spaghetti frameworks run on a daily basis? Speed is irrelevant today in almost all cases, as no one is prepared to pay for it, and nobody cares about the cases that are left.
I am bemused by C and C++ fanatics who simply do not, and apparently cannot, see beyond C and C++. As if it were impossible for society to ever surpass those languages, and everyone will still be using them a million years from now...
We will obviously still have them for a while, but anyone that thinks we have reached maturity in an industry that is around 50 years old is just not thinking objectively.
> Stop these evangelist wars. Your language can be great due to other features (ease of use, the knowledge needed to be proficient in it, forgiveness of mistakes, ...); you don't need to compare it to C/C++. It just doesn't make any sense.
What if the purpose of the language is exactly to be faster than reasonably written C? Then a comparison makes sense.
Also, I have written a language that is "faster than C" in a certain naive sense, and it was not written in C. It does generate C, but not the kind of C sane humans like to write.
That does not mean those wc comparisons are indicative of real performance, though. They are just meant as provocative ways to get attention.
> Your language, that was made in C/C++, can hardly be faster than the language it was written in.
You're making the mistake of assuming that the performance of a program is purely a product of the language. It is not difficult to write inefficient code in any language.
Thanks for sharing, I got a good chuckle out of it.