Hacker News
Beating C with one line of Brainfuck (kiwec.net)
567 points by pmontra on Nov 24, 2019 | hide | past | favorite | 86 comments



I love how this blog only has this single post and it's dutifully tagged #shitpost.

Thanks for sharing, I got a good chuckle out of it.


> and it's dutifully tagged #shitpost

This should be mandatory for all "beating X in Y lines of Z" format posts.


Well, we all have to start somewhere, I suppose. It looks like he's off to a good start, and perhaps we have another Coding Horror in the making. One can hope, right?


Generated C:

  #include <stdio.h>
  
  char mem[30000] = {  };
  
  #define OUTPUTS 0
  char outputs[OUTPUTS] = {  };
  
  int main() {
      char* ptr = mem + 0;
      fwrite(outputs, sizeof(char), OUTPUTS, stdout);
  
      ptr[2] = getchar();
      ++ptr[2];
      while (ptr[2]) {
          ptr[2] -= 11;
          while (ptr[2]) {
              ptr[2] -= 22;
              while (ptr[2]) {
                  ++ptr[3];
                  --ptr[1];
                  while (ptr[1]) {
                      ++*ptr;
                      ++ptr[1];
                  }
                  ptr[2] = 0;
              }
              ptr[2] = 0;
          }
          ptr[1] = 0;
          ptr[1] += ptr[3] * 1;
          ptr[3] = 0;
          ptr[2] = getchar();
          ++ptr[2];
      }
      ptr[1] = 0;
      ptr[3] += *ptr * 1;
      ptr[2] += *ptr * 1;
      *ptr = 0;
      *ptr += ptr[3] * 1;
      ptr[3] = 0;
      ++ptr[1];
      while (ptr[2]) {
          --ptr[1];
          while (ptr[2]) {
              ptr[3] += 10;
              while (ptr[2]) {
                  --ptr[2];
                  --ptr[3];
                  while (ptr[3]) {
                      ++ptr[4];
                      ptr += 3;
                  }
                  while (ptr[4]) {
                      ++ptr[4];
                      ptr[3] += ptr[4] * 1;
                      ptr[4] = 0;
                      ++ptr[5];
                      ptr += 3;
                  }
                  ptr -= 3;
              }
              ptr[3] = 8;
              ptr[2] += ptr[3] * 6;
              ptr[3] = 0;
              ptr[2] += ptr[4] * 1;
              ptr[4] = 0;
              ptr[3] += ptr[5] * 1;
              ptr[5] = 0;
              ++ptr;
          }
          ++ptr;
      }
      while (ptr[1]) {
          --ptr[1];
          ptr[3] += 8;
          ptr[2] += ptr[3] * 6;
          ptr[3] = 0;
          ptr += 2;
      }
      while (ptr[0]) {
          putchar(ptr[0]);
          *ptr = 0;
          --ptr;
      }
      ptr[1] += 10;
      putchar(ptr[1]);
  
      return 0;
  }


Note that:

    #define OUTPUTS 0
    char outputs[OUTPUTS] = {  };
is not valid C (or C++). Zero-length arrays are not a thing in either language.


  (base) jam@jam-XPS-8500:~/svn/src/music/mp3tomidi$ cc -c -Wall x.c
  (base) jam@jam-XPS-8500:~/svn/src/music/mp3tomidi$ cat x.c

  #define OUTPUTS 0
  char outputs[OUTPUTS] = {  };
Looks like you're wrong about that, at least for GNU C. It's not official C, but if it works and generates no warnings, it will get used too.

Long HN thread about this subject:

https://news.ycombinator.com/item?id=11674374

Zero length arrays are one of those things that C is renowned and notorious for: something that is useful at times but unpredictable because it is flirting with undefined behavior.

I usually compare C with a Formula 1 race car. It works well as long as you don't cut too many corners and when you do you end up plastered all over the road.


    [neilb@Miranda temp]$ gcc -Wall -Wextra -pedantic a.c
    a.c:2:6: warning: ISO C forbids zero-size array 'outputs' [-Wpedantic]
     char outputs[OUTPUTS] = {  };


Yes, the -pedantic is what will trigger that. See linked post. But that's not on by default (fortunately!).

ISO C is so restrictive ;)


> (fortunately!)

Unfortunately. All warnings and errors should be on by default. That they are not is one of the things about GCC that really pisses me off.


Not all warnings. Try clang -Weverything (that enables all warnings) sometime. Some warnings are even contradictory!


Still not getting the joke.

Yes, they should be on. In fact, GCC should not do this at all: "extensions" to a language that invite undefined behavior are not in line with solid software development practices. Neither is Brainfuck...


I feel obliged to point out that some of those extensions a) are pretty important for large-scale projects, and b) end up standardized later on anyway, e.g. variadic macros being added in C99.

Definitely a case to be made that the default behaviour should be what the standard says, with extensions as an opt-in. But experimental extensions also move the language forward.


Variadic macros are a special form of footgun. It is a footgun disguised as a massage device to trick you into pointing it at your foot. It also comes with an auto-trigger that will keep on shooting until the magazine is empty.

I've used them with great care and still got burned. But they are useful. The bigger issue is that they are work-arounds for things that really should have been in the language proper.


I guess what jacquesm was getting at was that instead of extending C, the world should have left C behind a long time ago and transitioned to other, safer languages.

These days there are excellent alternatives to C, but of course we also have vast codebases written in C that would be too big of an undertaking and too much of a risk to rewrite in a safer language.

There may not have been any sufficiently widespread alternative in the past in terms of developer mindshare, and sufficiently performant considering how limited the hardware was in terms of both speed and memory and storage. There is also the issue of portability and of embedded platforms, and that remains problematic to this day in some cases but in other cases the possibilities for using safer languages are already really good.

Even in cases where an alternative was known and suitable, people in the past had legacy codebases that they were working with too that they didn't want to risk rewriting in any other language. So they just kept adding to the code that they had.

The legacy code that we are sitting on now is so many lines that one might wonder: was it even justified that they didn't want to take on the risk of rewriting the code in the past, while there was still a chance to do it all in one go? Maybe, maybe not.

Either way, we can do nothing about the past, but if we keep growing the legacy codebases, the problem only gets bigger in the future.

We should not strive to rewrite it all at once, but we should use safer languages when we extend our codebases, and we should use safer languages when we start new projects.

Every line of code that we write today adds to the amount of code that will make up the legacy code of tomorrow.

Let's strive to make the legacy code of the future safer, so that our children may run their societies on safer software!


I'm somewhere in the middle. I love my C compiler. At the same time, I recognize the limitations of a language that is now well into its fifth decade.

The problem is that we have so much tooling, and of such high quality, that it is hard to switch to anything else. For myself it's mostly habit, existing libraries, and speed of compilation. The length of the edit-compile-test cycle is a very big factor in my productivity. Go would score high on that list, other languages not so much. The stuff I do is mostly for myself, so I don't particularly care about security. But in an environment where security is important, with my present-day knowledge of the pitfalls of C, it would likely be the last language I would pick, and that includes the recognition that there are far more ways to create security holes than just memory safety, something proponents of other languages sometimes overlook.

C is a very sharp knife; it allows you to cut your fingers off in a very smooth and painless way. You'll likely only realize it when you faint from blood loss. At the same time, in spite of all those shortcomings, it is still my usual tool of choice. Simple, no huge superstructure, no whole ecosystem and mountains of dependencies rammed down my throat when I don't need them.


> Simple, no huge superstructure, no whole eco system and mountains of dependencies rammed down my throat when I don't need them.

This is my main reason for using C too.


> in an environment where security is important with my present day knowledge of the pitfalls of C it would likely be the last language I would pick

Note, however, that even the languages promoted as "safer" and "better" typically end up using the C implementations of cryptographic routines (or escape their own "safe" rules exactly when producing such code, which then ends up being "just C in disguise"). I see some "wrappers", but in the end... it's still C (or assembly) that does the job. As far as I'm aware, nobody has managed to produce both the "safety" and everything else necessary to be the "best" solution for real-life use cases.


> or avoid their own "safe" rules exactly when producing such a code which then ends up being "just C in disguise"

The point with being able to do this though is that sometimes you really do need to use unsafe code, but you still get to isolate the unsafe parts of your codebase from the safe parts of it, and you do so in a way that is defined by the language itself rather than in an ad-hoc way.

The language and the compiler enforce safety for the rest of your codebase, which in most cases makes up the vast majority of it. For the unsafe parts, where they don't, you have a much more limited and clearly defined surface of code that you, and everyone else looking at the code, will know needs to be handled with extra care, and which can and should be audited extra thoroughly.


And in the specific case of cryptographically secure code, you may well need to be paying attention to extremely low level details like the number of instructions being executed on different branches, the states of various buffers and caches, etc. It may well be that it doesn't make sense to expose control over these things to the high level layer where it's irrelevant 99% of the time.


I bet that's almost entirely based on preferring the devil you know. Trying to write C without side channels is a ridiculously fragile affair. Many high level languages aren't suited to it, but that doesn't mean C is suited to it. Nobody should be using C for that kind of code either. Assembly maybe. The good answer is a language that can handle arrays safely and also make guarantees about what's actually executed.


>one of those things that C is renowned and notorious for: something that is useful at times but unpredictable because it is flirting with undefined behavior.

This describes all the stuff that makes C useful.


Isn't countering the G-force when cornering to keep your head straight the equivalent of lifting 25kgs with your neck?

C sounds like one hell of a ride.


What is it being used for? I can't figure out what that fwrite part does.


That fwrite writes the contents of outputs to stdout.

outputs probably contains all the output that doesn't depend on input (which this program doesn't have any of).


my eyes!


I wonder if Brainfuck wins by its hundredth of a second because he had already run a program with words.txt, so the file was already loaded into memory, saving just a slice of time since it was already in the cache.


Could also be that their version of wc is doing Unicode-aware counting by default; it could be a lot faster with LANG=C set in the environment.

https://unix.stackexchange.com/a/96580


For some reason, wc is even slower with LANG=C.

    cat words.txt | time wc -w
    13018291
    0.76 user 0.01 system 0:00.78 elapsed
That's on the default wc build of void linux, and not in a proper benchmark environment, but that's still strange.


I prefer setting LC_ALL=C to be sure, though LANG may be sufficient; running "LANG=C locale" confirms LC_CTYPE=C is set.


For a #shitpost, one need only observe the fastest run of program A being faster than the slowest run of program B, to declare A is faster than B.

With a sufficiently powerful static evaluator, you could embed the input into the program and get an even faster runtime.


> With a sufficiently powerful static evaluator, you could embed the input into the program and get an even faster runtime.

If you feel the need to cheat, you could do something like I used for dumb_cat [0] and dump the resulting executables into a cache somewhere.

[0] https://git.sr.ht/~shakna/dumb_cat


Wait... is that a Sourcehut user in the wild?


Of course. I love what Drew has built. I'm much happier (as a paying customer) with it as a platform than most of the others.


To avoid the caching bias, I think a fairer test condition would be to use multiple files and repeat the runs a number of times. The lowest average time would be the winner.


You can manually clear the cache:

`echo 3 | sudo tee /proc/sys/vm/drop_caches`


But the file is loaded in cat, which isn't timed


But the timed program still has to wait for cat to load the file. Both programs are launched (almost) simultaneously by the shell.


The article only listed user and sys time usage, so technically that doesn't matter.

That being said, in my experience those timings are not very accurate at this time scale; timing a program which takes just a few tenths of a second to execute may report halving or doubling of the user time between subsequent runs (and vice versa for sys time).


Is the 85MB test file available? I wasn't able to find it. I'd like to test this result against nerve [0], a brainfuck to x86_64 asm compiler I wrote (like funkicrab, it is also written in Rust).

[0]: https://github.com/JoshMcguigan/nerve


Same here.

bff [1] has no fancy assembly, not even a compiler, just some basic loop unrolling and jump precomputation, but I too wonder how it compares to wc.

[1] https://github.com/apankrat/bff/

-- edit --

Found words.txt with 466551 words, so replicating it 28 times gets it to the OP's test size. Bff is ~5x slower than 'wc -w' ... which is not unexpected, but still slightly disappointing.

https://github.com/dwyl/english-words/blob/master/words.txt?...


After editing my code to use EOF == 0, I get :

    19.55 user 21.49 system 0:41.13 elapsed
However, you're using 8-bit cells, so the resulting word count is incorrect. In case you're wondering, the file was generated with :

    python3 -m faker -o words.txt text -r [large number] -s '\n'


Given the humorous setup and experiment report, I would bet one second of my life that the guy ran the test with a few random files he had lying around, and mentioned only the one that just happened to be slightly faster in Brainfuck.

I think I got the joke, it had me chuckle anyway.


Hilarious. Hopefully this will end the endless flood of 'Beating C' posts.

https://www.google.com/search?q=site%3Anews.ycombinator.com+...


Beating a C brainfuck compiler in rust/d/etc in 3 2 1...


It seems we've beat the webserver serving this blog with one hug from HN.


Should have written the server in brainfuck, I guess.


There must be way more people reading HN than I would think.

Here we are 7 hours after post, overloading it.

Suppose "overloading" this blog is a mere 2 requests/s.

Suppose that an entire 10% of people reading HN click on this link, and on average click on it twice.

Further suppose that the average "active" HN user reads HN 5 times a week.

That means there are 1,210,000 active HN readers.


Thanks for the math!

I did several estimations of How-Many-of-Us-are-Here and I got ~1M users by various methods (eg number of upvotes using Reddit engagement numbers), so I'll take your number with a happy confirmation bias.


> Following on the recent “faster than wc” blogposts, I decided to end this fad once and for all, using the best language ever created : Brainfuck.

This "Go faster than C" post was a complete joke; one could very easily write a "C faster than C" post the exact same way, to prove the absurdity of an optimized implementation that does not do what the original program does.


This blog post actually does that too, since the Brainfuck code is transpiled to C.


Marvelous. Anybody got an optimizing malbolge compiler?


Not an optimising one (yet) but a safe implementation of a malbolge interpreter written in Rust, which can run malbolge programs. Just for fun.

https://github.com/return/malbolge


Has anyone used Rust to write an interpreter for INTERCAL yet?



Oh good. Though it's a shame that the test suite is a `test.py` file instead of `cargo test`; it would have been extremely satisfying to know that the Rust project would run the test suite for an INTERCAL compiler on a regular basis as part of Crater.


Wash your mouth out with FROMs :)


You didn't say PLEASE.


If you're going to "beat" C with another language, it will obviously be one that compiles to C.


> Since cells need to be 32-bit for counting more than 255 words, you’ll also need to replace the few occurrences of char to int.

Well, part of the fun of Brainfuck is working around the limit of 8-bit cell size. How about creating a version which implements 32-bit numbers by storing them in four cells?


File-cache and system-load jitter. That's it.


Totally missing the joke.


The HN crowd gets the joke, they just don't laugh about it. :-)


Right, especially with there being only a single run of each, inconclusive.


I want more articles like this.



85 MB for 0.01 s; on a terabyte you could easily get a cup of coffee or tea :)


I think that BF implementation uses 32-bit cells (which is not standard BF?)


Can someone summarize the article/link? I can't view it.


Brainfuck is faster than C. You should use it for all new code where performance is critical.


How about beating a whole C system with BrainCrap!


Will Rust be any faster? Like 'ripgrep'.



In first place, GG dude!


I love it!!!

It is a great parody of the latest trend of "my favorite XY language is faster than C/C++" posts; each time I see one of those, it sends shivers down my spine.

Your language, which was made in C/C++, can hardly be faster than the language it was written in. Whatever optimizations it has, you can still make them in C/C++ with enough knowledge, and you can probably optimize some more (staring at C++ template metaprogramming) or use __asm or execute opcodes directly (cheating :D). The only question here is how good the "compiler" can be at making optimal CPU instructions from "your" language.

Stop these evangelist wars. Your language can be great due to some other features (ease of use, knowledge needed to be proficient in it, forgiveness of mistakes, ...); you don't need to compare it to C/C++. It just doesn't make any sense.


"Your language, that was made in c/c++..."

What's funny is that C was initially written in B, and C smoked the living hell out of that language. As it turns out, the speed of compiled code and the running-time performance of compilation are two drastically different things.

If C++ were a runtime language, it would be slower than a checkout line at Walmart filled with grandmas trying to get in the last bit of Christmas shopping after all of them had just found out their kids are finally going to come out and bring their grandbabies to visit for the first time in years.

In other words, the way you measure speed and the way I measure speed are two different things. If you had to account for the time it took to grow food back in the old days when figuring out how long it takes you to make dinner, you could say that standing behind a line of grandmas is way quicker.

But both realities are not only mutually exclusive; if you were to ask me, one of those two situations is gratifying and the other is absolutely aggravating.

Waiting on C++ to compile feels like waiting behind a line of grandmas at Walmart just so I can get something to eat.


> Your language, that was made in c/c++, can hardly be faster than the language it was written in

What if the language you wrote includes a run-time that does JIT?


This would be "equivalent", but since a JIT has to process the code first (overhead) and the code here is just executing binary code directly, it will be faster. I am not talking here about readability, how much knowledge is behind it, etc.

  #include <unistd.h>

  char code[] = "\x01\xb8\x00\x00\xbb\x00\x00\x2a\x00\x00\x80\xcd\x00";

  int main(int argc, char *argv[]) {
      int (*func)();
      func = (int (*)()) code;
      (int)(*func)();
      return 0;
  }

This whole charade of how fast a language is, is nonsense. There are other metrics that are more important TODAY: the industry needs languages that are maintainable, can be used by a cheap workforce that is simple to find anywhere, and don't require knowing much about computers, memory, ... but rather about the problem. Why picking on C/C++, which doesn't solve any of those problems, is such a trendy topic today is beyond my understanding.


> code here is just executing binary code, it will be faster

No it won't. JIT-ed code can take into account dynamic information, and therefore outperform even the best general static compiler.


I wrote that (bogus) code in assembler and executed it using C; there is nothing compiled about it. There is no dynamic information, because there is nothing dynamic about it. There is nothing to outperform or optimize; it will run at max speed on the specific architecture. The compiler is only involved to do a 'call', same as with a JIT. But hand-optimized.

Let me repeat it: stop these stupid evangelist wars. They are nonsense. Who cares about C's speed when lasagna-with-spaghetti frameworks are run on a daily basis? Speed is irrelevant today for almost all cases, as no one is prepared to pay for it. And nobody cares about the cases that are left.


and the JIT is written in...


I am bemused by C and C++ fanatics who simply do not and apparently can not see beyond C and C++. Like it is impossible for society to surpass those languages, and that everyone will use them a million years from now...

We will obviously still have them for a while, but anyone that thinks we have reached maturity in an industry that is around 50 years old is just not thinking objectively.


The JIT compiler is written in C.

Its output is assembly code.

That assembly code can outperform code written in C, because it has access to dynamic run time information that your C compiler does not.


> Your language, that was made in c/c++, can hardly be faster than the language it was written in.

It doesn't work that way.


> Stop this evangelist wars, your language can be great due to some other features (ease of use, knowlidge needed to be proficient in it, forgiveness of mistakes,..), you dont need to compare it to c/c++. It just doesnt make any sense.

What if the purpose of the language is exactly to be faster than reasonably written C? Then a comparison makes sense.

Also, I have written a language that is "faster than C" in a certain naive sense, and it was not written in C. It does generate C, but not the kind of C sane humans like to write.

That does not mean those wc comparisons are indicative of real performance, though. They are just meant as provocative ways to get attention.


>Your language, that was made in c/c++, can hardly be faster than the language it was written in.

You're making the mistake of assuming the performance of a program is purely a product of the language. It is not difficult to write inefficient code in any language.


> The only question here is how good can the "compiler" be in making optimal cpu instructions from "your" language.

> Stop this evangelist wars.

I think if there is any need for a war at all, then it should be between compilers not languages.



