A Compiler Writing Journey (github.com/doctorwkt)
156 points by davikr on Nov 29, 2022 | 12 comments



This repo is great! Always nice to see an end-to-end compiler example for a real programming language.

Unfortunately, real programming languages are very complex, so writing compilers often gets bogged down in complexity, edge cases, and "mop up" work. Often it can be better to learn on a simpler language where you can afford more focus and more depth before things get too complex.

Most "write a compiler" tutorials end up focusing predominantly on having a working compiler rather than one that produces well-optimized code. I decided to write a blog series dedicated specifically to implementing compiler optimizations for a simpler programming language -- the one from last year's Advent of Code day 24.

It's called Compiler Adventures and every episode in the series describes and implements a new optimization, like no-op elimination (ep. 1), constant propagation (ep. 2), value numbering (ep. 3), value range analysis (ep. 4, coming out in a few days), etc.

Check it out here: https://predr.ag/blog/compiler-adventures-part1-no-op-instru...


Writing compilers is actually rather easy. The hard part is dealing with other programs. For example, last week I finally tracked a bug down to the documentation: its description of how the function ABI worked was simply wrong.


I love this series and I am very much looking forward to episode 4 (and more)!


nice idea, very nice


Always nice to see compiler tutorials that extend beyond the common "compile a calculator to a stack machine" genre.

I've recently restarted writing a compiler for fun and I've found that the hardest part is actually psychological. There's so much stuff I want to put into the compiler and language, so I often end up a little overwhelmed. When that happens it's important to just focus on the code and get something done.

It's also very easy to get myself into a rabbit hole, e.g. I'm using tree-sitter as my parser, but tree-sitter isn't designed to be used as a compiler front end, but hmm what if it was, maybe I should write something that automatically translates tree-sitter grammars to Rust AST definitions so that I don't have to manually convert a tree-sitter CST to an AST? Sure, I could go down that path, or I could bite the bullet and manually convert the tree.

Compilers are great exercises in scoping and prioritization because there's so damn many of these rabbit holes.


If someone were looking for an entry in the "compile a calculator to a stack machine" genre, what would you recommend?


Chapters 14 and 15 from Crafting Interpreters[1] should be a good start. The entire book is worth a read if you decide to move beyond calculators.

[1] https://craftinginterpreters.com/a-bytecode-virtual-machine....
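For a sense of how small the genre can be, here is a rough sketch of the whole idea in Python. It is not taken from the book. The opcode names (`PUSH`, `ADD`, `SUB`, `MUL`, `DIV`) and the function names are invented for this example, and it only handles non-negative integer literals, the four operators and parentheses.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number, operator and parenthesis tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def compile_expr(text):
    """Recursive-descent compiler: infix text in, stack-machine code out."""
    tokens, code, pos = tokenize(text), [], 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def primary():
        tok = advance()
        if tok == "(":
            expression()
            if advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok is not None and tok.isdigit():
            code.append(("PUSH", int(tok)))
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        primary()
        while peek() in ("*", "/"):
            op = advance()
            primary()
            code.append(("MUL",) if op == "*" else ("DIV",))

    def expression():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            code.append(("ADD",) if op == "+" else ("SUB",))

    expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return code


def run(code):
    """Execute the code on an operand stack and return the final value."""
    stack = []
    for ins in code:
        if ins[0] == "PUSH":
            stack.append(ins[1])
            continue
        right, left = stack.pop(), stack.pop()
        if ins[0] == "ADD":
            stack.append(left + right)
        elif ins[0] == "SUB":
            stack.append(left - right)
        elif ins[0] == "MUL":
            stack.append(left * right)
        else:  # DIV: integer division, chosen here only to keep results as ints
            stack.append(left // right)
    return stack.pop()
```

`compile_expr("1 + 2 * 3")` emits the pushes for 1, 2 and 3 followed by `MUL` and `ADD`, and `run` on that code returns 7. Operator precedence falls out of the `expression` / `term` / `primary` layering rather than from anything in the virtual machine.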


This is really impressive. Thank you for sharing.

Writing a language and compiler is a large project. I wanted to get something working fast so I started writing a switch-based interpreter. I even use plain strings for instructions and a list of HashMaps for instruction arguments. I figure I can turn it into a bytecode interpreter at a later date.

The codegen for my higher-level language targets the interpreter I wrote, not amd64 or ARM.

My interpreter is multithreaded and is integrated with a simple message passing actor API where each thread can have mailboxes to send data or jump instructions.

This allows threads to send messages to other threads and ask them to do things. I implemented this behaviour with "send", "sendcode", "receive" and "receivecode" instructions that either send a message or check a mailbox for one and then jump to a label or PC defined by the message. This is pseudo actors. I also have a userspace lightweight scheduler that multiplexes N lightweight threads over M kernel threads, and if I combined the two I could have a Golang-style runtime with true actors.
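Roughly, the shape of it looks like the sketch below. This is a simplified Python illustration, not the code in my repo: the exact semantics of each instruction, the `log` and `halt` instructions, the argument keys (`to`, `value`, `label`) and the round-robin loop standing in for real threads are all assumptions made to keep the example small and deterministic.

```python
from collections import deque


class Thread:
    """One lightweight thread: string instructions plus a dict of args each."""

    def __init__(self, program, args, labels=None):
        self.program = program        # e.g. ["send", "halt"]
        self.args = args              # one dict per instruction
        self.labels = labels or {}    # label name -> program counter
        self.pc = 0
        self.mailbox = deque()
        self.log = []


def step(thread, threads):
    """Run one instruction; return False once the thread has finished."""
    if thread.pc >= len(thread.program):
        return False
    op, arg = thread.program[thread.pc], thread.args[thread.pc]
    next_pc = thread.pc + 1
    if op == "send":
        threads[arg["to"]].mailbox.append(("data", arg["value"]))
    elif op == "sendcode":
        threads[arg["to"]].mailbox.append(("code", arg["label"]))
    elif op == "receive":
        if thread.mailbox and thread.mailbox[0][0] == "data":
            thread.log.append(thread.mailbox.popleft()[1])
        else:
            next_pc = thread.pc       # nothing yet: retry on the next turn
    elif op == "receivecode":
        if thread.mailbox and thread.mailbox[0][0] == "code":
            next_pc = thread.labels[thread.mailbox.popleft()[1]]
        else:
            next_pc = thread.pc       # nothing yet: retry on the next turn
    elif op == "log":
        thread.log.append(arg["value"])
    elif op == "halt":
        next_pc = len(thread.program)
    else:
        raise ValueError(f"unknown instruction {op!r}")
    thread.pc = next_pc
    return True


def run(threads, max_steps=1000):
    """Round-robin scheduler standing in for real kernel threads."""
    for _ in range(max_steps):
        progressed = [step(t, threads) for t in threads.values()]
        if not any(progressed):
            return
    raise RuntimeError("step budget exhausted (possible deadlock)")


def demo():
    sender = Thread(
        ["send", "sendcode", "halt"],
        [{"to": "b", "value": 42}, {"to": "b", "label": "done"}, {}],
    )
    receiver = Thread(
        ["receive", "receivecode", "log", "halt", "log", "halt"],
        [{}, {}, {"value": "not reached"}, {}, {"value": "jumped"}, {}],
        labels={"done": 4},
    )
    run({"a": sender, "b": receiver})
    return receiver.log
```

In `demo()`, thread `b` first receives the value 42, then receives the label `done` and jumps over the "not reached" instruction, so its log ends up as `[42, "jumped"]`.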

Does anybody know if any compilers are embeddable?

One of my ideas is to borrow the JIT compiler idea from HolyC and keep the AST in memory and referenceable by every expression. Then you can write code to generate or modify AST nodes.

Can GCC be run in a server mode without writing a C file?

I was thinking of writing a transpiler to C that compiles and executes at runtime. How do you load an .o file for execution? You need a linker. How do you begin executing an .o file? I think you can load the data into a buffer and cast the buffer to a void function pointer and execute it. This is a pseudo JIT compiler.

https://GitHub.com/samsquire/multiversion-concurrency-contro...
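On the .o question: a raw object file generally can't just be read into a buffer and called, because its relocations are still unresolved and an ordinary data buffer is usually not mapped as executable. One way around writing a loader is to let the system compiler produce a shared library and have the dynamic loader do that work. A minimal sketch of that route, in Python with `ctypes`, assuming a Unix-like system with a `cc` on the PATH that accepts `-shared -fPIC`; the helper name `compile_and_load` and the file names are made up for this example:

```python
import ctypes
import os
import shutil
import subprocess
import tempfile


def compile_and_load(c_source, compiler="cc"):
    """Compile C source to a shared library and load it.

    Returns None when no usable compiler is available (a sketch-level
    shortcut; a real tool would report the error instead).
    """
    if shutil.which(compiler) is None:
        return None
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "generated.c")
    lib = os.path.join(workdir, "generated.so")
    with open(src, "w") as handle:
        handle.write(c_source)
    try:
        # The compiler driver also runs the linker, which resolves relocations.
        subprocess.run(
            [compiler, "-shared", "-fPIC", "-o", lib, src],
            check=True,
            capture_output=True,
        )
        # The dynamic loader maps the code as executable for us.
        return ctypes.CDLL(lib)
    except (OSError, subprocess.CalledProcessError):
        return None


def call_generated_add(a, b):
    """Generate a tiny C function at runtime and call it; None if unavailable."""
    lib = compile_and_load("int add(int a, int b) { return a + b; }")
    if lib is None:
        return None
    lib.add.argtypes = [ctypes.c_int, ctypes.c_int]
    lib.add.restype = ctypes.c_int
    return lib.add(a, b)
```

This still goes through a temporary C file and a compiler process, so it is closer to "compile at runtime" than to a real JIT.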


> Does anybody know if any compilers are embeddable?¶ One of my ideas is to borrow the JIT compiler idea from HolyC and keep the AST in memory and referenceable by every expression. Then you can write code to generate or modify AST nodes.¶ Can GCC be run in a server mode without writing a C file?¶ I was thinking of writing a transpile to C that compiles and executes at runtime.

Sounds like you want TCC:

> Compile and execute C source directly. No linking or assembly necessary.

<http://www.tinycc.org/>

> TCCBOOT, a hack where TCC loads and boots a Linux kernel from source in about 10 seconds. That is to say, it is a "boot loader" that reads Linux kernel source code from disk, writes executable instructions to memory, and begins running it.

<https://en.wikipedia.org/wiki/Tiny_C_Compiler>


How about using the Tiny C Compiler [1]? Personally I use it within my Vim editor with a mapped key to take advantage of the `-run` flag, and you can also use it for scripts with a `#!` line if you like.

[1] https://repo.or.cz/w/tinycc.git


Great job!

<Me-Too-Ing>

I had a similar silly lockdown project (basically learning Rust by turning Wirth's "Compiler Construction" book into actual code).

I thought about documenting it too, but in the end I'm so far from having anything interesting, my code is probably un-idiomatic as f..., and I'm stuck at "nested else".

Still, the little time I find to tinker with it once in a while feels like... Fun.

I don't know if there is a place to get Rust code reviewed (I expect "roasted" in my case...).

I would love to know how far such "toy compilers from academic languages" are from "adult swim" compilers (not necessarily clang-level, but things like jai/odin/zig which seem to be "manageable" for a small team).

</Me-Too-Ing>


Oh hey, going to have to dive into this, as compilers are a particular fascination of mine. I've been toying with the pre-steps to building a non-programming language compiler (currently just trying to make a DSL in F# to know what I do and do not want in the real compiled language, so I can figure things out more quickly without accumulating unnecessary tokens/syntax/etc.).



