Microsoft Build Accelerator – open-source build engine for large systems (github.com/microsoft)
154 points by shanselman on April 30, 2019 | hide | past | favorite | 41 comments



"BuildXL runs 30,000+ builds per day on monorepo codebases up to a half-terabyte in size with a half-million process executions per build... You may find our technology useful if you face similar issues of scale."

I know this wasn't supposed to be a humorous announcement but I couldn't help laughing out loud at that! Kudos to the managers at Microsoft who now seem to be asking "Why not?" instead of "Why should we?" when the topic of releasing code to the community is raised.


I'm a bit confused.

Are you critical of the size of the monorepo, or the number of builds?


I believe the joke is that very few companies are the size of Microsoft, and as such very few would find this useful.


You'd be surprised at the volume of code a smaller company can produce.

Former employer was a big C++ shop in finance. Of around 1000 employees, roughly 3/4 were developers. They definitely could take advantage of something like this. I don't know how many hundreds of millions of LOC they have between C++ and, later, C#, but I was responsible for around 3 million alone (largely generated). A full coordinated firm-wide rebuild could take weeks.


I'm always stunned by these sorts of stories. Was the opinion that the scale of code was low, high, or appropriate given the problems being tackled?


I've a similar story or three of finance companies that in just ten years produced enormous amounts of legacy code. It's really not that hard. Solaris was enormous too, with just 2k devs for all the time I was there, and about 30-40 years of history, depending on how you count it. If your 1k developers each write 10Kloc/year on average, then after a decade you can expect to have 100Mloc, but since a lot of code will be forked external open source (or even not forked, but just imported to freeze at a particular version, or for some other reason) you might find your devs building and looking after many tens of Mloc more than that. If you hire lots of 5x and 10x engineers, that too leads to a sizeable increment.

There are many many companies out there that have huge megarepos.


OK, but what's the byte size of 10Mloc, and how many process executions per build, since those were the metrics actually used? My experience is that lines of code don't actually take up that much space.


Depends largely on how the code is structured with C++.

There's the number of compilation units within a lib vs overall. Typically you can parallelize within a module, but not externally unless you have some smarts.

Edit: I use "module" in this sense to mean a producible result, not the future language concept of modules.
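To make the parallelization point above concrete, here's a toy Python sketch (all file and module names invented): translation units within one module have no build-order dependency on each other, so they can be compiled concurrently, while the link step is an inherent join point.

```python
# Toy model of C++ build parallelism within one module: independent
# translation units (TUs) "compile" in parallel; linking waits for all.
from concurrent.futures import ThreadPoolExecutor

def compile_tu(tu):
    # Stand-in for invoking the compiler on one .cpp file.
    return f"{tu}.o"

def build_module(name, tus, pool):
    # TUs don't depend on each other -> safe to run as a parallel map.
    objects = list(pool.map(compile_tu, tus))
    # Linking consumes every object file -> inherently sequential here.
    return f"lib{name}.a <- {objects}"

with ThreadPoolExecutor(max_workers=4) as pool:
    result = build_module("core", ["a.cpp", "b.cpp", "c.cpp"], pool)
```

Parallelizing *across* modules is where the "smarts" come in: you need the dependency graph between modules to know which ones can build concurrently.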


In my experience, it felt appropriate.

We (the developers) were tasked with enormous responsibilities. I was back office, but responsible for managing a client/server system for all non-security reference data. There were easily over 200 data objects modelled. No direct DB access was allowed except for the owning service. Although it ended up around 3M LOC, it wasn't as bad as it seems, because only about 10% was manually written code; a lot was generated C++. (An aside: I was able to write a C++ wrapper around a generic API that exposed some 300 types through a home-grown reflection-esque API, and the Python wrapper never had to be updated because it could use this reflection and runtime code generation to produce a strongly typed API at runtime. To my knowledge, the Python code has remained unchanged since about 2006, when I first wrote it, despite the underlying C++ API changing constantly - I get occasional updates from former coworkers.)

The big problem was the dependency management and scale. At least at the time I was there, neither were done well.

Scale was a problem because of tight coupling between libs. Upgrade a core lib? Everyone had to rebuild. Want to upgrade a 3rd-party dependency? Firm-wide rebuild that took a minimum of two weeks. It was a mess. We were supposed to be client/server to minimize dependencies, but we so tightly coupled our clients to our servers that we just exacerbated the problem. A few of us could handle multiple client versions with a single server, but most couldn't. Don't recommend.


I agree with the other commenters. You'd be surprised how much code companies you've never heard of produce. I used to work for Axway, an enterprise middleware vendor. They had at least 3 products that I knew about with multi-million-LOC codebases, in Java and C and various other languages.

I wouldn't be surprised if they had at least 50 million LOC in total. Actually, now that I think of it, 100 million was more likely. And that was almost 10 years ago...


I might simplify my Webpack builds with it.


I think the GP is more taking a jab at MS's history of being anti-open-source, which has done an about-face in the last 5 years or so.

I mean, under Ballmer you'd likely never have seen anything (or only trivial, non-monetizable things) open sourced, and likely no Linux support.

Now under Satya, we have MS open sourcing lots of projects and embracing Linux as a first-class citizen. Probably far from a complete list of Linux support, but off the top of my head I can name: VS Code, building for Linux via VS proper, CMake support, WSL, SQL Server on Linux, SQL Server ODBC drivers, .NET Core, and Linux support in Azure.

Even for projects that aren't completely open, such as WSL and conhost, they have GitHub projects at least for bug tracking that allow end users to interact directly with the teams responsible for those projects. Personally, I've filed several issues against WSL, gotten fairly quick responses, and resolutions usually appear in a few weeks' time (I'm in the Fast Insider ring at home).


I'm pretty impressed with WSL. I've just moved back to Windows (Dell XPS) after ~13 years using macOS, and in many ways WSL is nicer than what I used to do before, juggling Homebrew packages and Parallels VMs depending on the demands of a particular task. Between that, PowerShell, and installing stuff via Chocolatey, I feel right at home.

But without things like WSL and native OpenSSH, I doubt I would even have looked; I would have stuck with macOS forever or just gone native Ubuntu.


You've still got access to brew via linuxbrew.

https://docs.brew.sh/Homebrew-on-Linux

I'm also in the process of switching back from macOS to Windows thanks to WSL.


I personally prefer Scoop as a package manager; give it a try!


Yes, it's on my list to check out! I just haven't yet found something I wanted to install that wasn't available in Chocolatey. Scoop's forte appears to be CLI utilities, and I haven't gotten as far into that realm.


I think the comment isn't critical at all.


Microsoft has something like 30K devs. 30K builds a day seems really, really low.


Maybe they have multiple big repos, and that number is per repo?


Congratulations to the BuildXL team! Domino, incidentally, was/is the internal name for BuildXL; there are a few published papers where the system is described under that name. I had the privilege to see it gradually rolled out in the Office codebase over the course of a year or two. It was a massive improvement, and the lengths to which the team went to see it through are really beyond description here.

I _have_ to wonder why we didn't just pick up Bazel, which is Google's open-source distributed build engine for large systems, and which also happens to have been stable for years. Perhaps its Windows support wasn't up to snuff at the time, but it feels like that would have been easier to fix than to build a whole new build system.

Regardless, congrats again! So cool to see this out in the open.


> I _have_ to wonder why we didn't just pick up Bazel, which is Google's open-source distributed build engine for large systems, and which also happens to have been stable for years. Perhaps its Windows support wasn't up to snuff at the time, but it feels like that would have been easier to fix than to build a whole new build system.

Bazel was first announced and made open-source in early 2015, but by that time Domino had already been in development for at least a year. I remember following the early internal discussions at Microsoft on the One Engineering System or "Apex" initiative which first provided the impetus for what became Domino; some of those discussions were quite animated :P

Also, while I don't know BuildXL and Bazel well enough to compare their goals and relative strengths, I will say that BuildXL was developed at Microsoft, which has historically had a great diversity of build toolchains and workflows across all its software products. Just on the toolchain side, Microsoft has many different build engines/make tools, unit test frameworks, tools and conventions to register an automated test to run on a certain schedule (for products too big to run every test on each and every build), build and test server farms, continuous integration tools to orchestrate those server farms, tools to email people and/or file bugs on test failures, and so on...

From my limited understanding, Google has never been this way at all, and so some of the lessons and ideas incorporated into BuildXL may seem strange to Bazel's designers.


BuildXL is more comparable to Forge, Google's internal distributed compilation tool.


Windows support in Bazel is still far from perfect; to this day, you need MSYS2 for many things to work. Still, things have been moving forward quite a bit recently, I think in part due to Angular. Honestly, it's still really nice to use imo, but I can see why it would be hard for Microsoft to adopt.

(I work at Google and have contributed a very tiny bit to improving Bazel on Windows, though it is not related to my actual work at Google.)


You have to wonder though whether the effort spent building Build Accelerator would be better spent fixing Bazel bugs on Windows. Bazel really has no equal on the Linux side, and Google's internal version (+Forge and other systems) is far and away the best dev experience I've ever seen anywhere.


I do Maven for Java and JVM languages, and Gradle only because Google forces me to on Android; for better or worse, CMake has won for C++ and MSBuild for .NET languages.

I have yet to see what Bazel does better, beyond faster Android builds, which is quite easy given Gradle and how the Android team implements their plugins.


I just started working at a shop that uses Bazel for a large-ish Java backend codebase.

So far the key differences from maven, which I am most familiar with, are:

* The module format is much lighter, making it more practical to split a project into thousands of modules, which can then be built incrementally. This leads not just to faster build times, but finer-grained tracking of dependencies between modules

* The driving factor for us: Bazel has built-in support for specifying the sha256 hash of external dependencies. Verifying authenticity of external dependencies is not possible in maven without significant extra work. In Bazel it's built in.

* Much easier to set up Bazel targets to perform one-off tasks like seeding a local database through your build system. While you can do this with Maven, you usually had to resort to plugins, and the declarative style made it complex. Most Maven projects I worked on had external shell or Python scripts for these sorts of tasks.
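The dependency-pinning point above looks roughly like this WORKSPACE fragment (Starlark, Bazel's Python-like syntax); the dependency name, URL, and hash here are placeholders, not a real artifact:

```python
# WORKSPACE -- placeholder values throughout
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "com_example_lib",  # hypothetical external dependency
    urls = ["https://example.com/lib-1.2.3.tar.gz"],
    # Bazel rejects the download if the archive's SHA-256 doesn't match,
    # so a tampered or silently re-published artifact fails the build.
    sha256 = "0000000000000000000000000000000000000000000000000000000000000000",
    strip_prefix = "lib-1.2.3",
)
```

With Maven you'd typically need extra tooling (e.g. a checksum-enforcer plugin) to get the same guarantee.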

For me personally the biggest downside has been that it takes over enough of the build-system responsibilities in IntelliJ that other tooling I'm used to that integrates with the IDE, such as Chronon or JRebel, doesn't work.

Bazel has a steep enough learning curve and enough additional complexity that I probably wouldn't recommend it for most projects unless you are at a certain size and complexity level where you can benefit from it.


Bazel can also pull in Maven dependencies if someone is masochistic enough to need that.

>> Bazel has a steep enough learning curve

I'd argue for basic tasks it has far less of a learning curve than e.g. Gradle or, god forbid, something like CMake.


I love Bazel, but above all I hope open source can adopt Bazel more. It’s certainly much more usable for the case where you want to import external programs, versus systems like CMake, in my opinion.


Don't know if it's just me, but I looked at Bazel a few weeks ago. Insane levels of complexity for just building multiple NodeJS projects.

Anyone know of anything similar but without that steep learning curve?


Nix is similar to Bazel, supports distributed cache + build + CI (via Hydra), and has a huge amount of existing build support tools. It's not typically used as the only build system though; it's more like glue that wires all your build systems together deterministically. Build steps (derivations) are simple bash scripts executed within a sandbox.

The only catch is that while the language and tools are simple, Nix is really a pure functional language and you have to treat your build process like code rather than config. It's easy to go down rabbit holes...

(I considered Bazel but it's very half-baked and lacks all of Blaze's proprietary toolchain support.)


Cheers, will check it out


It's super simple to use for languages such as Python or C++.


Bazel has not "been stable for years"; perhaps you are thinking of Blaze? Bazel is full of brand-new code, and recent bugs (such as not releasing file handles) suggest it is not yet fully ready for mass adoption.


The why and how of this can be found here:

https://github.com/Microsoft/BuildXL/blob/master/Documentati...

Seems what drove this was build times of 90 hours for an end-to-end build of Office. From what I can gather, it has a means of capturing all read/write operations for a build step and placing them in a cache store to determine whether a change necessitates the rebuild of a component. Since it can hook in at a lower level, it isn't sensitive to timestamps for building. Actually quite interesting.
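The caching idea described above can be sketched in a few lines of Python. This is a toy illustration, not BuildXL's actual implementation: a step's fingerprint is derived from the content it reads, so touching a file without changing its bytes never triggers a rebuild.

```python
# Toy content-based build cache: a step reruns only when the hash of its
# observed inputs changes, independent of file timestamps.
import hashlib

cache = {}  # fingerprint -> cached output

def fingerprint(command, input_blobs):
    h = hashlib.sha256(command.encode())
    for blob in input_blobs:          # bytes of every file the step read
        h.update(hashlib.sha256(blob).digest())
    return h.hexdigest()

def run_step(command, input_blobs, execute):
    key = fingerprint(command, input_blobs)
    if key in cache:                  # identical content -> cache hit,
        return cache[key], True       # skip executing the step entirely
    out = execute()
    cache[key] = out
    return out, False

# Same command, same input bytes: the second run is a cache hit.
out1, hit1 = run_step("cc -c a.cpp", [b"int main(){}"], lambda: "a.o")
out2, hit2 = run_step("cc -c a.cpp", [b"int main(){}"], lambda: "a.o")
```

The hard part BuildXL solves on top of this is *observing* the real set of files a process reads and writes, via low-level I/O interception, rather than trusting a hand-declared dependency list.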


I interned with TSE in 2016 and had a blast. It is nice seeing one of their projects get open sourced.


“Its own internal scripting language, DScript, an experimental TypeScript based format used as an intermediate language by a small number of teams inside Microsoft”

Anyone else notice this?


Yes, but really is it any different to what every build system does by inventing its own DSL or making a new DSL in some other language?

Ant did it in XML, Premake uses Lua, CMake has its own, etc


Anyone know a bit more about DScript?


No Linux support, so who cares?


All of us that don't use Linux daily.


Anyone running build servers on Windows, or ReactOS.



