I have a tube of IDT R3041s and R3051s. I remember using GCC compiled for a DECStation 2000 to write code for them. (I made a hand-held computer based on R3041).
I also have a tube of ARM 610s (VY86C060s I think) from the same project, but R3041 was PLCC, whereas ARM was fine pitch PQFP. PLCC was easier to deal with at the time...
In the sense that it's part of the history of the decline and fall of MIPS. In terms of performance, software availability, desirability or anything else, well not so much :-)
In 2010 MIPS wanted $2 million from Berkeley to allow them to use the MIPS instruction encodings for processor cores that Berkeley would design entirely themselves. So they made up their own encodings instead.
The rest is history.
In many ways modern MIPS and RISC-V are pretty much just different binary encodings of the same ideas.
[this was already posted as a comment in a thread, but on reflection it probably deserves its own]
> In 2010 MIPS wanted $2 million from Berkeley to allow them to use the MIPS instruction encodings
Do you have a source for that? I see that you're a RISC-V expert, and I know MIPS, Inc. is notorious for patent lawsuits, so I trust you. I'm just curious about the Berkeley project.
Patterson always smirks when people ask him what the "V" is for, and mumbles an uncharacteristically vague reply (something about it being the fifth chip project he's worked on or something like that...)
It's no coincidence that MIPS used roman numerals for its architectures, MIPS-I, MIPS-II, MIPS-III, MIPS-IV, and MIPS-V.
So instead of using MIPS-V Berkeley created RISC-V.
You aren't going to get anybody who was involved to say this on the record -- that they changed just barely enough to evade the licensing (instruction encoding) and trademark ("MIPS-V") problems. Openly admitting in print that it was a "minimum noninfringing change" simply invites lawsuits claiming that they were one hair's width on the wrong side of that line.
I think you understate the level of IP vetting RISC-V has undergone. All but 6 instructions in the RV32G instruction set could be found in implementations that were at least 20 years old; the remaining six were novel [1].
RISC-V owes MIPS about as much as MIPS owes Patterson. MIPS-V was already ~15 years old when RISC-V was first used in the classroom. RISC-V purposefully pared back the layers of marketing cruft and failed experiments that had grown over Patterson et al.'s original RISC architecture.
There is less confidence in the extensions, but as RISC-V turns 10 ... patent trolls had better hurry if they want to extort anyone based on the ISA alone.
Part of joining RISC-V International is signing a document irrevocably certifying that RISC-V doesn't infringe any of your patents, so that door is definitely closed for MIPS now anyway.
It is a little weird that, when proposing instructions for RISC-V extensions, there is an actual preference for demonstrating that the instruction was patented but the patent has expired, or at least that the instruction was publicly documented in some ISA a couple of decades ago.
By and large, RISC-V is not trying to be novel, but to bring together and simplify established best practice.
I did invent what is believed to be an entirely novel instruction for the RISC-V B extension: GORC. The other instruction GORC shares circuitry with, GREV, is also believed to not have been implemented in an ISA before, though it has been proposed in the literature.
There have been "minimum noninfringing change" MIPS clones, by doing something like leaving out the patented unaligned load and store instructions, but otherwise being identical and compatible with MIPS software and compilers etc.
RISC-V is completely different and incompatible with MIPS at the binary level. The opcodes are all different. The opcode and register fields are built from opposite ends of the word. The conditional branching model is different (MIPS r6 later copied RISC-V's version). The sizes of immediate and offset values are only 12 bits vs 16 in MIPS – which is a major reason for RISC-V having a lot more encoding space free for future instructions. There are of course no branch or load delay slots in RISC-V – something that MIPS again copied in r6.
There is a certain flavour that is similar, but the details are utterly different.
Krste (or maybe Dave?) said so in a video somewhere not that long ago -- I assume in the RISC-V International channel on youtube but I don't recall which one.
Cheers! Dylan will yet have its day :-) I was actually surprised how much positive response it got when I posted on /r/riscv about OpenDylan having been ported to RISC-V (no thanks to me :-( )
This is more or less analogous to Blackberry moving to Android, isn’t it? Storied, old-guard tech company loses most of its market share, trades in its first-party stack for a rising open-source alternative.
Is MIPS still a big enough name to make this much of a coup for RISC-V? Or is this the last-ditch effort of a fallen star of the semi market?
One big difference is most consumers, and I think even many companies using the chips, don't really care what architecture they are using, unlike with an end-user OS/"ecosystem". So if they can use their abilities and experience from MIPS to make RISC-V chips with good price to performance, they could do OK.
> Is MIPS still a big enough name to make this much of a coup for RISC-V? Or is this the last ditch effort of a fallen star of the semi market?
MIPS has seemingly been on life support since the late 90's. I kind of think the SGI buyout and later spin-off doomed them, as they were focused on building high performance workstation processors while Arm was busy focusing on low power and embedded systems. Guess who was better prepared for the mobile revolution of the 00's?
I imagine MIPS (via Atheros, Broadcom, etc) was broadly deployed in things like home routers because it was royalty free, power efficient, and already had Linux kernel mainline support. Though probably losing share to ARM now.
Indeed, Asus moved from MIPS to ARM between the RT-AC66U and RT-AC68U, and the various *pkg repos have since dropped support. MIPS may as well be dead.
Powered all SGI workstations up to the end (excluding the Windows NT adventure SGI had), and powered both the Nintendo 64 and the PlayStation (1). MIPS was quite hot back then.
Home routers used to be one of the last holdouts of MIPS, but all the modern ones have been switching to ARM. It's pretty much on its last legs there aside from the really cheap, low-end stuff.
It is not that way today, but for a long time the entry-level Cisco IOS router was PowerPC based, while the high end was invariably MIPS. On the other hand, both architectural choices had nothing to do with the architecture and everything to do with the choice of usable SoCs for that particular application (with Motorola/Freescale/NXP's "m68k Cisco router/Sun2 on a chip" SoCs being somewhat ironic in this regard).
I have a few of them blinking next to me. They are fine albeit documentation is severely lacking at the moment, but once you realize how cheap they are compared to the alternatives, new possibilities open up.
Interestingly it's based on a 3 stage core from SiFive, not the RocketChip.
Most of the newer-generation, upcoming replacements for lower-cost SKUs, including NAS, are going to ARM.
At one point there were a few more MIPS and even OpenPOWER solutions in NAS and router products, but in the end having everything on ARM is just so much easier. A little sad to see it go, especially since good old Broadcom MIPS tended to be exceptionally stable.
Ho, ho. Graphene has left the lab, but it's only got as far as the hype mill, not the factory. https://www.manchester.ac.uk/discover/news/the-university-un... was a notable success though; and allegedly graphene-powered trainers there on a different date.
In contrast, with RISC-V you can put your hand in your pocket for early models, can't you?
2021 to me really marked a turning point for RISC-V embedded cores. $4 (+ S/H) for a BL602-based pinecone (or any of the many other BL602 boards). Pretty nice.
> Is MIPS still a big enough name to make this much of a coup for RISC-V?
No, not at this point. RISC-V already has plenty of momentum without this. The only thing that might change that analysis is if MIPS has some architecture patents that they could leverage to get some kind of RISC-V performance advantage. But I doubt they have anything like that now. And it's not like they have a stable of CPU designers that are now going to be switched from working on MIPS to working on RISC-V - they likely haven't had any of those folks working there since the 90s.
> Is MIPS still a big enough name to make this much of a coup for RISC-V? Or is this the last-ditch effort of a fallen star of the semi market?
Not really, I think the only thing they have going for them is that they probably have IP cores that would port easily and have history. However, places like SiFive [1] appear to have early mover advantage and are likely to be quicker to gain critical mass.
There's a lot to like about MIPS. It's a perfectly usable RISC architecture that:
- is easy to implement
- is supported by Debian, gcc, etc..
- is virtualizable
- scales from embedded systems (e.g. compressed MIPS16 ISA) up to huge shared-memory multiprocessor systems with hundreds of CPUs
Like RISC-V, MIPS traces its lineage to the dawn of the RISC revolution in the 1980s, though on the Hennessy/Stanford side rather than the Patterson/Berkeley side.
There is no one, single MIPS ISA. They've been through a number of incompatible changes over the years. MIPS r6 added some things more like RISC-V. NanoMIPS looks even more like RISC-V, though with its own twist (and with some 48 bit instructions, which RISC-V doesn't have yet).
In many ways modern MIPS and RISC-V are pretty much just different binary encodings of the same ideas.
In 2010 MIPS wanted $2 million from Berkeley to allow them to use the MIPS instruction encodings for processor cores that Berkeley would design entirely themselves. So they made up their own encodings instead.
It's funny how this story repeats itself over and over again, yet companies never seem to learn.
The flipside of this is when companies like Apple, Microsoft and Adobe, like it or not, become part of the de facto tech learning arc for college students, who one day grow up to be senior engineers, lead engineers, architects and principal engineers.
I don't know if MIPS is the same, but I worked on another architecture where NOP is 0x0, and it had an interesting effect. If you called an uninitialized function pointer, and it happened to point into zeroed-out memory, the CPU would execute NOPs for a good while until it hit something else. If that something else was code, it would start executing some function from the start, but with garbage arguments. It would often get quite far in, and several function calls down, before something crashed. Made for fun stack traces and interesting debugging :-)
Everyone's right to celebrate the success of RISC-V, but part of me thinks it's a shame that there's relatively little architectural diversity (edit: I should have said ISA diversity) in modern CPUs. MIPS, Alpha, and SuperH have all but faded away. Power/PowerPC is still out there somewhere though. Apparently they're still working on SPARC, too. [0]
At least we'll always have the PS2. ...until the last one breaks, I guess.
I wish the barriers to using new architectures were lower.
For instance, suppose binaries were typically distributed in a platform-agnostic format, like LLVM intermediate representation or something equivalent. When you run your program the first time, it's compiled to native code for your architecture and cached for later use.
I realize I've sort of just re-invented Javascript. But what if we just did away with native binaries entirely, except as ephemeral objects that get cached and then thrown away and regenerated when needed? It seems like this would solve a lot of problems. You could deprecate CPU instructions without worrying about breaking backwards compatibility. If a particular instruction has security or data integrity issues, just patch the compiler not to emit that instruction. As new side-channel speculation vulnerabilities are discovered, we can add compiler workarounds whenever possible. If you're a CPU architect and want to add a new instruction for a particular weird use-case, you just have to add it to your design and patch the compiler, and everyone can start using your new instruction right away, even on old software. You'd be able to trust that your old software would at least be compatible with future instruction architectures. Processors would be able to compete directly with each other without regard to vendor-lock-in.
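For a rough feel of what the compile-and-cache step could look like, here's a minimal, hypothetical Rust sketch: it assumes a single-file program shipped as LLVM bitcode (`app.bc`) and that `clang` on the target machine can lower that bitcode to a native executable, which then gets cached by content hash. No real distribution works exactly like this; it's only meant to illustrate the compile-once-per-architecture idea.

    // Hypothetical sketch only: compile platform-agnostic bitcode to native
    // code on first run, then reuse the cached native binary afterwards.
    // Assumes `clang` is on PATH and can lower LLVM bitcode (.bc) directly.
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};
    use std::path::PathBuf;
    use std::process::Command;

    fn native_binary_for(bitcode_path: &str) -> std::io::Result<PathBuf> {
        let bitcode = std::fs::read(bitcode_path)?;

        // Key the cache on the bitcode contents, so a changed program recompiles.
        let mut h = DefaultHasher::new();
        bitcode.hash(&mut h);
        let cached = std::env::temp_dir().join(format!("app-native-{:016x}", h.finish()));

        if !cached.exists() {
            // First run on this machine/architecture: lower the IR to native code.
            let status = Command::new("clang")
                .args([bitcode_path, "-O2", "-o"])
                .arg(&cached)
                .status()?;
            assert!(status.success(), "bitcode -> native compilation failed");
        }
        Ok(cached)
    }

    fn main() -> std::io::Result<()> {
        let exe = native_binary_for("app.bc")?;
        Command::new(exe).status()?; // later runs skip straight to this line
        Ok(())
    }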
That's how IBM implemented the AS/400 platform. Everything compiled down to a processor-agnostic bytecode that was the "binary" format. That IR was translated to native code for the underlying processor architecture as the final step. And objects contained both the IR and the native code. If you moved a binary to another host CPU, it would be retranslated and run automatically. The migration to POWER as the underlying processor was almost entirely transparent to the user and programming environment.
> That's how IBM implemented the AS/400 platform. Everything compiled down to a processor-agnostic bytecode that was the "binary" format
Originally, AS/400 used its own bytecode called MI (or TIMI or OMI). A descendant of the System/38's bytecode. That was compiled to CISC IMPI machine code, and then after the RISC transition to POWER instructions.
However, around the same time as the CISC-to-RISC transition, IBM introduced a new virtual execution environment – ILE (Integrated Language Environment). The original virtual execution environment was called OPM (Original Program Model). ILE came with a new bytecode, W-code aka NMI. While IBM publicly documented the original OPM bytecode, the new W-code bytecode is only available under NDA. OPM programs have their OMI bytecode translated internally to NMI which then in turn is translated to POWER instructions.
The interesting thing about this, is while OMI was originally invented for the System/38, W-code has a quite different heritage. W-code is actually the intermediate representation used by IBM's compilers (VisualAge, XL C, etc). It is fundamentally the same as what IBM compilers use on other platforms such as AIX or Linux, and already existed on AIX before it was ever used on OS/400. There are some OS/400-specific extensions, and it plays a quite more central architectural role in OS/400 than in AIX. But W-code is conceptually equivalent to LLVM IR/bitcode. So here we may see something in common with what Apple does with asking for LLVM bitcode uploads for the App Store.
> And objects contained both the IR and the native code. If you moved a binary to another host CPU, it would be retranslated and run automatically
Not always true. The object contains two sections – the MI bytecode and the native machine code. It is possible to remove the MI bytecode section (that's called removing "observability") leaving only the native machine code section. If you do that, you lose the ability to migrate the software to a new architecture, unless you recompile from source. I think, most people kept observability intact for in-house software, but it was commonly removed in software shipped by IBM and ISVs.
App uploads to the iOS App Store in LLVM's own Bitcode format are a distant echo of the CPU-ISA-agnostic IR approach IBM employed at the time. Bitcode is transpiled down to the underlying CPU instructions via static binary translation and optimisation, and translation from Bitcode -> x86 or Bitcode -> ARM has been possible for some time: https://www.highcaffeinecontent.com/blog/20190518-Translatin...
Rosetta 2 AOT, whilst not being exactly the same thing as the ISA-agnostic IR solution, is another example of static binary translation. Theoretically, Apple could start requiring OS X app submissions to the app store in the Bitcode format as well, so they could be transpiled and optimised at app download time and perform efficiently on M3, M4, M5, etc. CPUs in the future. However, with their habit of obsoleting certain things fast, it is not clear whether they will choose to go down the Bitcode path for OS X apps.
This is really fascinating. Is there a reason why we haven't seen more of this approach? It seems like it was pretty successful for IBM. Is there a practical reason that would prevent an open source project from building something similar?
> For instance, suppose binaries were typically distributed in a platform-agnostic format, like LLVM intermediate representation or something equivalent.
The Mill does something like this, but only for their own chips. "Binaries" are bitcode that's not specialized to any particular Mill CPU, and get run through the "specializer" which knows the real belt width and other properties to make a final CPU-specific version.
> I realize I've sort of just re-invented Javascript.
Or one of several bytecodes that get JIT or AOT compiled.
WASM in particular has my interest these days, thanks to native browser support and being relatively lean and more friendly towards "native" code, whereas JVM and CLR are fairly heavyweight, and their bytecodes assume you're going to be using a garbage collector (something that e.g. wasmtime manages to avoid.)
Non-web use cases of WASM in practice seem more focused on isolation, sandboxing, and security rather than architecture independence - stuff like "edge computing" - and I haven't read about anyone using it for AOT compilation. But perhaps it has some potential there too?
I think the short answer is that the performance penalty is so significant that it doesn't make sense to use WASM unless you're running untrusted code.
Ah yes, it even rates a mention on that Wikipedia page. Thank you for pointing this out. 1966! Funny how all these supposedly 'new' concepts really go back all the way to the beginning.
Come to Nix and Nixpkgs, where we can cross compile most things in myriad ways. I think the barriers to new hardware ISAs on the software side have never been lower.
Even if we get an ARM RISC-V monoculture, at least we are getting diverse co-processors again, which present the same portability challenges/opportunities in a different guise.
> For instance, suppose binaries were typically distributed in a platform-agnostic format, like LLVM intermediate representation or something equivalent. When you run your program the first time, it's compiled to native code for your architecture and cached for later use.
IBM's OS/400 (originally for the AS/400 hardware, now branded as System i) did precisely this: Compile COBOL or RPG to a high-level bytecode, which gets compiled to machine code on first run, and save the machine code to disk; thereafter, the machine code is just run, until the bytecode on disk is changed, whereupon it's replaced with newer machine code. IBM was able to transition its customers to a new CPU architecture just by having them move their bytecode (and, possibly, source code) from one machine to another that way.
Using ANDF you could produce portable binaries that would run on any UNIX system, regardless of CPU architecture. It was never commercially released though. I think while it is cool technology the market demand was never really there. For a software vendor, recompiling to support another UNIX isn't that hard; the real hard bit is all the compatibility testing to make sure the product actually works on the new UNIX. ANDF solved the easy part but did nothing about the hard bit. It possibly would even make things worse, because then customers might have just tried running some app on some other UNIX the vendor has never tested, and then complain when it only half worked.
Standards are always going to have implementation bugs, corner cases, ambiguities, undefined behaviour, feature gaps which force you to rely on proprietary extensions, etc. That's where the "hard bit" of portability comes from.
You've just reinvented bytecode. JVM/ART, WebAssembly, ActionScript, some versions of .net... You know, all the stuff that supposedly runs on everything.
Surely the barrier to using a new architecture is being able to boot a kernel and run (say) the GNU toolchain, as demonstrated with RISC-V. Then you just compile your code, assuming it doesn't contain assembler, or something. Whether or not you'll have the same sort of board support issues with RISC-V as with Arm, I don't know.
I don't know much about Android specifically. Is it still heavily Java-based?
There are a lot of universal-binary candidates, both current and historical. JavaScript, Java, SPIR-V, and LLVM intermediate representation are some of the current ones. So, this isn't a new idea, it's just that most of the software I use regularly is compiled specifically for x86-64. Maybe it would be better if that were a rare exception rather than the norm.
> I wish the barriers to using new architectures were lower.
> For instance, suppose binaries were typically distributed in a platform-agnostic format, like LLVM intermediate representation or something equivalent.
We're doing a pretty good job on portability these days already. Well-written Unix applications in C/C++ will compile happily for any old ISA and run just the same. Safe high-level languages like JavaScript, Java, and Safe Rust are pretty much ISA-independent by definition, it's 'just' a matter of getting the compilers and runtimes ported across.
Adopting LLVM IR for portable distribution, probably isn't the way forward. I don't see that it adds much compared to compiling from source, and it's not what it's intended for. (LLVM may wish to change the representation in a subsequent major version, for instance.)
For programs which are architecture-sensitive by nature, such as certain parts of kernels, there are no shortcuts. Or, rather, I'm confident the major OSs already use all the practical shortcuts they can think up.
> When you run your program the first time, it's compiled to native code for your architecture and cached for later use.
Source-based package management systems already give us something a lot like this.
There are operating systems that take this approach, such as Inferno. [0] I like this HN comment on Inferno [1]: kernels are the wrong place for 'grand abstractions' of this sort.
> I realize I've sort of just re-invented Javascript
Don't be too harsh on yourself, JavaScript would be a terrible choice as a universal IR!
> [[ to the bulk of your second paragraph ]]
In the Free and Open Source world, we're already free to recompile the whole universe. The major distros do so as compiler technology improves.
> Processors would be able to compete directly with each other without regard to vendor-lock-in.
For most application-level code, we're already there. For example, your Java code will most likely run just as happily on one of Amazon's AArch64 instances as on an AMD64 machine. In the unlikely case you encounter a bug, well, that's pretty much always a risk, no matter which abstractions we use.
> "Adopting LLVM IR for portable distribution, probably isn't the way forward. I don't see that it adds much compared to compiling from source, and it's not what it's intended for. (LLVM may wish to change the representation in a subsequent major version, for instance.)"
Maybe, but PNaCl (unfortunately deprecated by Google) "defines a low-level stable portable intermediate representation (based on the IR used by the open-source LLVM compiler project) which is used as the wire format instead of x86 or ARM machine code"
https://www.chromium.org/nativeclient/pnacl/introduction-to-...
Sure. I'm not saying it's impossible to construct such an IR and get it to work, I'm saying I doubt it's the best way forward. See my other comment [0] where I mention Google Native Client.
It would be a poor fit for certain languages, there may be performance penalties depending on target platform, it would preclude legitimate platform-specific code such as SIMD assembly, it would preclude platform-specific build-time customization, etc.
The way toward painless portability is to move away from unsafe languages like C and C++, where you're never more than an expression away from undefined behaviour, and where programmers may be tempted to make silly mistakes like writing code sensitive to the endianness of the target architecture. [1] With C and C++, disciplined developers working carefully can write portable code. With Safe Rust, code can be pretty close to 'portable by construction', like Java. If you feed Windows-style path strings to Linux, or vice versa, then things might go wrong, but for the most part you'll be on solid ground.
> Well-written Unix applications in C/C++ will compile happily for any old ISA and run just the same. Safe high-level languages like JavaScript, Java, and Safe Rust are pretty much ISA-independent by definition, it's 'just' a matter of getting the compilers and runtimes ported across.
That sort of works, but duplicating the proper development environment in the end-user's computer would take a lot of space and would be complicated by the enormous variety of programming languages and environments. Linux distros manage this with a lot of effort. I'm imagining something like a universal intermediate representation that can be compiled quickly (because a lot of the early language-specific part of compilation will have already been done by whoever you get your packages from) and in a uniform way because there's a common intermediate representation format that all the compiled languages use.
Universal binaries might also be acceptable for commercial, closed-source applications where source distribution would not.
> duplicating the proper development environment in the end-user's computer would take a lot of space
Most distros offer precompiled binaries, there are relatively few that use source-based distribution and expect the user to have all the necessary compilers installed.
> would be complicated by the enormous variety of programming languages and environments
That problem isn't effectively addressed by a universal IR. You can't have a single IR that works well for all languages, precisely because of the variety of languages.
> Linux distros manage this with a lot of effort.
Hopefully that should improve if the trend toward languages like Safe Rust continues. C and C++ are infamously full of footguns.
> I'm imagining something like a universal intermediate representation that can be compiled quickly (because a lot of the early language-specific part of compilation will have already been done by whoever you get your packages from) and in a uniform way because there's a common intermediate representation format that all the compiled languages use.
Again this can't be done effectively. There are good technical reasons why Java, Haskell, and JavaScript, don't generally use LLVM as their backend. The differences between languages aren't just skin deep, they extend right through the compiler stack.
To be more precise: it could be done, but there would be an unacceptable performance cost. After all, you could start distributing binaries for the SuperH SH-4, and just use emulation everywhere. The question is whether it could be done effectively.
I mentioned before that LLVM IR is not intended to be used this way, although the Google Native Client project took LLVM and turned it into what you're suggesting.
C and C++ are quite different from Java. The size of the int type varies between platforms, for instance. They also have a preprocessor which allows the programmer to conditionally compile platform-specific code, e.g. intrinsics, fragments of assembly code, or workarounds. The program might use system-specific macros that expand before compilation.
Languages like Haskell are very different from the sorts of languages that LLVM is built for. Even Java prefers to use its own backend, with tight integration with its GC.
There's also a package-management question, although this issue wouldn't be as significant. The C/C++ way is to have the build system (autotools or CMake or whatever) detect what libraries are available on the system. If an optional library is missing, the C/C++ code is automatically adjusted by the build system, prior to compilation. It would be unusual to detect availability of libraries at runtime. This approach doesn't play nicely with a universal IR. This might not be an issue if the IR is treated as a surrogate for the native-code binary, but the IR wouldn't be a good surrogate for the source.
The C/C++ philosophy is to accommodate platform variations, in contrast with the JVM approach of mandating compliance to a virtual machine. With the JVM approach you forbid the sorts of variations that C and C++ permit (everything from int_fast32_t varying between platforms, to hand-written SIMD assembly).
Others have already mentioned WASM and Google Native Client, both of which are stable, but neither of which are going to become mainstream ways of distributing Unix application code.
This topic has turned up on HN before, but frustratingly I wasn't able to find the thread.
> Universal binaries might also be acceptable for commercial, closed-source applications where source distribution would not.
True, but I think modern Unix OSs do a pretty good job on ABI stability. If they want portability without releasing source (something I don't think GNU/Linux should aim to accommodate, incidentally) they already have other options, like Java. JavaFX doesn't get much attention but it pretty much 'just works' for portable GUI applications.
edit skissane has an interesting comment on ANDF, a solution I hadn't heard of before.
What RISC-V achieves is architectural diversity over the boring mov, add, mul instructions: the interesting part is in vector and matrix manipulation, and while RISC-V is working on a great solution, it allows for other accelerators to be added.
SPARC is well known to be different enough (big endian, register windowing of the stack, alignment, etc.) that it exposes a lot of bugs in code that would be missed in a little-endian, x86 derived monoculture.
I always found SPARC's stack handling to be very elegant, and I write enough low-level code that these architectural details do from time to time impact me, but isn't it largely irrelevant for the industry at large?
After all MIPS's original insight was that machine code was now overwhelmingly written by compilers and not handwritten assembly, so they made an ISA for compilers. I think history proved them absolutely right, actually these days there are often a couple of layers between the code people write and the instructions fed into the CPU.
I guess my point is that nowadays I'm sure that many competent devs don't know what little-endian means and probably wouldn't have any idea of what "register windowing of the stack" is, and they're completely unaffected by these minute low level details.
Making it a bit easier for OpenBSD to find subtle bugs is certainly nice, but that seems like a rather weak argument for the vast amount of work required to support a distinct ISA in a kernel.
Honestly I'm not convinced by the argument for diversity here, as long as the ISA of choice is open source and not patent encumbered or anything like that. Preventing an x86 or ARM monoculture is worth it because you don't want to put all your eggs in Intel or Nvidia's basket, but if anybody is free to do whatever with the ISA I don't really see how that really prevents innovation. It's just a shared framework people can work with.
Who knows, maybe somebody will make a fork of RISC-V with register windows!
I had a neat experience a long time ago when I wrote a Perl XS module in C, in my x86 monoculture mindset. When you deploy something to their package manager (CPAN), it's automatically tested on a lot of different platforms via a loose network of people that volunteer their equipment to test stuff...https://cpantesters.org.
So, I immediately saw it had issues on a variety of different platforms, including an endianess problem. Cpantesters.org lets you drill down and see what went wrong in pretty good detail, so I was able to fix the problems pretty quickly.
It used to have a ton of different platforms like HPUX/PA-RISC, Sun/Sparc, IRIX/MIPS and so on, but the diversity is down pretty far now. Still lots of OS's, but few different CPUs.
MIPS and Berkeley RISC started an entire revolution. They appear "not unique" only because other ISAs copied them so thoroughly. I think it's safe to say that Alpha, ARM, POWER, PA-RISC, etc wouldn't have been designed as they were without MIPS.
Even today, comparing modern MIPS64 and ARM aarch64, I find ARM's new ISA to be perhaps more similar to MIPS than to ARMv7.
> What problem did MIPS solve in a unique way that others didn't?
The MIPS R2000 was debatably the first commercial RISC chip. It solved whatever problem you needed a really fast CPU for in 1985. The alternatives on the market were the Intel 386 and the Motorola 68000. The Intel 386 at 16 MHz did about 2 MIPS (heh - millions of instructions per second) with 32 bit integer math. At 16 MHz, the R2000 did about 10 MIPS. Even accounting for RISC code bloat, that's 3 - 4x faster.
Note how there were only two competitors selling 32-bit designs in the market they entered. I think that's probably the biggest impact of MIPS. They actually sold the chip! They wanted companies to design their own computer systems around it. Use it in an embedded device. Whatever. That was not the norm c. 1985 - 1988 for high-end silicon.
There were machines faster than the 386 or 68020 at that time. You could buy one of the fast microprocessor-based VAXes recently introduced. Or if not too squeezed for office space and with a blank cheque, one of the super-minis like a real VAX or IBM's new "mini-mainframe". After '86, maybe you'd buy one of the other RISC options, like SPARC or PA-RISC.
Whatever you bought, it would be the whole system. Take it or leave it. DEC would not sell you something like a CVAX processor all by itself just so you can build it into a product that will compete against them. (Well, they would sell you one, just not at a price you could afford if you aren't a defence contractor.)
Both DEC and SGI would use MIPS processors in their workstations of the late 80s, as did some less well-known names. The embarrassment of having to use a competitor's processor to sell a decently fast and affordable UNIX workstation would inspire DEC to create the Alpha. In this vein of "we'll sell it to whoever wants to buy it!" MIPS was also doing ARM-style core IP licensing, before ARM did. That's probably part of why MIPS was so prominent as an embedded architecture in the late 90s and early 2000s, in everything from handhelds to routers to satellites.
As far as I understand, it's not that MIPS is the best at embedded, it's just that it's cheaper to sell as the license cost is non-existent and good support already exists in kernels and so on.
If MIPS offered adequate performance and features, good performance-per-watt, and a competitive licence fee, and if none of its competitors could beat it, doesn't that count as 'best'?
> it's not that MIPS is the best at embedded, it's just that it's cheaper to sell
That sounds a lot like MIPS being the best at embedded. Not high-end, sure, but a lot of embedded is "what is the cheapest processor that can run Linux?"
ISA diversity can surface bugs in code that only show up on ISAs with different approaches. For example, code that works on some arches (but is slow due to alignment issues) will simply fail to run on other arches.
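A tiny Rust illustration of the kind of latent bug an unfamiliar ISA flushes out (hypothetical code; the "broken" version is undefined behaviour everywhere, it just tends to pass tests on x86):

    // The classic latent bug: reinterpreting a byte buffer as a wider integer.
    // On x86 the misaligned load "works" (maybe slowly); on strict-alignment
    // ISAs the very same load can fault at runtime.
    fn parse_field_broken(buf: &[u8]) -> u32 {
        unsafe { *(buf[1..5].as_ptr() as *const u32) } // misaligned 4-byte load
    }

    // The portable version: an explicit byte-wise, endian-aware read.
    fn parse_field_portable(buf: &[u8]) -> u32 {
        let mut bytes = [0u8; 4];
        bytes.copy_from_slice(&buf[1..5]);
        u32::from_le_bytes(bytes)
    }

    fn main() {
        let packet = [0xFFu8, 0x01, 0x02, 0x03, 0x04, 0xFF];
        assert_eq!(parse_field_portable(&packet), 0x0403_0201);
        // parse_field_broken(&packet) may work, trap, or return a byte-swapped
        // value depending on the target ISA and its alignment rules.
        let _ = parse_field_broken(&packet);
    }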
I remember CS 61C at Berkeley used to use MIPS to teach assembly language programming and a bit about computer architecture, using the original MIPS version of Patterson and Hennessy's Computer Organization and Design. Now that book is available in both MIPS and RISC-V versions, with, I've assumed, much more effort going into the RISC-V version...
I do think the simplicity of MIPS was a big plus there, including simplicity of simulating it (http://spimsimulator.sourceforge.net/). I suppose a lot of students may appreciate being taught on something that is or is about to be very widely used, even if it's more complicated in various ways -- and the fact that one of the textbook authors was a main RISC-V designer makes me assume that educational aspects are not at all neglected in the RISC-V world.
More on topic, though, RISC-V seems to really be designed in a way that makes it easy to teach. This is partially why I have doubts that it can be made very performant, but the focus on a prettier design over a more practical one is probably going to help it be more accessible to students.
> part of me thinks it's a shame that there's relatively little architectural diversity
Perhaps CPU diversity is in decline but it seems to me that the industry as a whole is moving towards more diversity. It's gotten significantly cheaper to roll your own chips to the point that we've seen entirely new processors emerging (e.g. GPUs, TPUs, etc) and becoming commonplace if not essential.
Isn't the point of RISC-V that the CPU is simple and augmented or complemented by any number of custom co-processors? If this is the general trend in the industry then the CPU itself might become a commodity part to be easily swapped out as something better emerges. Particularly if it can be abstracted away from the ISA.
Academic RISC was designed by Patterson and Hennessy. Hennessy went off and was one of the founders of MIPS, Patterson is one of the instrumental leaders in the RISC-V space.
Patterson and Hennessy were in competition with each other, at different universities. It's only much later they wrote text books together.
Patterson says RISC-V is derived from RISC-I and RISC-II. I think this doesn't really hold water -- at least no more so than for any other RISC.
RISC-I and RISC-II had condition codes and register windows, like SPARC. RISC-V doesn't have either, like MIPS. The RISC-V assembly language is also very similar to MIPS.
RISC-II had both 16 and 32 bit opcodes (as did IBM 801, and Cray designs) and RISC-V has inherited this (but also following the great success of ARM Thumb2).
Unfortunately it isn't gaining any traction. From a long-term cost perspective it is actually cheaper to choose ARM even if OpenPOWER is free. And ARM is already inexpensive.
MIPSr6, aarch64 and riscv are siblings born from MIPSr5 plus the best of other RISC architectures.
ppc64le and 32-bit Arm are part of the wider family, but have some notable differences that make them slightly less RISC-like. Both include e.g. more complex condition code handling and instructions that operate on more than three registers.
Since you mentioned ppc64le, there's also aarch64eb (arm64 in big endian mode). I saw that NetBSD supports it. It seems like support for other operating systems is limited mostly because of issues around booting...not the actual kernel or userland itself.
There's a lot of diversity in software. Windows is quite different from GNU/Linux, for instance. There's quite a lot of diversity in the major programming languages.
If you want something really different, whether in operating systems or programming languages, you have it. KolibriOS and Haskell, say.
I just read the official statement[1] that's linked to in the article.
Just so I get this straight: Wave Computing, the company that bought the remains of MIPS, is now (after bankruptcy) spinning it off as a separate company that is going to work under the name MIPS and holds the rights to the MIPS architecture, but is doing RISC-V?
That's what I got from it, but I have to say I don't understand the decision to throw away such a storied architecture as MIPS. I mean come on, the N64 runs on it!
This article from 2015 ("The Death of Moore’s Law Will Spur Innovation: As transistors stop shrinking, open-source hardware will have its day") is getting better and better with age.
As the cost of continuing Moore's Law approaches infinity, yes, it will cease to continue. But not before all but one foundry has been driven out of business.
And then that one remaining foundry will dominate the entire chip industry.
Also: they'll treat startups like dogshit unless they use automatic place-and-route. So chip startups will be reduced to glorified FPGA jockeys. Hard to differentiate when all you're allowed to do is sling Verilog.
This is huge. It looks like the only architectures widely-deployed in ten years will be x86, ARM, Power, and RISC-V (maybe also SPARC64 in Japan, although that's rare in the US).
64-bit cores (aka: classic CPUs) are looking like a solved problem, and are becoming a commodity. SIMD compute, however, remains an open question. NVidia probably leads today, but there seems to be plenty of room for smaller players.
Heck, one major company (AMD) is pushing 32x32-bit on one market (6xxx series) and 64x32-bit in another market (MI100 / Supercomputers).
The V in SVE is for vector, the S isn't for SIMD, and it's length-agnostic; I don't know how similar it is to the RISC-V vector extension. Think CDC, Cray, NEC, not AMD/Intel. I guess the recent innovation in that space is actual matrix multiplication instructions in CPUs.
* Bpermute and permute. (Pshufb is like permute, a gather operation. Bpermute is the opposite, like a scatter. Bpermute doesn't exist on x86 yet)
* __shared__ memory crossbar: every SIMD unit can read or write shared memory in parallel per clock tick. The crossbar can also broadcast 1-to-all each clock tick.
* Butterfly permute: the fundamental pattern in permutations for a variety of operations, most noticeably for scan and FFTs. Butterfly networks are closely related to pext and pdep implementation (showing how common that particular permute is).
* 8+ way hyperthreads / SMT. GPUs have very bad latency, but very high SMT counteracts that problem well in practice.
* PCIe Atomics: perform those compare and swap negotiations over I/O, allowing tight CPU and GPU memory integration.
* Crazy RAM. 1000GBps on HBM2. 800GBps over GDDR6X thanks to 2 bits transferred per clock tick.
* Crazy networks. AMD Infinity Fabric pushes over 100GBps. NVLink is 600GBps. A GPU network link has more bandwidth than a typical CPU's DDR4 RAM bandwidth.
* NVidia SASS has the craziest instruction set: the compiler figures out read / write hazards and publishes them in the SASS assembly itself. NVidia's ISA decoder + assembler team is doing something crazy here, the likes of which I haven't seen in any other instruction set ever.
* "Ballot" instructions. It's... really hard to explain why these are useful. They just are, lol.
Just a few cool concepts I've seen in the GPU world recently. Sure, matrix multiplications get the headlines because of tensors / deep learning. But don't sleep on the obscure stuff.
SVE is looking like a general-purpose ARM instruction set in the future.
I believe the Neoverse V-cores (high-performance) will have access to the SVE instructions for example. So the SVE-SIMD is not necessarily locked to Fujitsu (though Fujitsu's particular implementation is probably crazy good. HBM2 + 512-bit wide and #1 supercomputer in the world and all...)
SIMD today is only really helpful with a few usecases. If you want to encode some video, decode some jpegs, or do a physics simulation quicker, it's really going to help. It won't boot Linux any quicker tho.
I suspect for consumer uses, SIMD is already used for nearly all the use cases it can be.
The original SIMD papers in the 1980s show how to compile a Regex into a highly-parallel state machine and then "reduce" it (aka: a Scan / Prefix operation: https://en.wikipedia.org/wiki/Prefix_sum).
A huge number of operations, such as XML whitespace removal (aka: SIMD Stream Compaction), Regular Expressions, and more, were proven ~30 to 40 years ago to benefit from SIMD compute. Yet such libraries still don't exist today.
SIMD compute is highly niche, and clearly today's population is overly focused on deep learning... without even seeing the easy opportunities of XML parsing or simple regex yet. Further: additional opportunities are being discovered in O(n^2) operations, such as inner-join operations on your typical database.
Citations.
* For Regular Expressions: Read the 1986 paper "DATA PARALLEL ALGORITHMS". It's an easy read. Hillis / Steele are great writers. They even have the "impossible Linked List" parallelism figured out in there (granted: the nodes are located in such a way that the SIMD-computer can work with the nodes. But... if you had a memory allocator that worked with their linked-list format, you could very well implement their pointer-jumping approach to SIMD linked-list traversal)
* For whitespace folding / removal, see http://www.cse.chalmers.se/~uffe/streamcompaction.pdf. They don't cite it as XML-whitespace removal, but it seems pretty obvious to me that it could be used for parallel whitespace removal in O(lg(n)) steps.
* Database SIMD: http://www.cs.columbia.edu/~kar/pubsk/simd.pdf . Various operations have been proven to be better on SIMD, including "mass binary search" (one binary search cannot be parallelized, but if you have 5000 binary searches operating in parallel, it's HIGHLY efficient to execute all 5000 in a weird parallel manner, far faster than you might originally imagine).
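If it helps make the stream-compaction claim concrete, here's a small sequential Rust sketch of that recipe (flag, exclusive scan, scatter). Names and structure are mine, not from any of those papers; the point is that every step is either element-wise or a prefix sum, which is what makes it SIMD/GPU friendly:

    // Sequential sketch of the data-parallel stream-compaction pattern.
    fn compact_whitespace(input: &[u8]) -> Vec<u8> {
        // Step 1 (element-wise): flag the elements we want to keep.
        let keep: Vec<u32> = input
            .iter()
            .map(|&c| if c.is_ascii_whitespace() { 0 } else { 1 })
            .collect();

        // Step 2 (exclusive prefix sum / scan): each kept element's output index.
        let mut offsets = vec![0u32; input.len()];
        let mut running = 0;
        for (i, &k) in keep.iter().enumerate() {
            offsets[i] = running;
            running += k;
        }

        // Step 3 (scatter): write kept elements to their computed slots.
        let mut out = vec![0u8; running as usize];
        for i in 0..input.len() {
            if keep[i] == 1 {
                out[offsets[i] as usize] = input[i];
            }
        }
        out
    }

    fn main() {
        let s = b"<a>  hello \n world </a>";
        assert_eq!(compact_whitespace(s), b"<a>helloworld</a>".to_vec());
    }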
----------
SIMD-cuckoo hashing, SIMD-skip lists, etc. etc. There's so many data-structures that haven't really been fleshed out on SIMD yet outside of research settings. They have been proven easy to implement and simple / clean to understand. They're just not widely known yet.
I'm very interested in this space! I've been hacking on some open-source libraries around these ideas: rsdict [1], a SIMD-accelerated rank/select bitmap data structure, and arbolito [2], a SIMD-accelerated tiny trie.
For rsdict, the main idea is to use `pshufb` to implement querying a lookup table on a vector of integers and then use `psadbw` to horizontally sum the vector.
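For anyone who hasn't seen the trick before, here's the classic SSSE3 popcount kernel built from exactly those two instructions. It's a generic illustration of the pshufb-LUT plus psadbw pattern, not rsdict's actual code:

    // Illustration of pshufb as a 16-entry lookup table plus psadbw as a
    // horizontal byte sum: a 128-bit popcount. Not taken from rsdict.
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    #[cfg(target_arch = "x86_64")]
    #[target_feature(enable = "ssse3")]
    unsafe fn popcount128(v: __m128i) -> u64 {
        // popcount of each 4-bit value 0..=15, queried 16-at-a-time via pshufb
        let lut = _mm_setr_epi8(0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4);
        let low_mask = _mm_set1_epi8(0x0F);

        let lo = _mm_and_si128(v, low_mask);                      // low nibbles
        let hi = _mm_and_si128(_mm_srli_epi16::<4>(v), low_mask); // high nibbles
        let per_byte = _mm_add_epi8(_mm_shuffle_epi8(lut, lo),    // table lookups
                                    _mm_shuffle_epi8(lut, hi));

        // psadbw against zero sums each group of 8 bytes into a 64-bit lane
        let sums = _mm_sad_epu8(per_byte, _mm_setzero_si128());
        (_mm_cvtsi128_si64(sums) + _mm_cvtsi128_si64(_mm_srli_si128::<8>(sums))) as u64
    }

    #[cfg(target_arch = "x86_64")]
    fn main() {
        if is_x86_feature_detected!("ssse3") {
            unsafe {
                let v = _mm_set1_epi8(0b0101_0101u8 as i8); // 4 bits set per byte
                assert_eq!(popcount128(v), 64);
            }
        }
    }

    #[cfg(not(target_arch = "x86_64"))]
    fn main() {}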
The arbolito code is a lot less fleshed out, but the main idea is to take a small trie and encode it into SIMD vectors. Laying out the nodes into a linear order, we'd have one vector that maintains a parent pointer (4 bits for 16 node trees in 128-bit vectors) and another vector with the incoming edge label.
Then, following the Teddy algorithm[3] (very similar to the Hillis/Steele state transition ideas too!), we can implement traversing the tree as a state machine, where each node in the trie has a bitmask, and the state transition is a parallel bitshift + shuffle of parent state to children states + bitwise AND. We can even reduce the circuit depth of this algorithm to `O(log depth)` by using successive squaring of the transition, like Hillis/Steele describe too.
I've put it on the backburner, but my main goal for arbolito would be to find a way to stitch together these "tiny tries" into a general purpose trie adaptively and get query performance competitive with a hashmap for integer keys. The ART paper[4] does similar stuff but without the SIMD tricks.
A few years ago, I wrote AESRAND (https://github.com/dragontamer/AESRand). I managed to get some well-known programmers to look into it, and their advice helped me write some pretty neat SIMD tricks. EX: I SIMD-implemented a 32-bit integer -> floating point [0.0, 1.0] operator, to convert the bitstream into floats, as well as an integer-based, nearly bias-free, division/modulus-free conversion into [0, WhateverInt] (such as D20 rolls), for 16-bit, 32-bit, and 64-bit integers (with less bias the more bits you supplied).
Unfortunately, I ran out of time and some work-related stuff came up. So I never really finished the experiments.
----------
My current home project is bump-allocator + semi-space garbage collection in SIMD for GPUs. As far as I can tell, both bump-allocation and semi-space garbage collection are easily SIMDified in an obvious manner. And since cudamalloc is fully synchronous, I wanted a more scalable, parallel solution to the GPU memory allocation problem.
Very cool! Independent of the cool use of `aesenc` and `aesdec`, the features for skipping ahead in the random stream and forking a separate stream are awesome.
> My current home project is bump-allocator + semi-space garbage collection in SIMD for GPUs. As far as I can tell, both bump-allocation and semi-space garbage collection are easily SIMDified in an obvious manner. And since cudamalloc is fully synchronous, I wanted a more scalable, parallel solution to the GPU memory allocation problem.
This is a great idea. I wonder if we could speed up LuaJIT even more by SIMD accelerating the GC's mark and/or sweep phases...
If you're interested in more work in this area, a former coworker wrote a neat SPMD implementation of librsync [1]. And, if you haven't seen it, the talk on SwissTable [2] (Google's SIMD accelerated hash table) is excellent.
> Very cool! Independent of the cool use of `aesenc` and `aesdec`, the features for skipping ahead in the random stream and forking a separate stream are awesome.
Ah yeah, those features... I forgot about them until you mentioned them, lol.
I was thinking about 4x (512-bits per iteration) with enc(enc), enc(dec), dec(enc), and dec(dec) as the four 128-bit results (going from 256-bits per iteration to 512-bits per iteration, with only 3-more instructions). I don't think I ever tested that...
But honestly, the thing that really made me stop playing with AESRAND was discovering multiply-bitreverse-multiply random number generators (still unpublished... just sitting in a directory in my home computer).
Bit-reverse is single-cycle on GPUs (NVidia and AMD), and perfectly fixes the "multiplication only randomizes the top bits" problem.
Bit-reverse is unimplemented on x86 for some reason, but bswap64() is good enough. Since bswap64() and 64-bit multiply are both implemented really fast on x86-64, a multiply-bswap64-multiply generator is probably fastest for typical x86 code (since there are penalties for going between x86 64-bit registers and AVX 256-bit registers).
---------
The key is that multiplying by an odd number (bottom-bit == 1) results in a fully invertible (aka: no information loss) operation.
So multiply-bitreverse-multiply is a 1-to-1 bijection in the 64-bit integer space: all 64-bit integers have a singular, UNIQUE multiply-bitreverse-multiply analog. (With multiply-bitreverse-multiply(0) == 0 being the one edge case where things don't really work out. An XOR or ADD instruction might fix that problem...)
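A minimal Rust sketch of that shape, in case it's useful. The constants below are just placeholder odd values (any odd multiplier keeps the bijection); the tuned constants from the search are the unpublished part:

    // Sketch of a multiply-bswap-multiply mixer. swap_bytes() stands in for
    // bit-reverse on x86; Rust's u64::reverse_bits() would be the GPU-friendly
    // variant. The constants here are arbitrary odd placeholders, not tuned ones.
    const K1: u64 = 0x9E37_79B9_7F4A_7C15; // odd => multiplication is invertible
    const K2: u64 = 0xC2B2_AE3D_27D4_EB4F; // odd => multiplication is invertible

    fn mix(x: u64) -> u64 {
        // multiply randomizes the top bits, the byte swap moves them back down,
        // and the second multiply spreads them across the whole word again
        x.wrapping_mul(K1).swap_bytes().wrapping_mul(K2)
    }

    fn main() {
        // Used counter-style: hash successive counter values into outputs.
        for counter in 1u64..=4 { // start at 1 to dodge the mix(0) == 0 fixed point
            println!("{:016x}", mix(counter));
        }
    }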
---------
> This is a great idea. I wonder if we could speed up LuaJIT even more by SIMD accelerating the GC's mark and/or sweep phases...
Mark and Sweep looks hard to SIMD-accelerate in my opinion. At least, harder than a bump allocator. I'm not entirely sure what a SIMD-accelerated traversal of the heap is even supposed to look like (aka: simd-malloc() looks pretty hard).
If all allocs are prefix-sum'd across the SIMD units (ex: malloc({1, 4, 5, 1, 2, 3, 20, 10}) returns memory + {0, 1, 5, 10, 11, 13, 16, 36})... for a bump-allocator-like strategy... it's clear to me that such a mark/sweep allocator would have fragmentation issues. But I guess it would work...
Semispace collectors innately fix the fragmentation problem. So prefix-sum(size+header) allocators are just simple and obvious.
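As a sanity check of the arithmetic above, the prefix-sum bump allocation in plain sequential Rust (on a GPU this scan would run across the SIMD lanes instead of a loop):

    // Each lane's allocation offset is the exclusive prefix sum of the sizes.
    fn bump_offsets(base: usize, sizes: &[usize]) -> Vec<usize> {
        let mut offsets = Vec::with_capacity(sizes.len());
        let mut bump = base;
        for &size in sizes {
            offsets.push(bump); // this lane's block starts at the current bump pointer
            bump += size;       // advance the bump pointer past it
        }
        offsets
    }

    fn main() {
        // The malloc({1, 4, 5, 1, 2, 3, 20, 10}) example from above, base 0.
        assert_eq!(
            bump_offsets(0, &[1, 4, 5, 1, 2, 3, 20, 10]),
            vec![0, 1, 5, 10, 11, 13, 16, 36]
        );
    }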
--------
On the "free" side of Mark/sweep... I think the Mark-and-sweep itself can be implemented in GPU-SIMD thanks to easy gather/scatter on GPUs.
However, because gather/scatter is missing (scatter is missing from AVX2), or slow (AVX512 doesn't seem to implement a very efficient vgather or vscatter), I'm not sure if SIMD on CPU-based Mark/Sweep would be a big advantage.
------------
Yup yup. Semispace GC or bust, IMO anyway for the SIMD-world. Maybe mark-compact (since mark-compact would also fix the fragmentation issue).
The mark-phase is just breadth-first-search, which seems like a doable SIMD-pattern with the right data-structure (breadth-first is easier to parallelize than depth-first)
> Bit-reverse is unimplemented on x86 for some reason, but bswap64() is good enough.
You totally nerd-sniped me! I implemented a basic "reverse 128-bit SIMD register" routine with `packed_simd` in Rust. The idea is to process 4 bits at a time:
    let lo_nibbles = input & u8x16::splat(0x0F);
    let hi_nibbles = input >> 4;
Then, we can use `pshufb` to implement a lookup table for reversing each vector of nibbles.
    let lut = u8x16::new(
        0b0000, 0b1000, 0b0100, 0b1100, 0b0010, 0b1010, 0b0110, 0b1110,
        0b0001, 0b1001, 0b0101, 0b1101, 0b0011, 0b1011, 0b0111, 0b1111,
    );
    let lo_reversed = lut.shuffle1_dyn(lo_nibbles);
    let hi_reversed = lut.shuffle1_dyn(hi_nibbles);
Now that each nibble is reversed, we can flip the lo and hi nibbles within a byte when reassembling our u8x16.
    let bytes_reversed = (lo_reversed << 4) | hi_reversed;
Then, we can shuffle the bytes to get the final order. We could use a different permutation for simulating reversing f64s in a f64x2, too.
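For completeness, that last step might look something like this (same `packed_simd` API as above, untested):

    // reverse the byte order to finish reversing the whole 128-bit register
    let reversed = bytes_reversed.shuffle1_dyn(u8x16::new(
        15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0,
    ));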
Looking at the disassembly, if we assumed our LUT and shuffle vectors are already in registers, the core shuffle should be pretty fast. (I haven't actually benchmarked this or run it through llvm-mca, though :-p)
The goal is to find values of k1 and k2 that result in an evaluate(seed, k1, k2) close to 16 bits (aka: 50% of bits changing, the definition of the "avalanche condition"). There's probably some statistical test I could have done that'd be better, but GPUs have single-cycle popcount and single-cycle XOR.
I forgot exactly which search methods I used, but note that a Vega64 GPU easily reaches 10 Trillion-multiplies / second, so you can exhaustively search a 32-bit space in a ~millisecond, and a 40-bit space in just a couple of seconds.
You can therefore search the values of k1 and k2 ~8-bits at a time every few seconds. From there, plug-and-play your favorite search algorithm (genetic algorithms? Gradient descent? Random search? Simulated annealing?).
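For anyone curious what such an evaluate() could look like, here's a rough stand-in in Rust for a 32-bit multiply-bitreverse-multiply mixer. The real search ran on a GPU with its own constants; this just shows the metric being optimized:

    // Rough stand-in for the avalanche measurement: average number of output
    // bits that flip when a single input bit flips. Ideal for 32 bits is 16.
    // The mixer and constants here are illustrative, not the tuned ones.
    fn mix32(x: u32, k1: u32, k2: u32) -> u32 {
        x.wrapping_mul(k1 | 1).reverse_bits().wrapping_mul(k2 | 1) // force odd multipliers
    }

    fn avalanche_score(k1: u32, k2: u32, seeds: &[u32]) -> f64 {
        let (mut flipped, mut trials) = (0u64, 0u64);
        for &seed in seeds {
            let base = mix32(seed, k1, k2);
            for bit in 0..32 {
                let diff = base ^ mix32(seed ^ (1u32 << bit), k1, k2);
                flipped += diff.count_ones() as u64; // popcount of changed output bits
                trials += 1;
            }
        }
        flipped as f64 / trials as f64
    }

    fn main() {
        let seeds: Vec<u32> = (1u32..=1000).map(|i| i.wrapping_mul(2_654_435_761)).collect();
        // Want something close to 16.0; search k1/k2 to drive it there.
        println!("avg bits flipped: {:.2}", avalanche_score(0x9E37_79B9, 0x85EB_CA6B, &seeds));
    }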
--------
After that, I'd of course run it through PractRand or BigCrush (and other tests). In all honesty: random numbers (with bottom bit set to 1) from /dev/urandom are already really good.
---------
Exhausting the 64-bit space seems unreasonable however. I was researching FNV-hashes (another multiplication-based hash), trying to understand how they chose their constants.
It seems like there are a wide variety of ways to serialize and deserialize data, their performance sometimes varies by orders of magnitude, and the slow code persists because it doesn’t matter enough to optimize compared to other virtues like convenience and maintainability.
The key seems to be figuring out how to get good (not the best) performance when you mostly care about other things?
Machine learning itself seems like an example of throwing hardware at problems to try to improve the state of the art, to the point where it becomes so expensive that they have to think about performance more.
SIMD-compute is a totally different model of compute than what most programmers are familiar with.
That's the biggest problem. If you write optimal SIMD code, no one else on your team will understand it. Since we have so much compute these days (to the point where O(n^2) scanf parsers are all over the place), it's become increasingly obvious that few modern programmers care about performance at all.
Nonetheless, the more I study SIMD compute, the more I realize that these expert programmers figured out a ton of good and fast solutions to a wide variety of problems... decades ago, and then they were somehow forgotten until recently.
Seriously: that Data Parallel Algorithms paper is just WTF to me. Linked list traversal (scan-reduced sum from a linked list in SIMD-parallel), Regular Expressions and more.
--------
Then I look at the GPU-graphics guys, and they're doing like BVH tree traversal in parallel so that their raytracers work.
Its like "Yeah, Raytracing is clearly a parallel operation cause GPUs can do it". So I look it up and wtf? Its not easy. Someone really thought things through. Its non-obvious how they managed to get a recursive / sequential operation to operate in highly parallel SIMD operations while avoiding branch divergence issues.
Really: think about it: Raytracing is effectively:
If(ray hit object) recursively bounce ray.
How the hell did they make that parallel? A combination of stream-compaction and very intelligent data-structures, as well as a set of new SIMD-assembly instructions to cover some obscure cases.
There's some really intelligent stuff going on in the SIMD-compute world, that clearly applies beyond just the machine-learning crowd.
> A huge number of operations, such as XML whitespace removal (aka: SIMD stream compaction), regular expressions, and more, were shown ~30 to 40 years ago to benefit from SIMD compute. Yet such libraries still don't exist today.
That's exactly why I don't believe it's ever going to happen. If these things could actually be useful in practice, surely someone would have done it already.
This comment seems a bit shortsighted. GPUs and TPUs are SIMD, and ML models are increasingly being used in consumer hardware. Video cards are selling out so fast that there are months' worth of backorders.
SIMD processors are being put in self driving cars, robots with vision, doorbell cameras, drones, etc. We're only at the beginning of SIMD use-cases.
Depends where you look. Hard to see IBM's z/Architecture (the latest branding for S/360-derived mainframes) dying out in that timeframe, for example, and the embedded space is likely to remain an odd bestiary for quite some time.
Embedded is going to become way less of a bestiary, at least in the five digit gate count RISC space.
ARC, Xtensa, V850, arguably MIPS, etc. all worked in the "we're cheaper to license than ARM, and will let you modify the core more than ARM will" space. I'm not sure how they maintain that value-add compared to lower-end RISC-V cores. I expect half of them to fold, and half of them to turn into consulting shops for RISC-V and fabless design in general.
You probably mean "TIMI", which is the user-visible ISA of IBM's "midrange" systems (i.e. AS/400 or System/i) and which was from the start meant as a virtual-machine ISA that is mostly AOT-translated into whatever hardware ISA OS/400 or i5/OS actually runs on. z/Architecture (S/360, ESA/390, what have you...) is distinct from that and distinct from PowerPC. Modern POWER and z/Architecture CPUs and machines are somewhat similar when you look at the execution units and overall system design, but the ISAs are completely different, and even the performance profiles of the otherwise similar CPUs are different (z/Architecture is "uber-CISC", with instructions like "calculate the SHA-256 of this megabyte of memory").
I learned Z80 assembly in 1987 and x86 assembly somewhere in '91-92 (can't exactly remember), but it was in '94 when I met IBM Assembler (yes, they called it "Assembler language", which is also confusing) and I was like "what is this sorcery where assembly has an instruction to insert into a tree".
The "not ARM based" MCU market might be interesting to watch as well. Not just for RISC-V, but also as ARM MCUs continue to drop in power needs and unit cost.
The MIPS name was originally an acronym for "Microprocessor without Interlocked Pipeline Stages". The RISC-V docs that I've skimmed seem to show quite a lot more hardware pipeline interlocking than the last-gen MIPS processors, so the name is a bit funny now.
A few jobs ago, the company I worked for based their own custom processor on a MIPS core. Why MIPS? The answer I got when I asked was that it was the only affordable option. ARM in particular was called out as beyond reach financially. Years later, long after that company was gone, RISC-V came in at an even lower price point. AFAICT there's no need to look for other reasons behind this news.
My historical skepticism on the acceptance and proliferation of RISC-V looks more antiquated by the day. No real dog in the fight, but I would love to see this take off like ARM did.
It's a very good move. First, it's for embedded: if you're not designing a system-on-chip, it won't matter to you.
But for those who design SoCs and need an embedded CPU of intermediate power, it's very good news. ARM is expensive here (it's considered cheap compared to Intel, but the embedded world is different). The RISC-V newcomers are interesting but... new, and it always takes a bit of time to bring solutions to maturity, which matters in embedded. And while ARM owns the high end (where a design must be co-optimized for the latest advanced nodes to really shine, which is very labor-intensive and costly), the mid-range is much more open.
MIPS had a good mid-range design with the I-7200. Their problem was that the old MIPS ISA was not dense enough (larger I-cache, larger flash footprint) compared to the competition, and their compact encodings were not good enough. So for the I-7200 they designed a new ISA, nanoMIPS, which has "MIPS" in its name but is completely different. And guess what: nobody cared. It became another proprietary ISA, only supported by MIPS's own GCC version.
But still, the design was good, and in particular the LLC/coherency support is much more mature than what many newcomers offer today. Which shouldn't be a surprise, as it's the result of a long line of (good) mid-range CPUs.
By evolving their design to RISC-V, MIPS will soon have one of the best (if not the best) mid-range CPU/cluster IPs on the market for quality vs. price, depending on how fast the competition moves (they're definitely not sitting still!). And RISC-V will solve the toolchain and tooling support problem nanoMIPS has. If there had been such a RISC-V I-7200 equivalent a few years ago, I might have used it.
So very nice, and I look forward to more competition in the embedded mid-range CPU IP market soon.
There are a couple of vendors of multi-architecture CPUs where the native architecture is MIPSish and the other supported arches are MIPS, ARM, RISC-V & x86. Tachyum is one of them; here is another:
That is a very interesting article, both for how they aim to run existing software via binary translation to an ISA optimized for such translation, and for how much they dread possibly being cut off from foreign foundries and from ARM and x86 chips.
I consulted with MIPS and ARM many years ago - pre Y2K (background is microprocessor design and firmware). ARM was a little formal but easy to work with, cooperative and supportive (they opened doors and provided everything I asked for).
Working with MIPS was a nightmare, every step of the way. Weren't responsive, difficult to work with, engineers were stubborn, etc.
I realize these are generalizations and I'm a sample size of one. But I was plugged in at a pretty high technical level with both companies, and I remember telling my wife at the time that I thought ARM would skyrocket and MIPS would be unable to get out of its own way.
Glad I bought a lot of ARM stock before most people knew about them.
The next "loongarch" generation was already announced to move away from mips as the underlying ISA but instead allow running mips, arm64, risc-v, and x86 code in hardware assisted emulation.
"In this context, the “8th generation” refers to seven generations of the traditional MIPS architecture, followed by an upcoming RISC-V design. It sounds like the company is implying that this is a smooth transition with some level of compatibility between the old and the new. It isn’t. It’s a clean break as the company switches from the old CPU design, that it owned, to a new one that’s in the public domain."
MIPS R6 (the Warrior M62xx/I6400/I6500/P6600) was already somewhat incompatible with R5 and earlier, and the seventh-generation nanoMIPS I7200 was incompatible with that again.
It could have preempted RISC-V a long time ago by doing this. I hope it's not too late.
MIPS is still used in routers and set-top boxes, but the steam is running out quickly; nearly all new router and set-top-box designs are using ARM already. There is a last hope, though.
The context was not "Who isn't using RISC-V?" It was whether anyone is using SPARC. The reality: they're not mutually exclusive. They offer both, contrary to the implied subtext of your comment that there has been a transition from SPARC to RISC-V.
I spent a summer interning with MIPS hardware when they were owned by Imagination.
It's kinda odd to see something you helped make die (be killed?) like this.
Strong "not with a bang but a whimper" vibes in my office today.
Does this mean that design features from MIPS can now be adopted / adapted in RISC-V? Or is it just a no-longer-used ISA and chip designs that become freely usable?
meh. there's more to a CPU architecture than its instruction set. this isn't the "death of MIPS." far from it, it's the remaining MIPS people applying their own understanding of design to implement a CPU with the RV64 instruction set (plus extensions, presumably.)
that being said... MIPS is certainly smaller than it once was. i hope there are still people around who remember how to build CPUs.
If synthesisable MIPS cores from the R3k to the R20k entered the public domain, that would be something special. But based on how previous such "announcements" have turned out, it's not gonna happen this time either.
1) MIPS Strikes Back: 64-bit Warrior I6400 Arrives https://news.ycombinator.com/item?id=8258092
We are still in the game
2) Linux-running MIPS CPU available for free to universities – full Verilog code https://news.ycombinator.com/item?id=9444567
Okay, we are not doing so great, maybe we can get young kids hooked?
3) MIPS Goes Open Source https://news.ycombinator.com/item?id=18701145
Open Source is so hip and pop these days, let's do that!
4) Can MIPS Leapfrog RISC-V? https://news.ycombinator.com/item?id=19460470
Yeah, sure, that'll happen
5) Is MIPS Dead? Lawsuit, Bankruptcy, Maintainers Leaving and More https://news.ycombinator.com/item?id=22950848
Whoops
6) Loose Lips Sink MIPS https://news.ycombinator.com/item?id=24402107
And there is the answer to the question from the previous headline
And now we are here.