
Ever tried writing machine code (not assembly) by hand? I used to do that a few decades back for an 8-bit microprocessor. I am still looking for good resources on how to do that for a modern processor.



You look up the opcodes for each assembler instruction in the ISA specification, like https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-.... Information about the machine code format starts at "2.2 Base Instruction Formats". You can find the actual opcodes in chapter 9, "RV32/64G Instruction Set Listings". But be aware that some assembler instructions (called pseudo instructions) generate more than one machine instruction.

The linked document is an old one; current ones are at https://riscv.org/technical/specifications/
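
To make the lookup concrete, here is a minimal sketch of hand-assembling one RISC-V instruction, `addi x1, x0, 5`, from the I-type layout in the base instruction formats. The helper name `encode_i_type` is made up for illustration; the field positions and the ADDI opcode are as I understand the base ISA, so check them against the current spec before relying on them.

```python
# Hand-assemble a RISC-V I-type instruction.
# Field layout, most significant bit first:
#   imm[11:0] | rs1 | funct3 | rd | opcode

def encode_i_type(imm, rs1, funct3, rd, opcode):
    """Pack the I-type fields into one 32-bit instruction word.

    Hypothetical helper for illustration, not part of any toolchain.
    """
    if not -2048 <= imm <= 2047:
        raise ValueError("I-type immediate must fit in 12 signed bits")
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

OP_IMM = 0b0010011     # major opcode shared by ADDI, ANDI, ORI, ...
FUNCT3_ADDI = 0b000    # selects ADDI within OP_IMM

# addi x1, x0, 5
word = encode_i_type(imm=5, rs1=0, funct3=FUNCT3_ADDI, rd=1, opcode=OP_IMM)
print(f"{word:08x}")                      # 00500093
print(word.to_bytes(4, "little").hex())   # 93005000 (byte order in memory)
```

The second line of output is what you would actually type into memory on a little-endian system. The pseudo instruction `nop` is the same encoder with all fields zero (`addi x0, x0, 0`), which gives `00000013`.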


Awesome! Thanks!



Oh man you gave me flashbacks. The TRS-80 Model II's OS, TRSDOS-II, had a built-in debugger that was little more than a monitor. You could step through instructions, examine and write to memory, set breakpoints to absolute memory locations, and that was it. I remember hand-assembling tiny Z80 programs in that thing and jumping into them, just to test my understanding of how machine code programs worked and how the computer executed them, and being super thrilled when I could get an A to appear somewhere on the screen or something.

The machine had a much more complete assembly language programming toolkit which I also used to write more sophisticated programs, employing this debugger to examine them. But I felt like I'd "cracked the code" of the computer when I plugged hex numbers into RAM and then ran them straight from there.

Most CPU ISA documentation should give you the opcodes that correspond to instruction mnemonics. You may have to plug in your own operands (registers, etc.) into bit fields in the instruction encoding. If you're serious about hand-assembling to begin with this should be no problem.
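
As a rough illustration of plugging operands into bit fields, here is a sketch that hand-assembles a three-instruction Z80 program. The helper names are made up, and `TARGET` is a placeholder address, not the real location of video memory on the Model II or any other machine.

```python
# Z80 8-bit register codes used in the "rrr" bit field of many opcodes.
REG = {"B": 0, "C": 1, "D": 2, "E": 3, "H": 4, "L": 5, "A": 7}

def ld_r_n(reg, value):
    """LD r, n  -> opcode bits 00 rrr 110, followed by the immediate byte."""
    return bytes([(REG[reg] << 3) | 0b110, value & 0xFF])

def ld_nn_a(addr):
    """LD (nn), A -> opcode 0x32, followed by the address, low byte first."""
    return bytes([0x32, addr & 0xFF, (addr >> 8) & 0xFF])

RET = bytes([0xC9])

TARGET = 0x8000  # hypothetical address, chosen only for the example

# LD A, 'A' ; LD (TARGET), A ; RET
program = ld_r_n("A", ord("A")) + ld_nn_a(TARGET) + RET
print(program.hex(" "))   # 3e 41 32 00 80 c9
```

Those six hex bytes are what you would poke into RAM with a monitor before jumping to the first one.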


I’ve told this before, but it’s so amazing I like to give Tim credit. I worked on Visual Basic at Microsoft with Tim Paterson, who also created the operating system that became MS-DOS. He worked on code generation and debugged by looking at the opcodes in a hex dump. Assembly was too slow for him.


"ARM Architecture reference manual": https://documentation-service.arm.com/static/5f8daeb7f86e165... (assuming that link works)

Start at section A5 describing the encoding. The instruction set is very much designed for clean decode, so instructions are grouped by bit pattern; every instruction in the manual has its bit pattern described.

Very much a "but why?" situation, since translation from assembly to machine code is so easily automatable and doing it by hand adds so little value.
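
For the curious, a sketch of what encoding by bit pattern looks like for the classic 32-bit ARM data-processing instructions with an immediate operand. The function name and the small opcode table are my own; the field layout is as I remember it, so verify it against the encoding section of the manual.

```python
# Classic 32-bit ARM, data-processing instruction with an immediate operand:
#   cond(4) | 0 0 | I=1 | opcode(4) | S | Rn(4) | Rd(4) | rotate(4) | imm8
COND_AL = 0b1110                          # "always" condition
OPCODE = {"ADD": 0b0100, "MOV": 0b1101}   # small subset, for illustration

def data_processing_imm(op, rd, rn, imm8, rotate=0, set_flags=False, cond=COND_AL):
    """Hypothetical helper: pack the fields into a 32-bit instruction word."""
    if not (0 <= imm8 <= 0xFF and 0 <= rotate <= 0xF):
        raise ValueError("immediate is 8 bits plus a 4-bit rotate field")
    return ((cond << 28) | (1 << 25) | (OPCODE[op] << 21) | (int(set_flags) << 20)
            | (rn << 16) | (rd << 12) | (rotate << 8) | imm8)

# mov r0, #1   (Rn is unused by MOV and encoded as 0)
print(f"{data_processing_imm('MOV', rd=0, rn=0, imm8=1):08x}")   # e3a00001
# add r1, r2, #4
print(f"{data_processing_imm('ADD', rd=1, rn=2, imm8=4):08x}")   # e2821004
```

Every field sits in a fixed position and the register numbers are contiguous 4-bit groups, which is why the hex digits are so easy to read back.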


>> but why?

Agreed. More of a curiosity for me, for learning and research purposes.



Also, running machine code directly can be done like this:

https://github.com/eterps/loader/blob/master/syscall.nim


This is cool! This is how I used to do also by embedding hand-written machine code within a BASIC program and calling it to run natively.


> This is how I used to do also by embedding hand-written machine code within a BASIC program and calling it to run natively

Me too; that's also the reason why I wanted that possibility back.


SubX seems like an assembly language itself following a subset of x86 32-bit instructions. Would looking into this help me understand how to translate from assembly to machine code manually? Thanks.


SubX is a weird thing (I built it) that is somewhere between machine code and Assembly language. You have to type in the opcodes directly, which people typically associate with machine code. But it smooths some aspects of programming in machine code. You'll get nice errors if you accidentally write invalid machine code, it won't just go off and run data as code or something like that.

I'd be happy to support you if you choose to try it out! Ask as many questions as you like.

Even if you choose not to, you might like the cheatsheet in the repo (from https://net.cs.uni-bonn.de/fileadmin/user_upload/plohmann/x8...)


Thanks! I understand better now.

And that cheat sheet PDF is cool, exactly what I was looking for. Any chance you are aware of something similar for x64? Thanks.


No, sorry. Honestly I spent a long time going to the source of the 3-volume Intel manual. (There are links to them as well in my Readme.) I think that's really what you need to do for machine code, if you're not using Assembly or Assembly-ish that has done that work for you. But then any Assembly language will have its own manual you need to bone up on. That's mostly why I built SubX: the manual is like 10 pages, and I distilled down the parts of the Intel manual you need to know. But yeah, only for 32-bit. I always found x64 very hacky with the register bits split up between bytes and whatnot. 32-bit is a legitimately nice ergonomic machine.
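
To show what "register bits split up between bytes" means, here is a small sketch comparing the 32-bit and 64-bit encodings of a register-to-register `add`. This is plain x86 encoding, not SubX syntax, and the helper names are invented for the example.

```python
def modrm(mod, reg, rm):
    """ModR/M byte: mod(2) | reg(3) | rm(3). Only the low 3 bits of each register fit."""
    return (mod << 6) | ((reg & 7) << 3) | (rm & 7)

def add_rm32_r32(dst, src):
    """ADD r/m32, r32 (opcode 01 /r), register-direct form, registers 0..7."""
    return bytes([0x01, modrm(0b11, src, dst)])

def add_rm64_r64(dst, src):
    """Same instruction in 64-bit mode, registers 0..15.

    Bit 3 of each register number moves into the REX prefix:
    REX = 0100 W R X B, with W=1 for 64-bit operands, R extending
    the reg field and B extending the rm field.
    """
    rex = 0x48 | ((src >> 3) << 2) | (dst >> 3)
    return bytes([rex, 0x01, modrm(0b11, src, dst)])

EBX, ECX = 3, 1
print(add_rm32_r32(EBX, ECX).hex(" "))   # 01 cb      add ebx, ecx
print(add_rm64_r64(3, 1).hex(" "))       # 48 01 cb   add rbx, rcx
print(add_rm64_r64(11, 9).hex(" "))      # 4d 01 cb   add r11, r9
```

The last two lines differ only in the prefix byte: the ModR/M byte is identical, and the high bit of each register number lives in the REX prefix.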


Thanks for your efforts creating SubX. Now I understand the motivations behind that better.


arm is the nicest instruction encoding you can get actual hardware for (not thumb or aarch64). risc-v is pretty okay at the assembly level but the instruction encoding is almost deliberately sadistic. amd64 isn't too terrible but not nearly as nice as arm

older official arm documentation is a lot better than recent, which is very poor quality (though still pretty reliable.) oldnewthing and azeria-labs have good tutorials, though she got some of the condition flags wrong
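
For anyone wondering what makes the risc-v encoding painful to do by hand, here is a sketch of the J-type format used by `jal`, where the jump offset is stored with its bits shuffled. The function name is invented; the bit positions are as I understand the base ISA, so double-check them against the spec.

```python
def encode_jal(rd, offset):
    """Hypothetical helper: encode `jal rd, offset` (J-type).

    The 21-bit signed offset is stored out of order:
      instr[31]    = imm[20]
      instr[30:21] = imm[10:1]
      instr[20]    = imm[11]
      instr[19:12] = imm[19:12]
    imm[0] is always zero and is not stored.
    """
    if offset % 2 or not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("offset must be even and fit in 21 signed bits")
    imm = offset & 0x1FFFFF
    return ((((imm >> 20) & 0x1) << 31)
            | (((imm >> 1) & 0x3FF) << 21)
            | (((imm >> 11) & 0x1) << 20)
            | (((imm >> 12) & 0xFF) << 12)
            | (rd << 7)
            | 0b1101111)   # JAL major opcode

# jal x0, 8   (a plain forward jump)
print(f"{encode_jal(0, 8):08x}")   # 0080006f
```

Reading the offset back out of `0080006f` by eye takes some practice, unlike the contiguous fields in classic arm.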



