Ever tried writing machine code (not assembly) by hand? I used to do that a few decades back for an 8-bit microprocessor. I am still looking for good resources on how to do that for a modern processor.
You look up the opcode for each assembler instruction in the ISA specification, like https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-.... Information about the machine code format starts at "2.2 Base Instruction Formats". You can find the actual opcodes in chapter 9, "RV32/64G Instruction Set Listings".
But be aware that some assembler instructions (so-called pseudo-instructions) generate more than one machine instruction.
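To make that concrete, here is a rough Python sketch of hand-assembling RV32I by packing fields into a 32-bit word. The function names (`encode_i_type`, `encode_u_type`, `addi`, `lui`, `li`) are my own, not from the spec, and the `li` expansion is just one plausible way an assembler might split a 32-bit constant into `lui` + `addi`; a real assembler may choose differently.

```python
# Minimal sketch of hand-assembling RV32I by filling in bit fields.
# Helper names are hypothetical; field layouts follow the base instruction formats.

OPCODE_OP_IMM = 0b0010011  # major opcode used by ADDI
OPCODE_LUI = 0b0110111     # major opcode used by LUI


def encode_i_type(imm, rs1, funct3, rd, opcode):
    """I-type layout: imm[11:0] | rs1 | funct3 | rd | opcode."""
    if not -2048 <= imm <= 2047:
        raise ValueError("I-type immediate must fit in 12 signed bits")
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_u_type(imm20, rd, opcode):
    """U-type layout: imm[31:12] | rd | opcode."""
    return ((imm20 & 0xFFFFF) << 12) | (rd << 7) | opcode


def addi(rd, rs1, imm):
    return encode_i_type(imm, rs1, 0b000, rd, OPCODE_OP_IMM)


def lui(rd, imm20):
    return encode_u_type(imm20, rd, OPCODE_LUI)


def li(rd, value):
    """Assumed expansion of the `li` pseudo-instruction for a 32-bit constant."""
    value &= 0xFFFFFFFF
    lo = value & 0xFFF
    if lo >= 0x800:          # ADDI sign-extends, so borrow from the upper part
        lo -= 0x1000
    hi = ((value - lo) >> 12) & 0xFFFFF
    if hi == 0:
        return [addi(rd, 0, lo)]          # one instruction is enough
    words = [lui(rd, hi)]
    if lo != 0:
        words.append(addi(rd, rd, lo))    # here the pseudo-instruction becomes two
    return words


print(f"addi x1, x0, 5  -> {addi(1, 0, 5):08x}")
print("li a0, 0x12345  ->", [f"{w:08x}" for w in li(10, 0x12345)])
```

`nop` is another pseudo-instruction, but it maps to a single `addi x0, x0, 0`, so the "more than one" caveat only applies to some of them.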
Oh man you gave me flashbacks. The TRS-80 Model II's OS, TRSDOS-II, had a built-in debugger that was little more than a monitor. You could step through instructions, examine and write to memory, set breakpoints to absolute memory locations, and that was it. I remember hand-assembling tiny Z80 programs in that thing and jumping into them, just to test my understanding of how machine code programs worked and how the computer executed them, and being super thrilled when I could get an A to appear somewhere on the screen or something.
The machine had a much more complete assembly language programming toolkit which I also used to write more sophisticated programs, employing this debugger to examine them. But I felt like I'd "cracked the code" of the computer when I plugged hex numbers into RAM and then ran them straight from there.
Most CPU ISA documentation should give you the opcodes that correspond to instruction mnemonics. You may have to plug in your own operands (registers, etc.) into bit fields in the instruction encoding. If you're serious about hand-assembling to begin with this should be no problem.
I’ve told this story before, but it’s so amazing I like to give Tim credit. I worked on Visual Basic at Microsoft with Tim Paterson, who also created the operating system that became MS-DOS. He worked on code generation and debugged by looking at the opcodes in a hex dump. Assembly was too slow for him.
Start at section A5 describing the encoding. The instruction set is very much designed for clean decode, so instructions are grouped by bit pattern; every instruction in the manual has its bit pattern described.
Very much a "but why?" situation, since translation from assembly to machine code is so easily automatable and doing it by hand adds so little value.
SubX seems like an assembly language itself following a subset of x86 32-bit instructions. Would looking into this help me understand how to translate from assembly to machine code manually? Thanks.
SubX is a weird thing (I built it) that is somewhere between machine code and assembly language. You have to type in the opcodes directly, which people typically associate with machine code. But it smooths some aspects of programming in machine code. You'll get nice errors if you accidentally write invalid machine code; it won't just go off and run data as code or something like that.
I'd be happy to support you if you choose to try it out! Ask as many questions as you like.
No, sorry. Honestly I spent a long time going to the source, the 3-volume Intel manual. (There's a link to it as well in my Readme.) I think that's really what you need to do for machine code, if you're not using Assembly or something Assembly-ish that has done that work for you. But then any Assembly language will have its own manual you need to bone up on. That's mostly why I built SubX: the manual is like 10 pages, and I distilled down the parts of the Intel manual you need to know. But yeah, only for 32-bit. I always found x64 very hacky with the register bits split up between bytes and whatnot. 32-bit is a legitimately nice ergonomic machine.
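For anyone wondering what "register bits split up between bytes" looks like in practice, here is a small Python sketch of encoding a register-to-register `mov` by hand. This is not SubX syntax and the helper names are made up for illustration; it only covers the `89 /r` form (MOV r/m, r) with both operands in registers.

```python
# Sketch: hand-encoding register-to-register MOV (opcode 89 /r).
# Helper names are hypothetical; only mod=11 (register-direct) is handled.

REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
REG64 = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
         "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7,
         **{f"r{n}": n for n in range(8, 16)}}


def modrm(mod, reg, rm):
    """ModR/M byte: mod (2 bits) | reg (3 bits) | rm (3 bits)."""
    return (mod << 6) | ((reg & 7) << 3) | (rm & 7)


def mov_r32_r32(dst, src):
    # 32-bit mode: every register number fits in the 3-bit ModR/M fields.
    return bytes([0x89, modrm(0b11, REG32[src], REG32[dst])])


def mov_r64_r64(dst, src):
    d, s = REG64[dst], REG64[src]
    # REX prefix is 0100WRXB: W selects 64-bit operands, R holds the 4th bit
    # of the reg field, B holds the 4th bit of the rm field. The low 3 bits
    # of each register number still live in the ModR/M byte.
    rex = 0x48 | ((s >> 3) << 2) | (d >> 3)
    return bytes([rex, 0x89, modrm(0b11, s, d)])


print(mov_r32_r32("eax", "ebx").hex(" "))  # mov eax, ebx
print(mov_r64_r64("r8", "rax").hex(" "))   # mov r8, rax
```

In the 32-bit case a register number sits entirely in one field; in the 64-bit case the top bit of `r8` through `r15` lands in the prefix byte while the rest stays in ModR/M.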
arm is the nicest instruction encoding you can get actual hardware for (not thumb or aarch64). risc-v is pretty okay at the assembly level but the instruction encoding is almost deliberately sadistic. amd64 isn't too terrible but not nearly as nice as arm
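as a rough illustration of why the classic 32-bit arm encoding is pleasant to do by hand, here's a python sketch of the data-processing (immediate) format. the function name and parameter names are mine, and it only handles an 8-bit immediate with an optional rotate field, nothing else

```python
# Sketch: classic 32-bit ARM data-processing instruction with an immediate operand.
# Field layout: cond | 00 | I | opcode | S | Rn | Rd | rotate | imm8
# The helper name and its parameters are hypothetical.

COND_EQ = 0b0000   # execute only if the Z flag is set
COND_AL = 0b1110   # "always"

OP_ADD = 0b0100
OP_MOV = 0b1101


def data_processing_imm(opcode, rd, rn, imm8, rotate=0, set_flags=False, cond=COND_AL):
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit in 8 bits")
    return ((cond << 28)
            | (1 << 25)                 # I bit: operand 2 is an immediate
            | (opcode << 21)
            | (int(set_flags) << 20)    # S bit: update condition flags
            | (rn << 16)
            | (rd << 12)
            | ((rotate & 0xF) << 8)     # imm8 is rotated right by 2 * rotate
            | imm8)


print(f"mov   r0, #1     -> {data_processing_imm(OP_MOV, 0, 0, 1):08x}")
print(f"add   r0, r1, #4 -> {data_processing_imm(OP_ADD, 0, 1, 4):08x}")
print(f"moveq r0, #1     -> {data_processing_imm(OP_MOV, 0, 0, 1, cond=COND_EQ):08x}")
```

every field sits on a fixed bit position and the register numbers fall on nibble boundaries, so you can almost read the hex directly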
older official arm documentation is a lot better than recent, which is very poor quality (though still pretty reliable.) oldnewthing and azeria-labs have good tutorials, though she got some of the condition flags wrong