
Don't fancy x86 addressing modes provide most of those multiplications and offsets with very little IPC penalty?



Yeah, this should be roughly the same overhead as an ADD:

    LEA rDest, [rBase + 8*rPtr]

(The "load effective address" instruction computes an effective address the way a load or store would, but just returns the address without doing a memory access.)
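For anyone who wants to see this from the C side, here is a rough sketch of the same base-plus-scaled-index computation. The function name `decompress`, the 32-bit index width, and the scale factor of 8 are my assumptions for illustration, not anything from the actual implementation. An optimizing x86-64 compiler will often turn an expression like this into a single LEA or fold it into the addressing mode of the following load, but that depends on the compiler and flags.

```c
#include <stdint.h>

/* Hypothetical decompression: rebuild a full address from a base and a
 * 32-bit index that counts 8-byte units (names and widths are assumed). */
static inline uint64_t decompress(uint64_t base, uint32_t idx) {
    /* Widen first so the multiply cannot overflow 32 bits. */
    return base + 8 * (uint64_t)idx;
}
```

With a scale of 8, a 32-bit index can reach 32 GiB past the base instead of 4 GiB.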


AIUI, mov supports these addressing modes directly [0], and if I'm reading the instruction tables correctly, then at least on Skylake the latency and throughput are the same for all addressing modes [1].

[0] http://www.c-jump.com/CIS77/ASM/Addressing/lecture.html#R77_...

[1] https://www.agner.org/optimize/instruction_tables.pdf (page 238)


Decompression isn't the problem, compression is. Right now compression is just a mov. With a scaled index we would need an additional shift on every compression.
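To make the difference concrete, here is a minimal sketch of the two compression paths. Everything in it is assumed for illustration: the function names, the 8-byte granularity, and the idea that the unscaled scheme simply keeps the low 32 bits of the pointer. The real scheme may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Unscaled (hypothetical): keep the low 32 bits, which is just a
 * 32-bit register move on x86-64. */
static inline uint32_t compress_trunc(uint64_t ptr) {
    return (uint32_t)ptr;
}

/* Scaled (hypothetical): the offset from the base must be divided by the
 * alignment, which costs an extra subtract and shift on each compression. */
static inline uint32_t compress_shift(uint64_t base, uint64_t ptr) {
    assert(ptr >= base && (ptr - base) % 8 == 0); /* must be 8-byte aligned */
    return (uint32_t)((ptr - base) >> 3);
}
```

The scaled addressing modes only help on the way back to a full address; nothing comparable does the shift for free on the way in.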


Also we'll probably lose some cache benefits from compression due to larger alignment.



