Compilers are typically structured as a pipeline: in particular, instruction selection and register allocation are separate phases. GHC, for instance, uses maximal munch for instruction selection and offers a variety of register allocators. However, on x86-64 (for instance), the registers chosen by the allocator constrain which instruction encodings are available, which affects the cost of some instructions.

Some infelicities:

• In many cases, one can omit the REX prefix. However, this is not possible when an operand is one of r8-r15, so each such instruction is one byte longer.

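• One can use a two-byte VEX prefix in some cases, but the two-byte form has no room for the extension bit of a register in the ModRM r/m field (or of a base or index register), so an extended register such as xmm8-xmm15 in that position requires the three-byte VEX prefix.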

• The test instruction has a special, shorter encoding when its operands are rax and an immediate: the register is implied by the opcode, so no ModRM byte is needed. Using rax for such a test therefore saves one byte.

• The loop instruction decrements rcx and performs a (short) conditional jump if the result is nonzero, all in one instruction. It is usable only when the loop counter has been allocated to rcx.

• There are various constraints on register encodings in memory operands. In particular, when used as a base register, rbp and r13 must be encoded with a displacement (even if it is zero), and rsp and r12 always need a SIB byte. Thus if register allocation chooses rbp, r13, rsp, or r12 as a base, some encodings become a byte longer.
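The REX and addressing points above can be made concrete with a small sketch. It hand-encodes a single instruction form, a 32-bit load `mov dst32, [base]` (opcode 8B /r), and shows how the byte count changes with the registers chosen. The names `REGS` and `encode_load32` are hypothetical, invented for this illustration; this is not how GHC or any particular assembler is structured.

```python
# Hypothetical helper for illustration only; covers one instruction form.
# Register numbers as used in the ModRM/SIB fields; bit 3 goes into REX.
REGS = {name: num for num, name in enumerate(
    ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
     "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"])}


def encode_load32(dst: str, base: str) -> bytes:
    """Encode `mov dst32, [base]` (8B /r) where no displacement is wanted."""
    d, b = REGS[dst], REGS[base]
    out = bytearray()

    # With a 32-bit operand, REX is needed only to reach r8-r15:
    # REX.R extends the destination, REX.B extends the base.
    rex = 0x40 | ((d >> 3) << 2) | (b >> 3)
    if rex != 0x40:
        out.append(rex)

    out.append(0x8B)
    reg, rm = d & 7, b & 7
    if rm == 4:
        # rsp/r12: rm=100 means "SIB byte follows", so SIB is mandatory.
        out.append((0b00 << 6) | (reg << 3) | 0b100)
        out.append(0x24)  # scale=1, no index (100), base=100
    elif rm == 5:
        # rbp/r13: mod=00 with rm=101 means RIP-relative addressing,
        # so fall back to mod=01 with an explicit zero disp8.
        out.append((0b01 << 6) | (reg << 3) | 0b101)
        out.append(0x00)
    else:
        out.append((0b00 << 6) | (reg << 3) | rm)
    return bytes(out)
```

Under these assumptions, `encode_load32("rax", "rcx")` is two bytes (`8B 01`), using rbp or rsp as the base costs three, and `encode_load32("rax", "r13")` costs four (`41 8B 45 00`), since it pays for both the REX prefix and the forced displacement.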

All of these affect instruction length and thus code size, which in turn probably affects instruction-cache performance.
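As a rough way to put numbers on one of these cases, the sketch below encodes `test reg32, imm32` and picks the accumulator short form when it applies. It uses the 32-bit form for simplicity; the 64-bit form adds a REX.W prefix to both variants, so the one-byte difference is the same. `REGS32` and `encode_test_imm32` are hypothetical names made up for this example.

```python
import struct

# Hypothetical helper for illustration only.
# 32-bit register names in encoding order; index 0 (eax) is the accumulator.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi",
          "r8d", "r9d", "r10d", "r11d", "r12d", "r13d", "r14d", "r15d"]


def encode_test_imm32(reg: str, imm: int) -> bytes:
    """Encode `test reg32, imm32`, preferring the accumulator short form."""
    n = REGS32.index(reg)
    imm_bytes = struct.pack("<I", imm & 0xFFFFFFFF)  # little-endian imm32
    if n == 0:
        # A9 id: the register is implied by the opcode, so no ModRM byte.
        return b"\xA9" + imm_bytes
    # F7 /0 id: ModRM with mod=11, reg field 0, rm = low 3 bits of register.
    prefix = b"\x41" if n >= 8 else b""  # REX.B selects r8d-r15d
    modrm = 0xC0 | (n & 7)
    return prefix + b"\xF7" + bytes([modrm]) + imm_bytes
```

With this sketch, testing eax against an immediate takes 5 bytes, ecx takes 6, and r8d takes 7, so the same selected instruction varies by two bytes depending purely on what the register allocator picked.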