Compilers are written as a pipeline: in particular, instruction selection and register allocation are different phases. GHC, for instance, uses maximal munch for instruction selection and a variety of register allocators. However, on architectures such as x86-64, the registers chosen by the allocator constrain which instruction encodings are available, which affects the cost of some instructions.
In many cases, one can omit the REX prefix. However, this is not possible for instructions that use r8 through r15, which always need it.
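As a rough illustration of the REX cost, here is a minimal sketch that computes the length of two simple instructions. The helper names (needs_rex, push_len, mov32_len) are made up for this example, and the model deliberately ignores byte registers such as spl, which also force a REX prefix.

```python
# Hardware register numbers: rax=0 ... rdi=7, r8=8 ... r15=15.
REG_NUM = {name: i for i, name in enumerate(
    ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
     "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"])}


def needs_rex(regs, operand_64=False):
    """A REX prefix is needed for a 64-bit operand size (REX.W)
    or when any register is numbered 8-15 (REX.R/X/B).
    Simplification: byte registers like spl/sil are not modelled."""
    return operand_64 or any(REG_NUM[r] >= 8 for r in regs)


def push_len(reg):
    """Length in bytes of `push reg`: opcode 50+r, plus REX if required.
    push defaults to a 64-bit operand, so REX.W is not needed."""
    return 1 + (1 if needs_rex([reg]) else 0)


def mov32_len(dst, src):
    """Length in bytes of a 32-bit register-to-register mov:
    opcode 89 + ModRM, plus REX if either register is r8-r15.
    Registers are named by their 64-bit names for convenience."""
    return 2 + (1 if needs_rex([dst, src]) else 0)


if __name__ == "__main__":
    for reg in ("rax", "r8"):
        print("push", reg, "->", push_len(reg), "byte(s)")
    print("mov32 rax, rcx ->", mov32_len("rax", "rcx"), "bytes")
    print("mov32 r9, rcx  ->", mov32_len("r9", "rcx"), "bytes")
```

Under these assumptions, pushing rax takes one byte and pushing r8 takes two, so the allocator's choice shows up directly in the byte count.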
One can use a two-byte VEX prefix for certain registers, but others require a three-byte VEX prefix.
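The VEX rule can be sketched in the same way. The two-byte form has no room for the X and B extension bits, so a register numbered 8-15 in the ModRM r/m, base, or index position forces the three-byte form, as do the W bit and any opcode map other than 0F. The functions below are hypothetical helpers, with vaddps on xmm registers chosen only as a familiar example.

```python
def vex_prefix_len(rm_or_base=None, index=None, w=False, opcode_map=1):
    """Length of the VEX prefix in bytes.
    Registers are given by number (0-15). opcode_map=1 stands for the 0F map.
    The two-byte form cannot encode X, B, W, or a non-0F opcode map."""
    needs_b = rm_or_base is not None and rm_or_base >= 8
    needs_x = index is not None and index >= 8
    if needs_b or needs_x or w or opcode_map != 1:
        return 3
    return 2


def vaddps_rr_len(dst, src1, src2):
    """Length of `vaddps xmm<dst>, xmm<src1>, xmm<src2>`.
    dst goes in ModRM.reg and src1 in VEX.vvvv, both of which the
    two-byte prefix can extend; src2 goes in ModRM.rm, which it cannot."""
    return vex_prefix_len(rm_or_base=src2) + 1 + 1  # prefix + opcode + ModRM


if __name__ == "__main__":
    print(vaddps_rr_len(0, 1, 2))  # all low registers
    print(vaddps_rr_len(8, 9, 2))  # high dst and src1 still fit
    print(vaddps_rr_len(0, 1, 8))  # high src2 needs the longer prefix
```

In this model only the register placed in the r/m slot matters, so which operand receives a high register changes the length even when the set of registers used is the same.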
The test instruction has a special encoding for rax when the other operand is an immediate integer, so allocating rax specifically saves one byte.
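To make the one-byte saving concrete, the sketch below counts bytes for test with a 32-bit immediate. The short form is opcode A9 followed by the immediate; every other register uses F7 /0, which adds a ModRM byte. The function name len_test_imm32 is invented for this example.

```python
def len_test_imm32(reg, operand_64=True):
    """Length in bytes of `test reg, imm32`, with reg given by number
    (rax=0, rcx=1, ..., r8=8, ...)."""
    # REX is needed for a 64-bit operand size or for registers r8-r15.
    rex = 1 if (operand_64 or reg >= 8) else 0
    if reg == 0:
        return rex + 1 + 4      # A9 id: opcode + immediate
    return rex + 1 + 1 + 4      # F7 /0 id: opcode + ModRM + immediate


if __name__ == "__main__":
    print("test rax, imm32:", len_test_imm32(0))
    print("test rcx, imm32:", len_test_imm32(1))
    print("test eax, imm32:", len_test_imm32(0, operand_64=False))
    print("test ecx, imm32:", len_test_imm32(1, operand_64=False))
```

For 64-bit operands this gives six bytes for rax against seven for any other register.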
Similarly, jrcxz tests rcx and performs a (short) conditional jump in one instruction.
There are various constraints on register encodings when addressing. In particular, addressing through r13 must include a displacement, and addressing through r12 always needs a SIB byte. Thus if register allocation chooses r12 or r13 as a base register, some encodings become longer.
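The addressing constraints can be sketched as a byte count for a plain 64-bit load, mov rax, [base]. The rule depends only on the low three bits of the base register number, so rsp and rbp behave like r12 and r13 respectively (an addition here, not something stated above). The function name is hypothetical.

```python
def len_mov_load(base):
    """Length in bytes of `mov rax, [base]` with no displacement requested.
    base is the register number (rax=0, ..., rsp=4, rbp=5, ..., r12=12, r13=13)."""
    length = 1 + 1 + 1          # REX.W prefix + opcode 8B + ModRM
    low3 = base & 7
    if low3 == 4:
        length += 1             # rsp/r12: r/m=100 means "SIB follows"
    if low3 == 5:
        length += 1             # rbp/r13: mod=00 is taken, so add a zero disp8
    return length


if __name__ == "__main__":
    names = {1: "rcx", 8: "r8", 12: "r12", 13: "r13"}
    for num, name in names.items():
        print(f"mov rax, [{name}] -> {len_mov_load(num)} bytes")
```

With these assumptions, a load through rcx or r8 takes three bytes, while the same load through r12 or r13 takes four.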
All of these affect instruction length and thus code size, and in turn probably affect cache performance.