Compilers are written as a pipeline: in particular, instruction selection and register allocation are different phases. GHC, for instance, uses maximal munch for instruction selection and a variety of register allocators. However, on x86-64 and similar architectures, the registers chosen by the allocator constrain which instruction encodings are available, which affects the cost of some instructions.
Some infelicities:
- In many cases, one can omit the REX prefix. However, this is not possible for `r8`-`r15`.
- One can use a two-byte VEX prefix for certain registers, but others require a 3-byte VEX prefix.
- The `test` instruction has a special encoding for `rax` when testing against an immediate integer, so encoding `test` for `rax` specifically saves one byte.
- The `loop` instruction decrements `rcx` and performs a (short) conditional jump in one instruction.
- There are various constraints on register encodings when addressing. In particular, `rbp` and `r13` as a base register must include a displacement, and `rsp` and `r12` as a base register always need a SIB byte. Thus if register allocation chooses `rbp`, `r13`, `rsp`, or `r12`, then some encodings become longer.
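To make the REX and addressing constraints concrete, here is a minimal Python sketch that encodes a single instruction, `mov eax, dword ptr [base]`, for each possible base register. The function name `encode_load_eax` and the table `REG_NUM` are made up for this illustration, and the sketch covers only this one instruction form with no requested displacement. It is not a general assembler.

```python
# Hardware register numbers for the 64-bit general-purpose registers.
REG_NUM = {name: i for i, name in enumerate(
    ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
     "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"])}


def encode_load_eax(base: str) -> bytes:
    """Encode `mov eax, dword ptr [base]` (opcode 8B /r, reg field = eax = 0).

    Hypothetical helper for illustration only.
    """
    n = REG_NUM[base]
    low3 = n & 7
    out = bytearray()
    if n >= 8:
        # r8-r15 need a REX prefix; only the B bit is set, extending the base.
        out.append(0x41)
    out.append(0x8B)  # MOV r32, r/m32
    if low3 == 4:
        # rsp/r12: rm=100 is the "SIB byte follows" escape, so a SIB is required.
        out.append(0x04)  # ModRM: mod=00, reg=000, rm=100
        out.append(0x24)  # SIB: scale=00, index=100 (none), base=100
    elif low3 == 5:
        # rbp/r13: mod=00 with rm=101 means RIP-relative addressing,
        # so we fall back to mod=01 with an explicit zero disp8.
        out.append(0x45)  # ModRM: mod=01, reg=000, rm=101
        out.append(0x00)  # disp8 = 0
    else:
        out.append(low3)  # ModRM: mod=00, reg=000, rm=base
    return bytes(out)


if __name__ == "__main__":
    for reg in ["rcx", "rbp", "rsp", "r8", "r12", "r13"]:
        code = encode_load_eax(reg)
        print(f"mov eax, [{reg}]: {len(code)} bytes ({code.hex(' ')})")
```

In this sketch the same load takes 2 bytes through `rcx`, 3 bytes through `rbp`, `rsp`, or `r8`, and 4 bytes through `r12` or `r13`, where the REX prefix and the addressing constraint stack.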
All of these affect instruction length and thus code size, and in turn probably affect instruction-cache performance.
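As an illustration of how a single register choice changes instruction length, the following Python sketch encodes `test reg, imm32` for a 64-bit register, using the short accumulator form when the register is `rax`. The helper `encode_test_imm32` and the table `REG_NUM` are hypothetical names for this example, and only this one instruction form is handled.

```python
import struct

# Hardware register numbers for the 64-bit general-purpose registers.
REG_NUM = {name: i for i, name in enumerate(
    ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
     "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"])}


def encode_test_imm32(reg: str, imm: int) -> bytes:
    """Encode `test reg, imm32` for a 64-bit register (illustrative helper)."""
    n = REG_NUM[reg]
    imm_bytes = struct.pack("<i", imm)  # signed 32-bit, little-endian
    if n == 0:
        # Accumulator short form: REX.W + A9 id, with no ModRM byte.
        return bytes([0x48, 0xA9]) + imm_bytes
    rex = 0x48 | (1 if n >= 8 else 0)  # REX.W, plus REX.B for r8-r15
    modrm = 0xC0 | (n & 7)             # mod=11, reg=/0, rm=low 3 bits of reg
    # General form: REX.W + F7 /0 id.
    return bytes([rex, 0xF7, modrm]) + imm_bytes


if __name__ == "__main__":
    for reg in ["rax", "rcx", "r8"]:
        code = encode_test_imm32(reg, 1)
        print(f"test {reg}, 1: {len(code)} bytes ({code.hex(' ')})")
```

Here `test rax, 1` comes out at 6 bytes and `test rcx, 1` at 7, since the short form drops the ModRM byte. A 64-bit operand always carries a REX prefix, so `r8` costs no extra byte in this particular case.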