Functions are compiled to machine code by a convention of jumps and
registers; one calls a function by jumping to its
location in memory. These jumps are relative, and particular functions (say,
malloc) may be loaded at different memory locations, so the machine code for
a function cannot be pinned down; in fact, it is contingent on the location of
every function that it calls.
Storing compiled code persistently (and then reassembling it for the processor) is involved (see Ian Lance Taylor's blog series for details), but these details are unavoidable for language implementers, particularly for JIT compilation.
JIT compilation is left out of compiler textbooks, but the translation from assembly to machine code is not trivial.
On OS X or Linux, to assemble a call malloc instruction:
1. Allocate memory for the assembled function.
2. Open libc.so (with dlopen) to get a handle and pass that to dlsym to get a pointer to malloc. This relies on the operating system to load compiled code, and on us knowing that malloc is defined in libc.so.
3. Calculate the relative offset between malloc and the call instruction in the allocated memory and use this to assemble the instruction.
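The three steps above can be sketched in Python using ctypes, which wraps dlopen and dlsym. This is only an illustration under assumptions that are mine and not from the JIT linked in this post: an x86-64 Linux target, the 5-byte near call encoding (opcode E8 followed by a signed 32-bit offset measured from the end of the instruction), and the helper names symbol_address, assemble_call and assemble_call_malloc, which are made up for this sketch.

```python
import ctypes
import ctypes.util
import mmap
import struct


def symbol_address(library: str, symbol: str) -> int:
    """Step 2: open the library and look up the symbol's address."""
    handle = ctypes.CDLL(library)       # dlopen under the hood
    func = getattr(handle, symbol)      # dlsym under the hood
    return ctypes.cast(func, ctypes.c_void_p).value


def assemble_call(site: int, target: int) -> bytes:
    """Step 3: encode a near call placed at address `site`."""
    # The offset is relative to the address of the NEXT instruction,
    # and the call itself is 5 bytes long.
    rel = target - (site + 5)
    if not -2**31 <= rel < 2**31:
        raise OverflowError("target is out of range for a 32-bit relative call")
    return b"\xe8" + struct.pack("<i", rel)


def assemble_call_malloc() -> bytes:
    """Run all three steps against the real libc (Unix only)."""
    # Step 1: allocate memory that could later hold executable code.
    buf = mmap.mmap(-1, mmap.PAGESIZE,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    site = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    # Assumption: fall back to the usual glibc name if lookup fails.
    libc = ctypes.util.find_library("c") or "libc.so.6"
    code = assemble_call(site, symbol_address(libc, "malloc"))
    buf[:len(code)] = code
    return code
```

Calling assemble_call with the same target but two different sites gives two different byte strings, which is the whole difficulty. On a 64-bit system malloc may also sit more than 2 GB away from the allocated buffer, in which case this sketch raises OverflowError; a real JIT would then typically load the address into a register and call through it.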
For full details see my own simple JIT here.
Thus, a compiled function which calls other functions cannot be canonicalized into machine code: its bytes depend on the locations in memory of the called functions and on its own location in memory. In practice, finding the locations of other functions depends on system facilities, including the file system. As a result, compiling on one machine and distributing to others is quite fraught with existing operating systems and toolchains.
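One way to see why the bytes cannot be fixed ahead of time is to store the code with a placeholder and patch it once the addresses are known. The following is a rough sketch of that idea, loosely modeled on how object files record PC-relative fix-ups. The function link, the byte string UNLINKED, the list RELOCATIONS and every address used with them are hypothetical and chosen only for illustration.

```python
import struct


def link(code: bytes, relocations, load_address: int, symbols) -> bytes:
    """Patch each 4-byte call offset for one particular load address."""
    out = bytearray(code)
    for offset, name in relocations:
        # The offset field is 4 bytes; the CPU measures the jump
        # from the address just past that field.
        next_instruction = load_address + offset + 4
        rel = symbols[name] - next_instruction
        out[offset:offset + 4] = struct.pack("<i", rel)
    return bytes(out)


# Hypothetical stored function: call malloc (E8 + placeholder), then ret (C3).
UNLINKED = bytes([0xE8, 0, 0, 0, 0, 0xC3])
# The placeholder starts at byte 1 and refers to the symbol "malloc".
RELOCATIONS = [(1, "malloc")]
```

Linking UNLINKED at two different load addresses, or against two different addresses for malloc, produces different final bytes, so only the unlinked form plus its fix-up list is portable.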
Suppose one has an assembly function:
ncdf: ... call erf ...
This calls erf, which is defined in libm on Linux. We can assemble it
to produce an object file and then make this into a shared library, directing
the linker to libm (by passing -lm on the command line). Notably, this means that
call erf isn't just an instruction that stands on its own: it's also an
expectation that pieces will be in order at execution time so that the expected
erf code from libm will be called (including an expectation that -lm be passed
on the command line when dealing with the object file).
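The same expectation can be made explicit by resolving erf by hand at run time. In this sketch I assume that ncdf is the standard normal CDF, which the assembly fragment does not actually say, and the names resolve_erf and ncdf as Python functions are mine. The point is only that ncdf is useless until something supplies a working erf.

```python
import ctypes
import ctypes.util
import math


def resolve_erf(library=None):
    """Find erf in libm at run time, as the dynamic loader would."""
    # Assumption: fall back to the usual glibc name if lookup fails.
    name = library or ctypes.util.find_library("m") or "libm.so.6"
    libm = ctypes.CDLL(name)   # raises OSError if the library is missing
    erf = libm.erf             # raises AttributeError if the symbol is missing
    erf.argtypes = [ctypes.c_double]
    erf.restype = ctypes.c_double
    return erf


def ncdf(x: float, erf) -> float:
    """Assumed meaning of ncdf: the standard normal CDF, built on erf."""
    return 0.5 * (1.0 + erf(x / math.sqrt(2.0)))
```

On a typical Linux system, ncdf(0.0, resolve_erf()) should evaluate to 0.5. If libm cannot be found, the failure happens in resolve_erf and not in ncdf, which mirrors how a missing -lm shows up at link or load time instead of inside the function that contains call erf.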
The pipeline model of compilers taking functions to assembly and then
machine code is inaccurate. Indeed, making an executable often involves a build
system; cabalized Haskell code, for instance, uses the Cabal
build system to bundle the compiler's
call instructions with appropriate linker flags.
JIT compilation is also unduly left out of compilers material; implementing a JIT differs substantially from the experience of compiling to assembly and invoking assemblers and linkers to produce an executable.