Writing a practical JIT is somewhat complicated and in fact depends on the
assembler; here I present a full example in Haskell. Notably this JIT/assembler
is capable of calling procedures in system libraries (i.e. malloc
, free
)
First, we need to be able to allocate executable memory. Eli Bendersky has an explanation on how to do so in C; from there it is a straightforward translation to Haskell using the C FFI.
(This uses hsc2hs
; it handles
C macro constants better than c2hs
)
module Hs.FFI ( bsFp
, allocNear
, allocExec
, finish
) where
import Data.Bits ((.|.))
import Foreign.C.Types (CInt (..), CSize (..), CChar)
import Foreign.Ptr (FunPtr, IntPtr (..), castPtrToFunPtr, Ptr, intPtrToPtr, nullPtr)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Unsafe as BS
import System.Posix.Types (COff (..))
#include <sys/mman.h>
allocNear :: Int -> CSize -> IO (Ptr a)
allocNear i sz =
mmap (intPtrToPtr (IntPtr$i+61024104)) sz #{const PROT_WRITE} (#{const MAP_PRIVATE} .|. #{const MAP_ANONYMOUS}) (-1) 0
-- libc.so is 2.1MB, libm is 918kB
allocExec :: CSize -> IO (Ptr a)
allocExec sz =
mmap nullPtr sz #{const PROT_WRITE} (#{const MAP_32BIT} .|. #{const MAP_PRIVATE} .|. #{const MAP_ANONYMOUS}) (-1) 0
finish :: BS.ByteString -> Ptr CChar -> IO (FunPtr a)
finish bs fAt = BS.unsafeUseAsCStringLen bs $ \(b, sz) -> do
let sz' = fromIntegral sz
_ <- memcpy fAt b sz'
_ <- mprotect fAt sz' #{const PROT_EXEC}
pure (castPtrToFunPtr fAt)
bsFp :: BS.ByteString -> IO (FunPtr a, CSize)
bsFp bs = BS.unsafeUseAsCStringLen bs $ \(bytes, sz) -> do
let sz' = fromIntegral sz
fAt <- {-# SCC "mmap" #-} allocExec sz'
_ <- {-# SCC "memcpy" #-} memcpy fAt bytes sz'
_ <- {-# SCC "mprotect" #-} mprotect fAt sz' #{const PROT_EXEC}
pure (castPtrToFunPtr fAt, sz')
foreign import ccall mmap :: Ptr a -> CSize -> CInt -> CInt -> CInt -> COff -> IO (Ptr a)
foreign import ccall mprotect :: Ptr a -> CSize -> CInt -> IO CInt
foreign import ccall memcpy :: Ptr a -> Ptr a -> CSize -> IO (Ptr a)
This is a translation of the abovelinked blog post except for allocNear
. This
calls Linux's mmap
, requesting that allocated memory be near some address.
As we shall see, this is useful when assembling call
instructions to system
libraries.
The assembler design is standard; it has one pass to calculate addresses of labels and another to translates to machine code. Avoid the tardis monad approach; it is slow and we need an initial pass to calculate code size for allocation anyway.
The JIT assembler is somewhat intricate, but trivial:
Do a first pass over all instructions, noting the relative offset of any labels, and calculating the total length of the machine code. Also note any
call
s to libraries, e.g.malloc
.Open the system libraries if necessary.
malloc
, for instance, is inlibc.so
, so we usedlopen
to get a handle and then usedlsym
to get a pointer tomalloc
.Allocate memory using
allocNear
; request that the memory be nearmalloc
and use the size that we computed with the first pass. This function returns the absolute address of what will be the function itself.Assemble to machine code in a second pass. Since we have the absolute address of our own procedure and the absolute address of
malloc
(returned bydlsym
), we can compute the relative offset at anycall
instruction.We also fill in
jmp
s with relative addresses in this step.
To get addresses to malloc
etc. using dlopen
, dlsym
:
{-# LANGUAGE OverloadedStrings #-}
module Sys.DL ( libc, mem' ) where
import Data.Functor (($>))
import Foreign.C.Types (CSize)
import Foreign.Ptr (FunPtr, IntPtr (..), Ptr, castFunPtrToPtr, ptrToIntPtr)
import System.Posix.DynamicLinker.ByteString (DL, RTLDFlags (RTLD_LAZY), dlclose, dlopen, dlsym)
#include <gnu/lib-names.h>
mem' :: IO (Int, Int)
mem' = do {(m,f) <- mem; pure (g m, g f)}
where g = ((IntPtr i) -> i) . ptrToIntPtr . castFunPtrToPtr
mem :: IO (FunPtr (CSize -> IO (Ptr a)), FunPtr (Ptr a -> IO ()))
mem = do {c <- libc; m <- dlsym c "malloc"; f <- dlsym c "free"; dlclose c$>(m, f)}
ll p = dlopen p [RTLD_LAZY]
libc :: IO DL
libc = ll {# const LIBC_SO #}
(This uses c2hs
; hsc2hs
doesn't
work here)
Now consider:
mkIx :: Int -> [X86 X86Reg FX86Reg a] -> (Int, M.Map Label Int)
mkIx ix (Label _ l:asms) = second (M.insert l ix) $ mkIx ix asms
...
This is the first pass described above, it notes offsets of labels and gives the length of the machine code.
Then:
asm :: Int -> (Int, Maybe (Int, Int), M.Map Label Int) -> [X86 X86Reg FX86Reg a] -> [Word8]
asm ix st (J _ l:asms) =
let lIx = get l st
instr = let offs = lIx-ix-5 in 0xe9:cd (fromIntegral offs :: Int32)
in (instr ++) $ asm (ix+5) st asms
asm ix st@(self, Just (m, _), _) (Call _ Malloc:asms) | Just i32 <- mi32 (m-(self+ix+5)) =
let instr = 0xe8:le i32
in instr ++ asm (ix+5) st asms
...
This is the second pass, it takes offsets computed in the first phase and
absolute addresses of malloc
, free
and returns machine code.
Putting it all together:
pI :: Ptr a -> Int
pI = ((IntPtr i) -> i) . ptrToIntPtr
hasMa :: [X86 reg freg a] -> Bool
hasMa = any g where
g Call{} = True
g _ = False
prepAddrs :: [X86 reg freg a] -> IO (Maybe (Int, Int))
prepAddrs ss = if hasMa ss then Just <$> mem' else pure Nothing
allFp :: [X86 X86Reg FX86Reg a] -> IO (FunPtr b)
allFp instrs = do
let (sz, lbls) = mkIx 0 instrs
(fn, p) <- do
res <- prepAddrs instrs
case res of
Just (m, _) -> (res,) <$> allocNear m (fromIntegral sz)
_ -> (res,) <$> allocExec (fromIntegral sz)
let b = BS.pack$asm 0 (pI p, fn, lbls) instrs
finish b p
See the Apple compiler for full details.
Coda
To call the assembled function in Haskell, we can generate FFI wrappers with GHC:
foreign import ccall "dynamic" ff :: FunPtr (Double -> Double) -> Double -> Double
However, given that we are writing a JIT, we may wish to have more flexibility; in particular, we would like to be able to call functions whose type is unknown when we compile the compiler. Fortunately libffi solves exactly this problem and it has Haskell bindings.
Suppose we have machine code for a procedure that returns
a double-precision float, stored in a function pointer fp
. Then:
callFFI fp retCDouble []