Compiler Performance in Haskell

by Vanessa McHale | 2021-01-05 17:09

I stumbled across a comment by András Kovács on compiler performance, which brought up some of the difficulties writing adequate compilers in lower-level languages such as C++ or Rust.

Here, I'd like to report my experience with my Kempe toy compiler, which is written in Haskell.

I benchmarked compilation of the splitmix pseudorandom number generator:

typedef unsigned long int __uint64_t;
typedef __uint64_t uint64_t;
// modified to have ""multiple return"" (destination-passing style)
uint64_t next(uint64_t x, uint64_t* y) {
    uint64_t z = (x += 0x9e3779b97f4a7c15);
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9;
    z = (z ^ (z >> 27)) * 0x94d049bb133111eb;
    *y = x;
    return z ^ (z >> 31);
}

First, the C compilers (alongside kc, the Kempe compiler):

benchmarking bench/gcc-8 -fsyntax-only benchmarks/splitmix64.c
time                 11.16 ms   (9.301 ms .. 13.16 ms)
                     0.796 R²   (0.644 R² .. 0.902 R²)
mean                 11.37 ms   (10.04 ms .. 12.50 ms)
std dev              3.028 ms   (2.518 ms .. 3.741 ms)
variance introduced by outliers: 89% (severely inflated)
benchmarking bench/clang-11 -fsyntax-only benchmarks/splitmix64.c
time                 14.26 ms   (14.03 ms .. 14.53 ms)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 14.34 ms   (14.25 ms .. 14.49 ms)
std dev              285.7 μs   (198.2 μs .. 404.4 μs)
benchmarking bench/kc typecheck examples/splitmix.kmp
time                 4.915 ms   (4.687 ms .. 5.105 ms)
                     0.964 R²   (0.904 R² .. 0.993 R²)
mean                 4.597 ms   (4.321 ms .. 4.721 ms)
std dev              537.5 μs   (210.8 μs .. 1.013 ms)
variance introduced by outliers: 71% (severely inflated)
benchmarking bench/icc -O0 -c benchmarks/splitmix64.c
time                 24.55 ms   (20.47 ms .. 28.48 ms)
                     0.902 R²   (0.786 R² .. 0.968 R²)
mean                 31.23 ms   (28.70 ms .. 35.11 ms)
std dev              6.442 ms   (4.370 ms .. 10.15 ms)
variance introduced by outliers: 75% (severely inflated)
benchmarking bench/gcc-8 -O0 -c benchmarks/splitmix64.c
time                 20.61 ms   (17.52 ms .. 24.59 ms)
                     0.903 R²   (0.833 R² .. 0.970 R²)
mean                 19.76 ms   (18.48 ms .. 21.07 ms)
std dev              3.250 ms   (2.695 ms .. 4.119 ms)
variance introduced by outliers: 70% (severely inflated)
benchmarking bench/clang-11 -O0 -c benchmarks/splitmix64.c
time                 16.24 ms   (16.12 ms .. 16.36 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 16.25 ms   (16.19 ms .. 16.33 ms)
std dev              168.8 μs   (125.5 μs .. 228.4 μs)
benchmarking bench/kc examples/splitmix.kmp splitmix.o
time                 9.441 ms   (6.886 ms .. 12.19 ms)
                     0.711 R²   (0.521 R² .. 0.884 R²)
mean                 10.18 ms   (8.969 ms .. 11.18 ms)
std dev              3.025 ms   (2.601 ms .. 3.538 ms)
variance introduced by outliers: 93% (severely inflated)

On the frontend, the C compilers are unimpressive. Kempe does not have operator overloading, but that is not enough to explain why gcc and clang take 2-3 times as long. icc is questionable; icc -fsyntax-only isn't any faster than icc -O0.

On the backend, kc holds its own: it's clearly faster than gcc, icc, and clang; the consolation is that the C compilers hopefully produce better code. In any case it seems plausible that a compiler written in a Haskell could compete with compilers written in C++.

benchmarking bench/rustc --crate-type=lib --emit=dep-info,metadata benchmarks/splitmix.rs
time                 77.11 ms   (76.79 ms .. 77.53 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 77.01 ms   (76.82 ms .. 77.26 ms)
std dev              370.4 μs   (236.4 μs .. 540.1 μs)
benchmarking bench/~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/rustc --crate-type=lib -Z no-codegen benchmarks/splitmix.rs
time                 23.78 ms   (20.66 ms .. 26.90 ms)
                     0.914 R²   (0.831 R² .. 0.974 R²)
mean                 23.52 ms   (22.12 ms .. 25.99 ms)
std dev              4.073 ms   (2.490 ms .. 5.833 ms)
variance introduced by outliers: 72% (severely inflated)
benchmarking bench/rustc --crate-type=lib benchmarks/splitmix.rs
time                 80.87 ms   (79.95 ms .. 81.43 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 81.10 ms   (80.76 ms .. 82.41 ms)
std dev              984.7 μs   (249.9 μs .. 1.660 ms)
benchmarking bench/rustc --crate-type=cdylib benchmarks/splitmix.rs
time                 157.2 ms   (156.4 ms .. 158.9 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 157.6 ms   (157.0 ms .. 158.6 ms)
std dev              1.229 ms   (704.6 μs .. 1.798 ms)
variance introduced by outliers: 12% (moderately inflated)

rustc presumably does more work than the C compilers, but we do see that creating dynamic libraries (--crate-type=cdylib) takes too long.

Also, for some reason it's impossible to only do typechecking without nightly.

Conclusion

The Kempe toy compiler is around 4000 lines, and depends mainly on the containers library, i.e. off-the-shelf data structures.

There's a lot of low-hanging fruit for compilers written in C++. My hope would be that we see more use of functional languages in the field of compilers, and more demand for performant compilers.

Appendix

Here is the Kempe source code:

; given a seed, return a random value and the new seed
next : Word -- Word Word
     =: [ 0x9e3779b97f4a7c15u +~ dup
          dup 30i8 >>~ xoru 0xbf58476d1ce4e5b9u *~
          dup 27i8 >>~ xoru 0x94d049bb133111ebu *~
          dup 31i8 >>~ xoru
        ]
%foreign kabi next

and the Rust source code:

#[no_mangle]
pub extern "C" fn next(x: u64) -> (u64, u64) {
    let next_seed = x + 0x9e3779b97f4a7c15;
    let mut z = next_seed;
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9;
    z = (z ^ (z >> 27)) * 0x94d049bb133111eb;
    return (z ^ (z >> 31), next_seed);
}

return

blog

Compiler Performance in Haskell

Conclusion

Appendix