Suppose we want to count the number of lines in a file. Rust has a crate that would seem to help us, namely, bytecount.

Though this crate is heavily optimized, we can beat its performance with a simple for* loop fed to GCC 9.

Here is the ATS code corresponding to bytecount::count (the function in question is count_lines_for_loop.

staload UN = "prelude/SATS/unsafe.sats"

fn byteview_read_as_char {l0:addr}{m:nat}{ l1 : addr | l1 <= l0+m }(pf : !bytes_v(l0, m) | p : ptr(l1)) : char = $UN.ptr0_get<char>(p)

fn count_lines_for_loop { l : addr | l != null }{m:nat}{ n : nat | n <= m }( pf : !bytes_v(l, m) | ptr : ptr(l), bufsz : size_t(n)) : int = let var res: int = 0 var i: size_t val () = for* { i : nat | i <= m } .<i>. (i : size_t(i)) => (i := bufsz ; i != 0 ; i := i - 1) (let var current_char = byteview_read_as_char(pf | add_ptr_bsz(ptr, i)) in case+ current_char of | '\n' => res := res + 1 | _ => () end) var current_char = byteview_read_as_char(pf | ptr) val () = case+ current_char of | '\n' => res := res + 1 | _ => () in res end

I benchmarked this by using count_lines_for_loop and bytecount::count to count the number of lines in the SQLite amalgamation:

bytecount (sqlite3.c) time: [709.61 us 712.62 us 715.75 us]

sqlite.c (for loop) estimate: 617.200826 μs sqlite.c (bytecount) estimate: 700.684214 μs

As you can see, count_lines_for_loop outperforms bytecount::count.

The full benchmarking code is available here.

Conclusions

First, ATS is suitable for fast code; though patscc frequently produces impenetrable C, GCC is able to optimize it just fine.

Second, GCC is impressive - though bytecount::count is heavily optimized GCC manages to do slightly better.

One note: though bytecount makes use of unsafe, the above code does not! So it has all the correctness guaranteed by GCC.