Suppose we want to count the number of lines in a file. Rust has a crate that would seem to help us, namely, bytecount.

Though this crate is heavily optimized, we can beat its performance with a simple for* loop fed to GCC 9.

Here is the ATS code corresponding to bytecount::count (the function in question is count_lines_for_loop.

staload UN = "prelude/SATS/unsafe.sats"fn byteview_read_as_char {l0:addr}{m:nat}{ l1 : addr | l1 <= l0+m }(pf : !bytes_v(l0, m) | p : ptr(l1)) : char =
\$UN.ptr0_get<char>(p)fn count_lines_for_loop { l : addr | l != null }{m:nat}{ n : nat | n <= m }( pf : !bytes_v(l, m)
| ptr : ptr(l), bufsz : size_t(n)) : int =
let
var res: int = 0
var i: size_t
val () = for* { i : nat | i <= m } .<i>. (i : size_t(i)) =>
(i := bufsz ; i != 0 ; i := i - 1)
(let
in
case+ current_char of
| '\n' => res := res + 1
| _ => ()
end)
var current_char = byteview_read_as_char(pf | ptr)
val () = case+ current_char of
| '\n' => res := res + 1
| _ => ()
in
res
end

I benchmarked this by using count_lines_for_loop and bytecount::count to count the number of lines in the SQLite amalgamation:

bytecount (sqlite3.c)   time:   [709.61 us 712.62 us 715.75 us]sqlite.c (for loop)
estimate: 617.200826 μs
sqlite.c (bytecount)
estimate: 700.684214 μs

As you can see, count_lines_for_loop outperforms bytecount::count.

The full benchmarking code is available here.

# Conclusions

First, ATS is suitable for fast code; though patscc frequently produces impenetrable C, GCC is able to optimize it just fine.

Second, GCC is impressive - though bytecount::count is heavily optimized GCC manages to do slightly better.

One note: though bytecount makes use of unsafe, the above code does not! So it has all the correctness guaranteed by GCC.