Suppose we want to count the number of lines in a file. Rust has a crate that would seem to help us, namely, bytecount.
Though this crate is heavily optimized, we can beat its
performance with a simple for*
loop fed to GCC 8.
Here is the ATS code corresponding to bytecount::count
(the function in
question is count_lines_for_loop
.
staload UN = "prelude/SATS/unsafe.sats"
fn byteview_read_as_char {l0:addr}{m:nat}{ l1 : addr | l1 <= l0+m }(pf : !bytes_v(l0, m) | p : ptr(l1)) : char =
$UN.ptr0_get<char>(p)
fn count_lines_for_loop { l : addr | l != null }{m:nat}{ n : nat | n <= m }( pf : !bytes_v(l, m)
| ptr : ptr(l), bufsz : size_t(n)) : int =
let
var res: int = 0
var i: size_t
val () = for* { i : nat | i <= m } .<i>. (i : size_t(i)) =>
(i := bufsz ; i != 0 ; i := i - 1)
(let
var current_char = byteview_read_as_char(pf | add_ptr_bsz(ptr, i))
in
case+ current_char of
| '\n' => res := res + 1
| _ => ()
end)
var current_char = byteview_read_as_char(pf | ptr)
val () = case+ current_char of
| '\n' => res := res + 1
| _ => ()
in
res
end
I benchmarked this by using count_lines_for_loop
and bytecount::count
to
count the number of lines in the SQLite amalgamation:
bytecount (sqlite3.c) time: [709.61 us 712.62 us 715.75 us]
sqlite.c (for loop)
estimate: 617.200826 μs
sqlite.c (bytecount)
estimate: 700.684214 μs
As you can see, count_lines_for_loop
outperforms bytecount::count
.
The full benchmarking code is available here.
Conclusions
First, ATS is suitable for fast code; though patscc
frequently produces
impenetrable C, GCC is able to optimize it just fine.
Second, GCC is impressive - though bytecount::count
is heavily
optimized GCC manages to do
slightly better.
One note: though bytecount
makes use of unsafe
, the above code does not! So
it has all the correctness guaranteed by GCC.