As a follow-up to my post on computing the Levenshtein distance in ATS, I figured I'd write up some of the actual benchmark results, as well as some of the subtleties involved in benchmarking various ecosystems.
## Benchmarking Libraries
### Haskell

For benchmarking Haskell, I use criterion, which has become the de facto standard. It uses linear regression to benchmark performance and accounts for garbage collection.
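As a rough illustration, a criterion harness for this task might look like the following. This is a minimal sketch assuming the `levenshteinDistance` function from the edit-distance package; the actual harnesses are the ones in the repository linked below.

```haskell
import Criterion.Main
import Text.EditDistance (defaultEditCosts, levenshteinDistance)

main :: IO ()
main = defaultMain
  [ bench "levenshteinDistance" $
      -- nf forces the result to normal form, so laziness can't
      -- let the benchmark measure an unevaluated thunk.
      nf (levenshteinDistance defaultEditCosts "exclude") "excude"
  ]
```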
### ATS
There is little support for benchmarking ATS, and thus I had to write my own benchmarking library. The library I ended up with is limited compared to more mature solutions in Rust or Haskell, but it does give accurate results. Like criterion, it uses linear regression to make meaningful measurements that account for garbage collection.
Since ATS compiles to C, I was able to benchmark the C code using the same benchmarking library.
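To give a flavor of the regression-based approach, here is a toy sketch of the general technique (in Haskell, for brevity; this is not the ATS library's code): run the workload in batches of increasing size, record (iterations, total time) pairs, and fit a line. The slope estimates the per-iteration cost, while fixed overhead lands in the intercept instead of biasing the result.

```haskell
import Control.Exception (evaluate)
import Data.Time.Clock (diffUTCTime, getCurrentTime)
import Text.EditDistance (defaultEditCosts, levenshteinDistance)

-- (iterations, total wall-clock seconds) for n applications of f.
-- Feeding the loop index to f keeps the compiler from evaluating
-- one shared thunk once and reusing it across the whole loop.
timeBatch :: Int -> (Int -> b) -> IO (Double, Double)
timeBatch n f = do
  start <- getCurrentTime
  mapM_ (evaluate . f) [1 .. n]
  end <- getCurrentTime
  pure (fromIntegral n, realToFrac (diffUTCTime end start))

-- Least-squares slope of total time against iteration count: an
-- estimate of the per-iteration cost, with constant overhead
-- (timer calls, loop setup) absorbed by the intercept.
slope :: [(Double, Double)] -> Double
slope pts =
    sum [(x - mx) * (y - my) | (x, y) <- pts]
  / sum [(x - mx) ^ (2 :: Int) | (x, _) <- pts]
  where
    mx = sum (map fst pts) / n
    my = sum (map snd pts) / n
    n  = fromIntegral (length pts)

main :: IO ()
main = do
  -- The branch on i prevents the result from being floated out
  -- and shared between iterations under optimization.
  let work i = levenshteinDistance defaultEditCosts "exclude"
                 (if i < 0 then "x" else "excude")
  samples <- mapM (`timeBatch` work) [10000, 20000 .. 100000]
  putStrLn ("per-iteration cost (s): " ++ show (slope samples))
```

criterion and criterion-rs refine this idea with bootstrap resampling, outlier detection, and garbage-collection statistics; none of that is attempted in this sketch.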
### Rust
I used a Rust port of criterion, viz. criterion-rs.
## Results
These are presented with the rather large caveat that the ATS and C versions use libgc, and thus may be faster in practice.
The task was finding the Levenshtein distance between "exclude" and "excude" (a distance of 1: a single deleted 'l'). You can find the various implementations and benchmarking harnesses here.
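For reference, here is a naive Wagner-Fischer style implementation of the quantity being measured, sketched in Haskell. The benchmarked implementations are the ones linked above, not this one.

```haskell
-- Naive dynamic-programming Levenshtein distance, O(m*n) time,
-- keeping only one row of the table at a time.
levenshtein :: Eq a => [a] -> [a] -> Int
levenshtein xs ys = last (foldl next [0 .. length xs] ys)
  where
    next row y = scanl step (head row + 1) (zip3 xs row (tail row))
      where
        step left (x, diag, up) =
          minimum
            [ left + 1                  -- insertion
            , up + 1                    -- deletion
            , diag + fromEnum (x /= y)  -- substitution, free if equal
            ]

main :: IO ()
main = print (levenshtein "exclude" "excude")  -- prints 1
```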
| Language | Library        | Unicode | Time     |
|----------|----------------|---------|----------|
| ATS      | edit-distance  |         | 67.61 ns |
| Rust     | levenshtein-rs | X       | 137.4 ns |
| Rust     | strsim         | X       | 140.4 ns |
| Haskell  | edit-distance  | X       | 413.5 ns |
| C        | n/a            |         | 93.70 ns |
I was impressed with Rust here: the ATS implementation makes no effort to support Unicode, so the fact that the Rust implementations take twice as long is defensible. However, I suspect that improving on this performance while maintaining safety would be impossible; in this regard ATS stands head and shoulders above its competitors.