One common oversight in Haskell compilers is failing to intern identifiers
using Int
s and failing to prefer IntMap
s and IntSet
s. The PureScript compiler,
for instance, uses Map
s as of writing.
I always do the following:
newtype Unique = Unique { unUnique :: Int }
data Name = Name { name :: T.Text
, unique :: !Unique
...
}
instance Eq (Name a) where
(==) (Name _ u) (Name _ u') = u == u'
instance Ord (Name a) where
compare (Name _ u) (Name _ u') = compare u u'
Note that making the name
field lazy makes things faster.
Using Unique
s makes equality and comparisons faster; moreover, one can use
Name
s as keys in IntMap
s and IntSet
s, which have more efficient
implementations. I came across the dramatic performance advantage in my Kempe compiler; liveness analysis is drastically faster with IntSet
s compared to Set
s.