This blogpost is written in reproducible Literate Haskell, so we need some imports first.
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveTraversable #-}
module Main where
import qualified Codec.Picture as JP
import qualified Codec.Picture.Types as JP
import Control.Monad.ST (runST)
import Data.Bool (bool)
import Data.Foldable (for_)
import Data.List (isSuffixOf, partition)
import Data.List.NonEmpty (NonEmpty (..))
import System.Environment (getArgs)
import System.Random (RandomGen, newStdGen)
import System.Random.Stateful (randomM, runStateGen)
import Text.Read (readMaybe)
Haskell is not my only interest — I have also been quite into photography for the past decade. Recently, I was considering moving some of the stuff I have on various social networks to a self-hosted solution.
Tumblr in particular has a fairly nice way to do photo sets, where these can be organised in rows and columns. I wanted to see if I could mimic this in a recursive way, where rows and columns can be subdivided further.
One important constraint is that we want to present each picture as the photographer envisioned it: concretely, we can scale it up or down (preserving the aspect ratio), but we can’t crop out parts.
Order is also important in photo essays, so we want the author to specify the photo collage in a declarative way by indicating if horizontal (H) or vertical (V) subdivision should be used, creating a tree. For example:
H img1.jpg
(V img2.jpg
(H img3.jpg
img4.jpg))
The program should then determine the exact size and position of each image, so that we get a fully filled rectangle without any borders or filler:
We will use a technique called circular programming that builds on Haskell’s laziness to achieve this in an elegant way. These days, it is maybe more commonly referred to as the repmin problem. This was first described by Richard S. Bird in “Using circular programs to eliminate multiple traversals of data” in 1984, which predates Haskell!
Given a simple tree type:
data Tree a
    = Leaf a
    | Branch (Tree a) (Tree a)
We would like to write a function repmin which replaces each value in each Leaf with the global minimum in the tree. This is easily done by first finding the global minimum, and then replacing it everywhere:
repmin_2pass :: Ord a => Tree a -> Tree a
repmin_2pass t = let globalmin = findmin t in rep globalmin t
  where
    findmin (Leaf x)     = x
    findmin (Branch l r) = min (findmin l) (findmin r)

    rep x (Leaf _)     = Leaf x
    rep x (Branch l r) = Branch (rep x l) (rep x r)
However, this requires two passes over the tree. We can do better by using Haskell’s laziness:
repmin_1pass :: Ord a => Tree a -> Tree a
repmin_1pass t = t'
  where
    (t', globalmin) = repmin t

    repmin (Leaf x)     = (Leaf globalmin, x)
    repmin (Branch l r) =
        let (l', lmin) = repmin l
            (r', rmin) = repmin r in
        (Branch l' r', min lmin rmin)
There is an apparent circular dependency here, where repmin uses globalmin, but also computes it. This is possible because we never need to evaluate globalmin while building the tree – it can be stored as a thunk. For more details, please see the very accessible original paper (https://doi.org/10.1007/BF00264249).
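As a quick sanity check of the one-pass version, here is a small self-contained sketch (restating the Tree and repmin_1pass definitions from above) applied to an example tree:

```haskell
data Tree a
    = Leaf a
    | Branch (Tree a) (Tree a)
    deriving (Eq, Show)

repmin_1pass :: Ord a => Tree a -> Tree a
repmin_1pass t = t'
  where
    -- The circular part: globalmin is used by repmin before it is "known".
    (t', globalmin) = repmin t

    repmin (Leaf x)     = (Leaf globalmin, x)
    repmin (Branch l r) =
        let (l', lmin) = repmin l
            (r', rmin) = repmin r in
        (Branch l' r', min lmin rmin)

example :: Tree Int
example = Branch (Leaf 3) (Branch (Leaf 1) (Leaf 2))

-- repmin_1pass example == Branch (Leaf 1) (Branch (Leaf 1) (Leaf 1))
```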
We start out by giving an elegant algebraic definition for a collage:
data Collage a
    = Singleton a
    | Horizontal (Collage a) (Collage a)
    | Vertical (Collage a) (Collage a)
    deriving (Foldable, Functor, Show, Traversable)
We use a higher-order type, which allows us to work with collages of file paths as well as actual images (among other things). deriving instructs the compiler to generate some boilerplate code for us. This allows us to concisely read all images using traverse:
readCollage :: Collage FilePath -> IO (Collage (JP.Image JP.PixelRGB8))
readCollage = traverse $ \path ->
    JP.readImage path >>= either fail (pure . JP.convertRGB8)
We use the JuicyPixels library to read and write images. The image type in this library can be a bit verbose since it is parameterised around the colour space.
During the layout pass, we don’t really care about this complexity. We only need the relative sizes of the images and not their content. We introduce a typeclass to do just that:
data Size = Sz
    { szWidth  :: Rational
    , szHeight :: Rational
    } deriving (Show)

class Sized a where
    -- | Retrieve the width and height of an image.
    -- Both numbers must be strictly positive.
    sizeOf :: a -> Size
We use the Rational type for width and height. We are only subdividing the 2D space, so we do not need irrational numbers, and having infinite precision is convenient.
The instance for the JuicyPixels image type is simple:
instance Sized (JP.Image p) where
    sizeOf img = Sz
        { szWidth  = fromIntegral $ JP.imageWidth img
        , szHeight = fromIntegral $ JP.imageHeight img
        }
If we look at the finished image, it may seem like a hard problem to find a configuration that fits all the images with a correct aspect ratio.
But we can use induction to arrive at a fairly straightforward solution. Given two images, it is always possible to put them beside or above each other by scaling them up or down to match them in height or width respectively. This creates a bigger image. We can then repeat this process until just one image is left.
However, this is quite a naive approach since we end up making way too many copies, and the repeated resizing could also result in a loss of resolution. We would like to compute the entire layout first, and then render everything in one go. Still, we can start by formalising what happens for two images and then work our way up.
We can represent the layout of an individual image by its position and size. We use simple (x, y) coordinates for the position and a scaling factor (relative to the original size of the image) for its size.
data Transform = Tr
    { trX     :: Rational
    , trY     :: Rational
    , trScale :: Rational
    } deriving (Show)
Armed with the Size and Transform types, we have enough to tackle the “mathy” bits.
Let’s look at the horizontal case first and write a function that computes a transform for both left and right images, as well as the size of the result.
horizontal :: Size -> Size -> (Transform, Transform, Size)
horizontal (Sz lw lh) (Sz rw rh) =
We want to place image l beside image r, producing a nicely filled rectangle. Intuitively, we should be matching the height of both images. There are different ways to do this — we could shrink the taller image, enlarge the shorter image, or something in between. We make a choice to always shrink the taller image, as this doesn’t compromise the sharpness of the result.
    let height = min lh rh
        lscale = height / lh
        rscale = height / rh
        width  = lscale * lw + rscale * rw in
With the scale for both left and right images, we can compute the left and right transforms. The left image is simply placed at (0, 0) and we need to offset the right image depending on the (scaled) size of the left image.
    ( Tr 0 0 lscale
    , Tr (lscale * lw) 0 rscale
    , Sz width height
    )
Composing images vertically is similar, just matching the widths rather than the heights of the two images and moving the bottom image below the top one:
vertical :: Size -> Size -> (Transform, Transform, Size)
vertical (Sz tw th) (Sz bw bh) =
    let width  = min tw bw
        tscale = width / tw
        bscale = width / bw
        height = tscale * th + bscale * bh in
    ( Tr 0 0 tscale
    , Tr 0 (tscale * th) bscale
    , Sz width height
    )
Now that we’ve solved the problem of combining two images and placing them, we can apply this to our tree of images. To this end, we need to compose multiple transformations.
Whenever we think about composing things in Haskell, it’s good to ask ourselves if what we’re trying to compose is a Monoid. A Monoid needs an identity element (mempty) and a Semigroup instance, the latter of which contains just an associative binary operator (<>).
The identity transform is just offsetting by 0 and scaling by 1:
instance Monoid Transform where
    mempty = Tr 0 0 1
Composing two transformations using <> requires a bit more thinking. In this case, a <> b means applying transformation a after transformation b, so we will need to apply the scale of b to all parts of a:
instance Semigroup Transform where
    Tr ax ay as <> Tr bx by bs =
        Tr (ax * bs + bx) (ay * bs + by) (as * bs)
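A small self-contained check of this composition rule (restating Transform and its instances, with Eq derived for comparison): composing a translation-and-scale with a half-scale transform scales both the offset and the scale factor.

```haskell
data Transform = Tr Rational Rational Rational deriving (Eq, Show)

instance Semigroup Transform where
    Tr ax ay as <> Tr bx by bs =
        Tr (ax * bs + bx) (ay * bs + by) (as * bs)

instance Monoid Transform where
    mempty = Tr 0 0 1

-- The offset (10, 20) is scaled by b's scale 1/2 and shifted by (5, 5);
-- the scales multiply:
-- Tr 10 20 2 <> Tr 5 5 (1/2) == Tr 10 15 1
```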
Readers who are familiar with linear algebra may recognise the connection to a sort of restricted affine 2D transformation matrix.
Proving that the identity law holds for mempty is simple, so we will only do one side, namely a <> mempty == a.
  Tr ax ay as <> mempty
-- Definition of mempty
= Tr ax ay as <> Tr 0 0 1
-- Definition of <>
= Tr (ax * 1 + 0) (ay * 1 + 0) (as * 1)
-- Cancellative property of 0 over +
-- Identity of 1 over *
= Tr ax ay as
Next, we want to prove that the <> operator is associative, meaning a <> (b <> c) == (a <> b) <> c.
  Tr ax ay as <> (Tr bx by bs <> Tr cx cy cs)
-- Definition of <>
= Tr (ax * (bs * cs) + (bx * cs + cx))
     (ay * (bs * cs) + (by * cs + cy))
     (as * (bs * cs))
-- Associativity of * and +
= Tr (ax * bs * cs + bx * cs + cx)
     (ay * bs * cs + by * cs + cy)
     ((as * bs) * cs)
-- Distributivity of * over +
= Tr ((ax * bs + bx) * cs + cx)
     ((ay * bs + by) * cs + cy)
     ((as * bs) * cs)
-- Definition of <>
= (Tr ax ay as <> Tr bx by bs) <> Tr cx cy cs
Now that we have a valid Monoid instance, we can use the higher-level <> and mempty concepts in our core layout algorithm, rather than worrying over details like (x, y) coordinates and scaling factors.
Our main layoutCollage function takes the user-specified tree as input, and annotates each element with a Transform. In addition to that, we also produce the Size of the final image so we can allocate space for it.
layoutCollage :: Sized img
    => Collage img
    -> (Collage (img, Transform), Size)
All layoutCollage does is call layout — our circular program — with the identity transformation:

layoutCollage = layout mempty
layout takes the size and position of the current element as an argument, and determines the sizes and positions of a tree recursively.
There are some similarities with the algorithms present in browser engines, where a parent element will first lay out its children, and then use their properties to determine its own width.
However, we will use Haskell’s laziness to do this in a single top-down pass. We provide a declarative algorithm and we leave the decision about what to calculate when — more concretely, propagating the requested sizes of the children back up the tree before constructing the transformations — to the compiler!
layout :: Sized img
    => Transform
    -> Collage img
    -> (Collage (img, Transform), Size)
Placing a single image is easy, since we are receiving the transformation directly as an argument. We return the requested size — which is just the original size of the image. This is an important detail in making the laziness work here: if we tried to return the final size (including the passed in transformation) rather than the requested size, the computation would diverge (i.e. recurse infinitely).
layout trans (Singleton img) = (Singleton (img, trans), sizeOf img)
In the recursive case for horizontal composition, we call the horizontal helper we defined earlier with the left and right image sizes as arguments. This gives us both transformations, which we can then pass in as arguments to layout again – returning the left and right image sizes we pass in to the horizontal helper, forming our apparent circle.
layout trans (Horizontal l r) =
    (Horizontal l' r', size)
  where
    (l', lsize) = layout (ltrans <> trans) l
    (r', rsize) = layout (rtrans <> trans) r
    (ltrans, rtrans, size) = horizontal lsize rsize
The same happens for the vertical case:
layout trans (Vertical t b) =
    (Vertical t' b', size)
  where
    (t', tsize) = layout (ttrans <> trans) t
    (b', bsize) = layout (btrans <> trans) b
    (ttrans, btrans, size) = vertical tsize bsize
It’s worth thinking about why this works: the intuitive explanation is that we can “delay” the execution of the transformations until the very end of the computation, and then fill them in everywhere. This works since no other parts of the algorithm depend on the transformation, only on the requested sizes.
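To see the circular pass run end-to-end without any image files, here is a condensed, self-contained sketch of the definitions above, with a toy Sized Size instance (not in the original program, which lays out JuicyPixels images) so we can lay out bare sizes, and Eq derived for checking:

```haskell
data Size = Sz Rational Rational deriving (Eq, Show)
data Transform = Tr Rational Rational Rational deriving (Eq, Show)

instance Semigroup Transform where
    Tr ax ay as <> Tr bx by bs = Tr (ax * bs + bx) (ay * bs + by) (as * bs)
instance Monoid Transform where
    mempty = Tr 0 0 1

data Collage a
    = Singleton a
    | Horizontal (Collage a) (Collage a)
    | Vertical (Collage a) (Collage a)
    deriving (Eq, Show)

class Sized a where sizeOf :: a -> Size
-- Toy instance for the demo: a "picture" is just its size.
instance Sized Size where sizeOf = id

horizontal, vertical :: Size -> Size -> (Transform, Transform, Size)
horizontal (Sz lw lh) (Sz rw rh) =
    let height = min lh rh
        lscale = height / lh
        rscale = height / rh in
    ( Tr 0 0 lscale
    , Tr (lscale * lw) 0 rscale
    , Sz (lscale * lw + rscale * rw) height
    )
vertical (Sz tw th) (Sz bw bh) =
    let width  = min tw bw
        tscale = width / tw
        bscale = width / bw in
    ( Tr 0 0 tscale
    , Tr 0 (tscale * th) bscale
    , Sz width (tscale * th + bscale * bh)
    )

-- The circular program: transformations flow down, sizes flow up.
layout :: Sized img => Transform -> Collage img -> (Collage (img, Transform), Size)
layout trans (Singleton img) = (Singleton (img, trans), sizeOf img)
layout trans (Horizontal l r) = (Horizontal l' r', size)
  where
    (l', lsize) = layout (ltrans <> trans) l
    (r', rsize) = layout (rtrans <> trans) r
    (ltrans, rtrans, size) = horizontal lsize rsize
layout trans (Vertical t b) = (Vertical t' b', size)
  where
    (t', tsize) = layout (ttrans <> trans) t
    (b', bsize) = layout (btrans <> trans) b
    (ttrans, btrans, size) = vertical tsize bsize

-- layout mempty (Horizontal (Singleton (Sz 400 300)) (Singleton (Sz 200 400)))
--   == ( Horizontal (Singleton (Sz 400 300, Tr 0 0 1))
--                   (Singleton (Sz 200 400, Tr 400 0 (3/4)))
--      , Sz 550 300 )
```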
We’ve written a circular program! Although I was aware of repmin for a long time, it’s not a technique I’ve applied often. To me, it is quite interesting because the structure here is somewhat different compared to repmin: rather than having a circular step at the top-level function invocation, we have it at every step of the recursion.
Thanks to Francesco Mazzoli and Titouan Vervack for reading a draft of this blogpost and suggesting improvements. And thanks to you for reading!
What follows below are a number of relatively small functions that take care of various tasks, included so this can function as a standalone program:
Once we’ve determined the layout, we still need to apply it and draw all the images using the computed transformations. We use simple nearest-neighbour scaling since scaling is not the focus of this program; you could consider Lánczos interpolation in a real application.
render :: Foldable f
    => Size
    -> f (JP.Image JP.PixelRGB8, Transform)
    -> JP.Image JP.PixelRGB8
render (Sz width height) images = runST $ do
    canvas <- JP.createMutableImage (round width) (round height) black
    for_ images $ transform canvas
    JP.unsafeFreezeImage canvas
  where
    black = JP.PixelRGB8 0 0 0

    transform canvas (img, Tr dstX dstY dstS) =
        for_ [round dstX .. round (dstX + dstW) - 1] $ \outX ->
        for_ [round dstY .. round (dstY + dstH) - 1] $ \outY ->
        let inX = min (JP.imageWidth img - 1) $ round $
                fromIntegral (outX - round dstX) / dstS
            inY = min (JP.imageHeight img - 1) $ round $
                fromIntegral (outY - round dstY) / dstS in
        JP.writePixel canvas outX outY $ JP.pixelAt img inX inY
      where
        dstW = fromIntegral (JP.imageWidth img) * dstS
        dstH = fromIntegral (JP.imageHeight img) * dstS
We use a simple parser to allow the user to specify collages as a string, for example on the command line. This is a natural fit for Polish notation, as using parentheses in command line arguments is very awkward.
As an example, we want to parse the following arguments:
H img1.jpg V img2.jpg H img3.jpg img4.jpg
Into this tree:
(Horizontal "img1.jpg"
    (Vertical "img2.jpg"
        (Horizontal "img3.jpg" "img4.jpg")))
We don’t even need a parser library, we can just treat the arguments as a stack:
parseCollage :: [String] -> Maybe (Collage FilePath)
parseCollage args = do
    (tree, []) <- parseTree args
    pure tree
  where
    parseTree [] = Nothing
    parseTree ("H" : stack0) = do
        (x, stack1) <- parseTree stack0
        (y, stack2) <- parseTree stack1
        pure (Horizontal x y, stack2)
    parseTree ("V" : stack0) = do
        (x, stack1) <- parseTree stack0
        (y, stack2) <- parseTree stack1
        pure (Vertical x y, stack2)
    parseTree (x : stack0) = Just (Singleton x, stack0)
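The parser can be exercised directly; a self-contained sketch restating Collage (with Eq derived for checking) and parseCollage:

```haskell
data Collage a
    = Singleton a
    | Horizontal (Collage a) (Collage a)
    | Vertical (Collage a) (Collage a)
    deriving (Eq, Show)

parseCollage :: [String] -> Maybe (Collage FilePath)
parseCollage args = do
    -- The pattern match on [] fails (to Nothing) if tokens are left over.
    (tree, []) <- parseTree args
    pure tree
  where
    parseTree [] = Nothing
    parseTree ("H" : stack0) = do
        (x, stack1) <- parseTree stack0
        (y, stack2) <- parseTree stack1
        pure (Horizontal x y, stack2)
    parseTree ("V" : stack0) = do
        (x, stack1) <- parseTree stack0
        (y, stack2) <- parseTree stack1
        pure (Vertical x y, stack2)
    parseTree (x : stack0) = Just (Singleton x, stack0)

-- parseCollage (words "H img1.jpg V img2.jpg H img3.jpg img4.jpg")
--   == Just (Horizontal (Singleton "img1.jpg")
--        (Vertical (Singleton "img2.jpg")
--            (Horizontal (Singleton "img3.jpg") (Singleton "img4.jpg"))))
-- parseCollage ["H", "a.jpg"] == Nothing  -- missing second operand
```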
In order to test this program, I also added some functionality to generate random collages.
randomCollage :: RandomGen g => NonEmpty a -> g -> (Collage a, g)
randomCollage ne gen = runStateGen gen $ \g -> go g ne
  where
The utility rc picks a random constructor.

    rc g = bool Horizontal Vertical <$> randomM g
In our worker function, we keep one item on the side (x), and randomly decide if other items will go in the left or right subtree:
    go g (x :| xs) = do
        (lts, rts) <- partition snd <$>
            traverse (\y -> (,) y <$> randomM g) xs
Then, we look at the random partitioning we just created. If they’re both empty, the only thing we can do is create a singleton collage:
        case (map fst lts, map fst rts) of
            ([], []) -> pure $ Singleton x
If either of them is empty, we put x in the other partition to ensure we don’t create invalid empty trees:
            ((l : ls), []) -> rc g <*> go g (l :| ls) <*> go g (x :| [])
            ([], (r : rs)) -> rc g <*> go g (x :| []) <*> go g (r :| rs)
Otherwise, we decide at random which partition x goes into:
            ((l : ls), (r : rs)) -> do
                xLeft <- randomM g
                if xLeft
                    then rc g <*> go g (x :| l : ls) <*> go g (r :| rs)
                    else rc g <*> go g (l :| ls) <*> go g (x :| r : rs)
We support two modes of operation for our little CLI: the user can either spell out a collage explicitly, or ask for a random one. In both cases, we also take an output file as the first argument, so we know where we want to write the image to. We also take an optional fit flag so we can resize the final image down to a requested size.
data Command = Command
    { cmdOut     :: FilePath
    , cmdFit     :: Maybe Int
    , cmdCollage :: CommandCollage
    }
data CommandCollage
    = User (Collage FilePath)
    | Random (NonEmpty FilePath)
    deriving (Show)
There is some setup to parse the output and a fit flag. The important bit happens in parseCommandCollage further down.
parseCommand :: [String] -> Maybe Command
parseCommand cmd = case cmd of
    [] -> Nothing
    ("fit" : num : args) | Just n <- readMaybe num -> do
        cmd' <- parseCommand args
        pure cmd' {cmdFit = Just n}
    (o : args) -> Command o Nothing <$> parseCommandCollage args
We’ll use R for a random collage, and H/V will be parsed by parseCollage.
  where
    parseCommandCollage ("R" : x : xs) = Just $ Random (x :| xs)
    parseCommandCollage spec           = User <$> parseCollage spec
Time to put everything together in the main function. First, we do some parsing:
main :: IO ()
main = do
    args <- getArgs
    command <- maybe (fail "invalid command") pure $
        parseCommand args
    pathsCollage <- case cmdCollage command of
        User explicit -> pure explicit
        Random paths -> do
            gen <- newStdGen
            let (random, _) = randomCollage paths gen
            pure random
Followed by actually reading in all the images:
    imageCollage <- readCollage pathsCollage
This gives us the Collage (JP.Image JP.PixelRGB8). We can pass that to our layout function and write it to the output, after optionally applying our fit:
    let (result, box) = case cmdFit command of
            Nothing -> layoutCollage imageCollage
            Just f  -> fit f $ layoutCollage imageCollage
    write (cmdOut command) $ JP.ImageRGB8 $ render box result
  where
    write output
        | ".jpg" `isSuffixOf` output = JP.saveJpgImage 80 output
        | otherwise                  = JP.savePngImage output
Most of the time I don’t want to host full-resolution pictures for web viewing. This is an addition I made later on to resize an image down to a requested “long edge” (i.e. a requested maximum width or height, whichever is bigger).
Interestingly, I think this can also be done by having an additional parameter to layout, and using circular programming once again to link the initial transformation to the requested size. However, the core algorithm is harder to understand that way, so I left it as a separate utility:
fit :: Int
    -> (Collage (img, Transform), Size)
    -> (Collage (img, Transform), Size)
fit longEdge (collage, Sz w h)
    | long <= fromIntegral longEdge = (collage, Sz w h)
    | otherwise =
        (fmap (<> tr) <$> collage, Sz (w * scale) (h * scale))
  where
    long  = max w h
    scale = fromIntegral longEdge / long
    tr    = Tr 0 0 scale
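A quick worked example of fit (a self-contained sketch; a single-constructor stand-in for Collage is used here just to keep it short, and Eq is derived for checking): a 2000×1500 result with a requested long edge of 1000 is scaled by 1/2.

```haskell
data Size = Sz Rational Rational deriving (Eq, Show)
data Transform = Tr Rational Rational Rational deriving (Eq, Show)

instance Semigroup Transform where
    Tr ax ay as <> Tr bx by bs = Tr (ax * bs + bx) (ay * bs + by) (as * bs)

-- Stand-in for the full Collage type; one constructor is enough here.
newtype Collage a = Singleton a deriving (Eq, Show)
instance Functor Collage where fmap f (Singleton x) = Singleton (f x)

fit :: Int
    -> (Collage (img, Transform), Size)
    -> (Collage (img, Transform), Size)
fit longEdge (collage, Sz w h)
    | long <= fromIntegral longEdge = (collage, Sz w h)
    | otherwise =
        (fmap (<> tr) <$> collage, Sz (w * scale) (h * scale))
  where
    long  = max w h
    scale = fromIntegral longEdge / long
    tr    = Tr 0 0 scale

-- fit 1000 (Singleton ((), Tr 0 0 1), Sz 2000 1500)
--   == (Singleton ((), Tr 0 0 (1/2)), Sz 1000 750)
```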
Haskell’s laziness allows you to do many cool things. I’ve talked about searching an infinite graph before. Another commonly mentioned example is finding the smallest N items in a list.
Because programmers are lazy as well, this is often defined as:
smallestN_lazy :: Ord a => Int -> [a] -> [a]
smallestN_lazy n = take n . sort
This happens regardless of the language of choice if we’re confident that the list will not be too large. It’s more important to be correct than it is to be fast.
However, in strict languages we’re really sorting the entire list before taking the first N items. We can implement this in Haskell by forcing the length of the sorted list.
smallestN_strict :: Ord a => Int -> [a] -> [a]
smallestN_strict n l0 = let l1 = sort l0 in length l1 `seq` take n l1
If you’re at least somewhat familiar with the concept of laziness, you may intuitively realize that the lazy version of smallestN is much better since it’ll only sort as far as it needs. But how much better does it actually do, with Haskell’s default sort?
For the sake of the comparison, we can introduce a third algorithm, which does a slightly smarter thing by keeping a heap of the smallest elements it has seen so far. This code is far more complex than smallestN_lazy, so if it performs better, we should still ask ourselves if the additional complexity is worth it.
smallestN_smart :: Ord a => Int -> [a] -> [a]
smallestN_smart maxSize list = do
    (item, n) <- Map.toList heap
    replicate n item
  where
    -- A heap is a map of the item to how many times it occurs in
    -- the heap, like a frequency counter. We also keep the current
    -- total count of the heap.
    heap = fst $ foldl' (\acc x -> insert x acc) (Map.empty, 0) list

    insert x (heap0, count)
        | count < maxSize = (Map.insertWith (+) x 1 heap0, count + 1)
        | otherwise = case Map.maxViewWithKey heap0 of
            Nothing -> (Map.insertWith (+) x 1 heap0, count + 1)
            Just ((y, yn), _) -> case compare x y of
                EQ -> (heap0, count)
                GT -> (heap0, count)
                LT ->
                    let heap1 = Map.insertWith (+) x 1 heap0 in
                    if yn > 1
                        then (Map.insert y (yn - 1) heap1, count)
                        else (Map.delete y heap1, count)
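We can sanity-check that the heap version agrees with the lazy one on a small list; a self-contained sketch restating both implementations (Data.Map comes from the containers package that ships with GHC):

```haskell
import Data.List (foldl', sort)
import qualified Data.Map as Map

smallestN_lazy :: Ord a => Int -> [a] -> [a]
smallestN_lazy n = take n . sort

smallestN_smart :: Ord a => Int -> [a] -> [a]
smallestN_smart maxSize list = do
    (item, n) <- Map.toList heap
    replicate n item
  where
    -- Frequency-counter heap plus its current total count.
    heap = fst $ foldl' (\acc x -> insert x acc) (Map.empty, 0) list

    insert x (heap0, count)
        | count < maxSize = (Map.insertWith (+) x 1 heap0, count + 1)
        | otherwise = case Map.maxViewWithKey heap0 of
            Nothing -> (Map.insertWith (+) x 1 heap0, count + 1)
            Just ((y, yn), _) -> case compare x y of
                EQ -> (heap0, count)
                GT -> (heap0, count)
                LT ->
                    let heap1 = Map.insertWith (+) x 1 heap0 in
                    if yn > 1
                        then (Map.insert y (yn - 1) heap1, count)
                        else (Map.delete y heap1, count)

-- smallestN_smart 3 [5,3,8,1,9,2,1] == [1,1,2]
-- smallestN_lazy  3 [5,3,8,1,9,2,1] == [1,1,2]
```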
So, we get to the main trick I wanted to talk about: how do we benchmark this, and can we add unit tests to confirm these benchmark results in CI? Benchmark execution times are very fickle. Instruction counting is awesome but perhaps a little overkill.
Instead, we can just count the number of comparisons.
We can use a new type that holds a value and a number of ticks. We can increase the number of ticks, and also read the ticks that have occurred.
data Ticks a = Ticks {ref :: !(IORef Int), unTicks :: !a}

mkTicks :: a -> IO (Ticks a)
mkTicks x = Ticks <$> IORef.newIORef 0 <*> pure x

tick :: Ticks a -> IO ()
tick t = IORef.atomicModifyIORef' (ref t) $ \i -> (i + 1, ())

ticks :: Ticks a -> IO Int
ticks = IORef.readIORef . ref
smallestN has an Ord constraint, so if we want to count the number of comparisons we’ll want to do that for both == and compare.
instance Eq a => Eq (Ticks a) where
    (==) = tick2 (==)

instance Ord a => Ord (Ticks a) where
    compare = tick2 compare
The actual ticking code goes in tick2, which applies a binary operation and increases the counters of both arguments. We need unsafePerformIO for that, but it’s fine since this lives only in our testing code and not our actual smallestN implementation.
tick2 :: (a -> a -> b) -> Ticks a -> Ticks a -> b
tick2 f t1 t2 = unsafePerformIO $ do
    tick t1
    tick t2
    pure $ f (unTicks t1) (unTicks t2)
{-# NOINLINE tick2 #-}
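Putting the pieces together, we can count how many comparisons sort performs on a tiny list; a self-contained sketch of the counting machinery above, plus a small helper (countComparisons, not in the original) that forces the result before reading the counters:

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import Data.List (sort)
import System.IO.Unsafe (unsafePerformIO)

data Ticks a = Ticks {ref :: !(IORef Int), unTicks :: !a}

mkTicks :: a -> IO (Ticks a)
mkTicks x = Ticks <$> newIORef 0 <*> pure x

tick :: Ticks a -> IO ()
tick t = atomicModifyIORef' (ref t) $ \i -> (i + 1, ())

ticks :: Ticks a -> IO Int
ticks = readIORef . ref

tick2 :: (a -> a -> b) -> Ticks a -> Ticks a -> b
tick2 f t1 t2 = unsafePerformIO $ do
    tick t1
    tick t2
    pure $ f (unTicks t1) (unTicks t2)
{-# NOINLINE tick2 #-}

instance Eq a => Eq (Ticks a) where (==) = tick2 (==)
instance Ord a => Ord (Ticks a) where compare = tick2 compare

-- Run a list function on wrapped values and report (result, comparisons).
countComparisons :: Ord a => ([Ticks a] -> [Ticks a]) -> [a] -> IO ([a], Int)
countComparisons f xs = do
    l1 <- traverse mkTicks xs
    let r = map unTicks (f l1)
    -- Force the result's spine before reading the counters.
    n <- length r `seq` (sum <$> traverse ticks l1)
    pure (r, n)
```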
Let’s add some benchmarking that prints an ad-hoc CSV:
main :: IO ()
main = do
    let listSize = 100000
        impls = [smallestN_strict, smallestN_lazy, smallestN_smart]
    forM_ [50, 100 .. 2000] $ \sampleSize -> do
        l <- replicateM listSize randomIO :: IO [Int]
        (nticks, results) <- fmap unzip $ forM impls $ \f -> do
            l1 <- traverse mkTicks l
            let !r1 = sum . map unTicks $ f sampleSize l1
            t1 <- sum <$> traverse ticks l1
            pure (t1, r1)
        unless (equal results) . fail $
            "Different results: " ++ show results
        putStrLn . intercalate "," . map show $ sampleSize : nticks
Plug that CSV into a spreadsheet and we get this graph. What conclusions can we draw?
Clearly, both the lazy version as well as the “smart” version are able to avoid a large number of comparisons. Let’s remove the strict version so we can zoom in.
What does this mean?
If the sampleSize is small, the heap implementation does fewer comparisons. This makes sense: even if we treat sort as a black box, and don’t look at its implementation, we can assume that it is not optimally lazy; so it will always sort “a bit too much”.
As sampleSize gets bigger, the insertion into the bigger and bigger heap starts to matter more and more, and eventually the naive lazy implementation is faster!
Laziness is awesome and take n . sort is absolutely the first implementation you should write, even if you replace it with a more efficient version later.
Code where you count a number of calls is very easy to do in a test suite. It doesn’t pollute the application code if we can patch in counting through a typeclass (Ord in this case).
Can we say something about the complexity?
The complexity of smallestN_smart is basically inserting into a heap listSize times. This gives us O(listSize * log(sampleSize)).
That is of course the worst case complexity, which only occurs in the special case where we need to insert into the heap at each step. That’s only true when the list is sorted, so for a random list the average complexity will be a lot better.
The complexity of smallestN_lazy is far harder to reason about. Intuitively, and with the information that Data.List.sort is a merge sort, I came to something like O(listSize * max(sampleSize, log(listSize))). I’m not sure if this is correct, and the case with a random list seems to be faster.
I would be very interested in knowing the actual complexity of the lazy version, so if you have any insights, be sure to let me know!
Update: Edward Kmett corrected me: the complexity of smallestN_lazy is actually O(listSize * min(sampleSize, listSize)), with O(listSize * min(sampleSize, log(listSize))) in expectation for a random list.
Thanks to Huw Campbell for pointing out a bug in the implementation of smallestN_smart – this is now fixed in the code above.
Helper function: check if all elements in a list are equal.
equal :: Eq a => [a] -> Bool
equal (x : y : zs) = x == y && equal (y : zs)
equal _            = True
Haskell is great at building DSLs – which are perhaps the ultimate form of slacking off at work. Rather than actually doing the work your manager tells you to, you can build DSLs to delegate this back to your manager so you can focus on finally writing up that GHC proposal for MultilinePostfixTypeOperators (which could have come in useful for this blogpost).
So, we’ll build a visual DSL that’s so simple even your manager can use it! This blogpost is a literate Haskell file so you can run it directly in GHCi. Note that some code is located in a second module because of compilation stage restrictions.
Let’s get started. We’ll need a few language extensions – not too many, just enough to guarantee job security for the foreseeable future.
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
module Visual where
And then some imports, not much going on here.
import qualified Codec.Picture as JP
import qualified Codec.Picture.Types as JP
import Control.Arrow
import Control.Category
import Control.Monad.ST (runST)
import Data.Char (isUpper)
import Data.Foldable (for_)
import Data.List (sort, partition)
import qualified Language.Haskell.TH as TH
import Prelude hiding (id, (.))
All Haskell tutorials that use some form of dependent typing seem to start with the HList type. So I suppose we’ll do that as well.
data HList (things :: [*]) where
    Nil  :: HList '[]
    Cons :: x -> HList xs -> HList (x ': xs)
I think HList is short for hype list. There’s a lot of hype around this because it allows you to put even more types in your types.
We’ll require two auxiliary functions for our hype list. Because of all the hype, they each require a type family in order for us to even express their types. The first one just takes the last element from a list.
hlast :: HList (thing ': things) -> Last (thing ': things)
hlast (Cons x Nil)         = x
hlast (Cons _ (Cons y zs)) = hlast (Cons y zs)
type family Last (l :: [*]) :: * where
Last (x ': '[]) = x
Last (x ': xs) = Last xs
Readers may wonder if this is safe, since last is usually a partial function. Well, it turns out that partial functions are safe if you type them using partial type families. So one takeaway is that partial functions can just be fixed by adding more partial stuff on top. This explains things like Prelude.
Anyway, the second auxiliary function drops the last element from a list.
hinit :: HList (thing ': things) -> HList (Init (thing ': things))
hinit (Cons _ Nil)         = Nil
hinit (Cons x (Cons y zs)) = Cons x (hinit (Cons y zs))
type family Init (l :: [*]) :: [*] where
Init (_ ': '[]) = '[]
Init (x ': y ': zs) = x ': Init (y ': zs)
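Here is a small self-contained sketch showing the two helpers in action (restating HList and the type families from above, with the needed extensions):

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

data HList (things :: [*]) where
    Nil  :: HList '[]
    Cons :: x -> HList xs -> HList (x ': xs)

type family Last (l :: [*]) :: * where
    Last (x ': '[]) = x
    Last (x ': xs)  = Last xs

type family Init (l :: [*]) :: [*] where
    Init (_ ': '[])     = '[]
    Init (x ': y ': zs) = x ': Init (y ': zs)

hlast :: HList (thing ': things) -> Last (thing ': things)
hlast (Cons x Nil)         = x
hlast (Cons _ (Cons y zs)) = hlast (Cons y zs)

hinit :: HList (thing ': things) -> HList (Init (thing ': things))
hinit (Cons _ Nil)         = Nil
hinit (Cons x (Cons y zs)) = Cons x (hinit (Cons y zs))

-- The last element of an HList of an Int and a String is the String:
-- hlast (Cons (1 :: Int) (Cons "hype" Nil)) == "hype"
```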
And that’s enough boilerplate! Let’s get right to it.
It’s always good to pretend that your DSL is built on solid foundations. As I alluded to in the title, we’ll pick Arrows. One reason for that is that they’re easier to explain to your manager than Applicative (stuff goes in, other stuff comes out, see? They’re like the coffee machine in the hallway). Secondly, they are less powerful than Monads and we prefer to keep that good stuff to ourselves.
Unfortunately, it seems like the Arrow module was contributed by an operator fetishism cult, and anyone who’s ever done nontrivial work with Arrows now has a weekly therapy session to talk about how &&& and *** hurt them.
This is not syntax we want anyone to use. Instead, we’ll, erm, slightly bend Haskell’s syntax to get something that is “much nicer” and “definitely not an abomination”.
We’ll build something that appeals to both Category Theorists (for street cred) and Corporate Managers (for our bonus). These two groups have many things in common. Apart from talking a lot about abstract nonsense and getting paid for it, both love drawing boxes and arrows.
Yeah, so I guess we can call this visual DSL a Diagram. The main drawback of arrows is that they can only have a single input and output. This leads to a lot of tuple abuse.
We’ll “fix” that by having extra ins and outs. We are wrapping an arbitrary Arrow, referred to as f in the signature:
data Diagram (ins :: [*]) (outs :: [*]) f a b where
We can create a diagram from a normal arrow, that’s easy.
Diagram :: f a b > Diagram '[] '[] f a b
And we can add another normal function at the back. No biggie.
    Then
        :: Diagram ins outs f a b -> f b c
        -> Diagram ins outs f a c
Of course, we need to be able to use our extra inputs and outputs. Output wraps an existing Diagram and redirects the second element of a tuple to the outs; and Input does it the other way around.
    Output
        :: Diagram ins outs f a (b, o)
        -> Diagram ins (o ': outs) f a b

    Input
        :: Diagram ins outs f a b
        -> Diagram (i ': ins) outs f a (b, i)
The hardest part is connecting two existing diagrams. This is really where the magic happens:
    Below
        :: Diagram ins1 outs1 f a b
        -> Diagram (Init (b ': outs1)) outs2 f (Last (b ': outs1)) c
        -> Diagram ins1 outs2 f a c
Is this correct? What does it even mean? The answer to both questions is: “I don’t know”. It typechecks, which is what really matters when you’re doing Haskell. And there’s something about ins matching outs in there, yeah.
Concerned readers of this blog may at this point be wondering why we used reasonable names for the constructors of Diagram rather than just operators. Well, it’s only because it’s a GADT, which makes this impossible. But fear not, we can claim our operators back. Shout out to Unicode’s box-drawing characters: they provide various characters with thick and thin lines. This lets us do an, uhm, super intuitive syntax where tuples are taken apart as extra inputs/outputs, or reified back into tuples.
(━►) = Then
l ┭► r = Output l ━► r
l ┳► r = (l ━► arr (\x -> (x, x))) ┭► r
l ┶► r = Input l ━► r
l ╆► r = Output (Input l ━► arr (\x -> (x, x))) ━► r
l ┳ c = l ┳► arr (const c)
l ┓ r = Below l r
l ┧ r = Input l ┓ r
l ┃ r = Input l ━► arr snd ┓ r

infixl 5 ━►, ┳►, ┭►, ┶►, ╆►, ┳
infixr 4 ┓, ┧, ┃
Finally, while we’re at it, we’ll also include an operator to clearly indicate to our manager how our valuation will change if we adopt this DSL.
(📈) = Diagram
This lets us do the basics. If we start from regular Arrow syntax:
horribleExample01 =
    partition isUpper >>> reverse *** sort >>> uncurry mappend
We can now turn this into:
amazingExample01 =
  (📈) (partition isUpper)┭►reverse┓
  (📈) (sort)┶►(uncurry mappend)
The trick to decrypting these diagrams is that each line in the source code consists of an arrow where values flow from the left to the right, with possible extra inputs and outputs in between. These lines are then composed using a few operators that use Below, such as ┓ and ┧.
To improve readability even further, it should also be possible to add right-to-left and top-to-bottom operators. I asked my manager if they wanted these extra operators but they’ve been ignoring all my Slack messages since I showed them my original prototype. Probably just busy?
Anyway, there are other simple improvements we can make to the visual DSL first. Most Haskellers prefer nicely aligning things over producing working code, so it would be nice if we could draw longer lines like ━━━━┳━► rather than just ┳►. And any Haskeller worth their salt will tell you that this is where Template Haskell comes in.
Template Haskell gets a bad rep, but that’s only because it is mostly misused. Originally, it was designed to avoid copying and pasting a lot of code, which is exactly what we’ll do here. Nothing to be grossed out about.
extensions :: Maybe Char -> String -> Maybe Char -> [String]
extensions mbLeft operator mbRight =
    [operator] >>= maybe pure goR mbRight >>= maybe pure goL mbLeft
  where
    goL l op = [replicate n l ++ op | n <- [1 .. 19]]
    goR r op = [init op ++ replicate n r ++ [last op] | n <- [1 .. 19]]
industryStandardBoilerplate :: Maybe Char -> TH.Name -> Maybe Char -> TH.Q [TH.Dec]
industryStandardBoilerplate l name r = do
    sig <- TH.reify name >>= \case
        TH.VarI _ sig _ -> pure sig
        _ -> fail "no info"
    fixity <- TH.reifyFixity name >>= maybe (fail "no fixity") pure
    pure
        [ decl
        | name' <- fmap TH.mkName $ extensions l (TH.nameBase name) r
        , decl <-
            [ TH.SigD name' sig
            , TH.FunD name' [TH.Clause [] (TH.NormalB (TH.VarE name)) []]
            , TH.InfixD fixity name'
            ]
        ]
We can then invoke this industry standard boilerplate to extend and copy/paste an operator like this:
$(industryStandardBoilerplate (Just '━') '(┭►) (Just '─'))
We’re now equipped to silence even the harshest syntax critics:
example02 =
  (📈) (partition isUpper)━┭─►(reverse)━┓
  (📈) (sort)─────────┶━►(uncurry mappend)
Beautiful! If you’ve ever wondered what people mean when they say functional programs “compose elegantly”, well, this is what they mean.
example03 =
  (📈) (+1)━┳━►(+1)━┓
  (📈)      (+1)━━━━╆━►add━┓
  (📈)       add────┶━►add
  where
    add = uncurry (+)
Type inference is excellent and running is easy. In GHCi:
*Main> :t example03
example03 :: Diagram '[] '[] (->) Integer Integer
*Main> run example03 1
12
Let’s look at a more complicated example.
lambda =
  (📈) (id)━┭─►(subtract 0.5)━┳━━━━━━━━►(< 0)━━━━━━━━━━┓
  (📈) (subtract 0.5)───────╆━►(add)━►(abs)━►(< 0.1)─┶━━━━━━━►(and)━━━━━━━┓
  (📈) (* pi)━━►(sin)┳() ┃
  (📈) (swap)━┭─►(* 2)──────────────┶━►(sub)━►(abs)━►(< 0.2)─┧
  (📈) (or)━►(bool bg fg)
  where
    add = uncurry (+)
    sub = uncurry (-)
    and = uncurry (&&)
    or  = uncurry (||)
    bg  = JP.PixelRGB8 69 58 98
    fg  = JP.PixelRGB8 255 255 255
This renders everyone’s favorite Greek letter:
Amazing! Math!
While the example diagrams in this post all use the pure function arrow (->), it is my duty as a Haskeller to note that it is really parametric in f or something. What this means is that thanks to this famous guy called Kleisli, you can immediately start using this with IO in production. Thanks for reading!
Update: Carl Hedgren pointed out to me that a similar DSL is provided by Control.Arrow.Needle. However, that package uses Template Haskell to just parse the diagram. In this blogpost, the point of the exercise is to bend Haskell’s syntax and type system to achieve the notation.
The implementation of run uses a helper function that lets us convert a diagram back to a normal Arrow that uses HList to pass extra inputs and outputs:
fromDiagram :: Arrow f => Diagram ins outs f a b
            -> f (a, HList ins) (b, HList outs)
We can then have a specialized version for when there are zero extra inputs and outputs. This greatly simplifies the type signature and gives us a “normal” f a b:
run :: Arrow f => Diagram '[] '[] f a b -> f a b
run d = id &&& (arr (const Nil)) >>> fromDiagram d >>> arr fst
The definition for fromDiagram is as follows:

fromDiagram (Diagram f) = f *** arr (const Nil)
fromDiagram (Then l r) = fromDiagram l >>> first r
fromDiagram (Output l) =
    fromDiagram l >>> arr (\((x, y), things) -> (x, Cons y things))
fromDiagram (Input l) =
    arr (\(x, Cons a things) -> ((x, things), a)) >>>
    first (fromDiagram l) >>>
    arr (\((y, outs), a) -> ((y, a), outs))
fromDiagram (Below l r) =
    fromDiagram l >>>
    arr (\(x, outs) -> (hlast (Cons x outs), hinit (Cons x outs))) >>>
    fromDiagram r
We wouldn’t want these to get in our way in the middle of the prose, but GHC complains if we don’t put them somewhere.
(┳►) :: Arrow f => Diagram ins outs f a b -> f b c
     -> Diagram ins (b ': outs) f a c
(┭►) :: Arrow f => Diagram ins outs f a (b, o) -> f b c
     -> Diagram ins (o ': outs) f a c
(┶►) :: Diagram ins outs f a b -> f (b, i) c
     -> Diagram (i ': ins) outs f a c
(╆►) :: Arrow f => Diagram ins outs f a b -> f (b, u) c
     -> Diagram (u ': ins) ((b, u) ': outs) f a c
(┧) :: Diagram ins1 outs1 f a b
    -> Diagram (Init ((b, u) ': outs1)) outs2 f (Last ((b, u) ': outs1)) c
    -> Diagram (u ': ins1) outs2 f a c
This uses a user-supplied Diagram to render an image.

image :: Int -> Int
      -> Diagram '[] '[] (->) (Double, Double) JP.PixelRGB8
      -> JP.Image JP.PixelRGB8
image w h diagram = runST $ do
    img <- JP.newMutableImage w h
    for_ [0 .. h - 1] $ \y ->
        for_ [0 .. w - 1] $ \x ->
            let x' = fromIntegral x / fromIntegral (w - 1)
                y' = fromIntegral y / fromIntegral (h - 1) in
            JP.writePixel img x y $ run diagram (x', y')
    JP.freezeImage img
At some point during ICFP 2019 in Berlin, I came across a completely unrelated old paper by S. Lovejoy and B. B. Mandelbrot called “Fractal properties of rain, and a fractal model”.
While the model in the paper is primarily meant to model rainfall, the authors explain that it can also be used for rainclouds, since these two phenomena are naturally similarly shaped. This means it can be used to generate pretty pictures!
While it looked cool at first, it turned out to be an extremely pointless and outdated way to generate pictures like this. But I wanted to write it up anyway since it is important to document failure as well as success: if you’ve found this blogpost searching for an implementation of this paper; well, you have found it, but it probably won’t help you. Here is the GitHub repository.
I found this paper very intriguing because it promises a fractal model with a number of very attractive features:
Most excitingly, it’s possible to do a dimension-generic implementation! The code has examples in 2D as well as 3D (x-y, time), but can be used without modifications for 4D (x-y-z, time) and beyond. Haskell’s type system allows capturing the dimension in a type parameter so we don’t need to sacrifice any type safety in the process.
For example, here is the dimension-generic distance function I used with massiv:

distance :: M.Index ix => ix -> ix -> Distance
distance i j = Distance . sqrt . fromIntegral . M.foldlIndex (+) 0 $
    M.liftIndex2 (\p s -> (p - s) * (p - s)) i j
Here is a 3D version:
However, there must be a catch, right? If it has all these amazing properties, why is nobody using it? I didn’t see any existing implementations; and even though I had a very strong suspicion as to why that was the case, I set out to implement it during Munihac 2019.
As I was working on it, the answer quickly became apparent – the algorithm is so slow that its speed cannot even be considered a tradeoff; its slowness really cancels out all advantages and then some! Bitcoin may even be a better use of compute resources. The 30 second video clip I embedded earlier took 8 hours to render on a 16-core machine.
This was a bit of a bummer on two fronts: the second one being that I wanted to use this as a vehicle to learn some GPU programming; and it turned out to be a bad fit for GPU programming as well.
At a very high level, the algorithm repeats the following steps many, many times:
This sounds great for GPU programming; we could generate a large number of images and then just sum them together. However, the probability distribution from step 2 is problematic. Small (≤3x3) shapes are so common that it seems faster to use a CPU (or, you know, 16 CPUs) and just draw that specific region onto a single image.
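The paper's exact distribution parameters aren't reproduced in this post, but the "small pulses dominate" behaviour falls out of any power law. A minimal, dependency-free sketch of sampling such sizes by inverse transform (the exponent `alpha` is a placeholder, not the paper's value):

```haskell
-- Inverse-transform sampling of a power-law pulse size:
-- if u is uniform in (0, 1], then u ** (-1 / alpha) follows a power law
-- with P(size > s) = s ** (-alpha) for s >= 1, so tiny pulses dominate.
pulseSize :: Double -> Double -> Double
pulseSize alpha u = u ** (-1 / alpha)

main :: IO ()
main = print (map (pulseSize 1.5) [1.0, 0.5, 0.01])
```

For any positive `alpha`, the overwhelming majority of samples come out small, which is why drawing the many tiny pulses directly on the CPU can beat shipping each one to a GPU.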
The paper proposes 3 shapes (which it calls “pulses”). It starts out with just drawing plain opaque circles with a hard edge. This causes some interesting but generally bad-looking edges:
It then switches to using circles with smoothed edges; which looks much better, we’re getting properly puffy clouds here:
Finally, the paper discusses drawing smoothedout annuli, which dramatically changes the shapes of the clouds:
It’s mildly interesting that the annuli become hollow spheres in 3D.
Thanks to Alexey for massiv and a massive list of suggestions on my implementation!
However, I was talking with HVR about the Handle pattern, and the topic of argument order came up. This led me to a neat use case for flip that I hadn’t seen before.
This blogpost should be approachable for beginners, but when you’re completely new to Haskell and some terms are confusing, I would recommend looking at the Type Classes or Learn You a Haskell materials first.
A few extensions are required to show some intermediary results, but – spoiler alert – they turn out to be unnecessary in the end:
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
In Haskell, it is idiomatic to specify arguments that are unlikely to change in between function calls first.
For example, let’s look at the type of M.insertWith:
import qualified Data.Map as M
M.insertWith :: Ord k
             => (a -> a -> a)  -- ^ Merge values
             -> k              -- ^ Key to insert
             -> a              -- ^ Value to insert
             -> M.Map k a      -- ^ Map to insert into
             -> M.Map k a      -- ^ New map
This function allows us to insert an item into a map, or if it’s already there, merge it with an existing element. When we’re doing something related to counting items, we can “specialize” this function by partially applying it to obtain a function which adds a count:
increaseCount :: Ord k
              => k            -- ^ Key to increment
              -> Int          -- ^ Amount to increment
              -> M.Map k Int  -- ^ Current count
              -> M.Map k Int  -- ^ New count
increaseCount = M.insertWith (+)
And then we can do things like increaseCount "apples" 4 basket. The extremely succinct definition of increaseCount is only possible because functions in Haskell are always considered curried: every function takes just one argument.
However – there is a second idiomatic aspect of argument ordering. For imperative code, it is common to put the “object” or “handle” first. base itself is rife with examples, and packages like network hold many more:
-- From System.IO
hSetBuffering :: Handle -> BufferMode -> IO ()
hGetBuf :: Handle -> Ptr a -> Int -> IO Int

-- From Control.Concurrent.Chan
writeChan :: Chan a -> a -> IO ()

-- From Control.Concurrent.MVar
modifyMVar :: MVar a -> (a -> IO (a, b)) -> IO b
This allows us to easily partially apply functions to a specific “object”, which comes in useful in where clauses:
writeSomeStuff :: Chan String -> IO ()
writeSomeStuff c = do
    write "Tuca"
    write "Bertie"
    write "Speckle"
  where
    write = writeChan c
In addition to that, it allows us to replace the type by a record of functions – as I went over in the handle pattern explanation.
However, we end up in a bit of a bind when we want to write succinct top-level definitions, like we did with increaseCount. Imagine we have a Handle to our database:
data Handle = Handle
Some mock utility types:
data Tier = Free | Premium
type MemberId = String
And a toplevel function to change a member’s plan:
changePlan :: Handle
           -> Tier      -- ^ New plan
           -> String    -- ^ Comment
           -> MemberId  -- ^ Member to upgrade
           -> IO ()
changePlan = undefined
If we want a specialized version of this, we need to explicitly name and bind h, which sometimes feels a bit awkward:
halloweenPromo1 :: Handle -> MemberId -> IO ()
halloweenPromo1 h = changePlan h Premium "Halloween 2018 promo"
We sometimes would like to be able to write succinct definitions, such as:
halloweenPromo2 :: Handle -> MemberId -> IO ()
halloweenPromo2 = specialize changePlan Premium "Halloween 2018 promo"
But is this possible? And what would specialize look like?
Since this is a feature that relates to the type system, it is probably unsurprising that, yes, this is possible in Haskell. The concept can be represented as changing a function f to a function g:
class Specialize f g where
    specialize :: f -> g
Of course, a function can be converted to itself:
instance Specialize (a -> b) (a -> b) where
    specialize = id
Furthermore, if a Handle (a below) is the first argument, we can skip it in the converted version and first supply the second argument, namely b. This leads us to the following definition:

instance Specialize (a -> c) f => Specialize (a -> b -> c) (b -> f) where
    specialize f = \b -> specialize (\a -> f a b)
This is a somewhat acceptable solution, but it’s not great:

Error messages involving Specialize will be hard to read

AllowAmbiguousTypes may be required to defer instance resolution to the call site of specialize

Again, not show stoppers, but not pleasant either.
The unpleasantness around specialize is mainly caused by the fact that we need a typeclass to make this work for multiple arguments. Maybe using some sort of combinator can give us a simpler solution?
Because we’re lazy, let’s see if GHC has any ideas – we’ll use Typed holes to get a bit more info rather than doing the work ourselves:
halloweenPromo3 :: Handle -> MemberId -> IO ()
halloweenPromo3 =
    changePlan `_` Premium `_` "Halloween 2018 promo"
We get an error, and some suggestions:
posts/2019-10-15-flip-specialize.lhs:152:18: error:
    • Found hole:
        _ :: (Handle -> Tier -> String -> MemberId -> IO ()) -> Tier -> t0
      Where: ‘t0’ is an ambiguous type variable
    • In the expression: _
      In the first argument of ‘_’, namely ‘changePlan `_` Premium’
      In the expression:
        changePlan `_` Premium `_` "Halloween 2018 promo"
    • Relevant bindings include
        halloweenPromo3 :: Handle -> MemberId -> IO ()
          (bound at posts/2019-10-15-flip-specialize.lhs:151:3)
      Valid hole fits include
        flip :: forall a b c. (a -> b -> c) -> b -> a -> c
          with flip @Handle @Tier @(String -> MemberId -> IO ())
          (imported from ‘Prelude’ at posts/2019-10-15-flip-specialize.lhs:1:1
          (and originally defined in ‘GHC.Base’))
        seq :: forall a b. a -> b -> b
          with seq @(Handle -> Tier -> String -> MemberId -> IO ()) @Tier
          (imported from ‘Prelude’ at posts/2019-10-15-flip-specialize.lhs:1:1
          (and originally defined in ‘GHC.Prim’))
        const :: forall a b. a -> b -> a
          with const @(Handle -> Tier -> String -> MemberId -> IO ()) @Tier
          (imported from ‘Prelude’ at posts/2019-10-15-flip-specialize.lhs:1:1
          (and originally defined in ‘GHC.Base’))
        ...
Wait a minute! flip looks kind of like what we want: its type really converts a function to another function which “skips” the first argument. Is it possible that what we were looking for was really just… the basic function flip?
halloweenPromo4 :: Handle -> MemberId -> IO ()
halloweenPromo4 =
    changePlan `flip` Premium `flip` "Halloween 2018 promo"
We can make the above pattern a bit cleaner by introducing a new operator:
(/$) :: (a -> b -> c) -> (b -> a -> c)
(/$) = flip
halloweenPromo5 :: Handle -> MemberId -> IO ()
halloweenPromo5 =
    changePlan /$ Premium /$ "Halloween 2018 promo"
Fascinating! I was aware of using flip in this way to skip a single argument (e.g. foldr (flip increaseCount 1)), but, in all the time I’ve been writing Haskell, I hadn’t realized this chained in a usable and nice way.
In a way, it comes down to reading the type signature of flip in two ways:

flip :: (a -> b -> c) -> (b -> a -> c)
Convert a function to another function that has the two first arguments flipped. This is the way I am used to reading flip – and also what the name refers to.
flip :: (a -> b -> c) -> b -> (a -> c)
Partially apply a function to its second argument. After supplying the second argument, we can once again supply a “second” argument, and so on – yielding an intuitive explanation of the chaining.
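To see the chaining concretely, here is a small self-contained example (the `describe` function is made up for illustration, not from the post):

```haskell
-- A three-argument function; chaining flip supplies the 2nd argument
-- and then the (new) 2nd argument, leaving only the first one open.
describe :: String -> Int -> Bool -> String
describe name n b = name ++ ": " ++ show n ++ ", " ++ show b

shortDescribe :: String -> String
shortDescribe = describe `flip` 42 `flip` True

main :: IO ()
main = putStrLn (shortDescribe "x")  -- same as: describe "x" 42 True
```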
It’s also possible to define sibling operators //$, ///$, etc., to “skip” the first N arguments rather than just the first one in a composable way.
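A possible definition of the first sibling, in case you want to try this at home (this sketch is mine, not from the original post; (/$) is repeated so the snippet stands alone):

```haskell
-- Skip the first argument: supply the second argument first.
(/$) :: (a -> b -> c) -> b -> a -> c
(/$) = flip

-- Skip the first *two* arguments: supply the third argument first.
(//$) :: (a -> b -> c -> d) -> c -> (a -> b -> d)
(//$) f c = \a b -> f a b c

infixl 1 /$, //$

main :: IO ()
main = print (((\a b c -> [a, b, c]) //$ 3) 1 2)
```

With the hypothetical changePlan from above, `changePlan //$ "Halloween 2018 promo"` would have type `Handle -> Tier -> MemberId -> IO ()`.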
Update: Dan Dart pointed out to me that these sibling operators actually exist in the composition-extra package.
… probably not? While it is a mildly interesting trick, unless it becomes a real pain point for you, I see nothing wrong with just writing:
halloweenPromo6 :: Handle -> MemberId -> IO ()
halloweenPromo6 h = changePlan h Premium "Halloween 2018 promo"
I am one of the organizers of ZuriHac, and last year, we hand-rolled our own registration system for the event in Haskell. This blogpost explains why we decided to go this route, and we dip our toes into its design and implementation just a little bit.
I hope that the second part is especially useful to less experienced Haskellers, since it is a nice example of a small but useful standalone application. In fact, this was more or less an explicit side purpose of the project: I worked on this together with Charles Till since he’s a nice human being and I like mentoring people in day-to-day practical Haskell code.
In theory, it should also be possible to reuse this system for other events – not too much of it is ZuriHac specific, and it’s all open source.
Before 2019, ZuriHac registration worked purely based on Google tools and manual labor:
Apart from the fact that the manual labor wasn’t scaling above roughly 300 people, there were a number of practical issues with these tools. The biggest issue was managing the waiting list and cancellations.
You see, ZuriHac is a free event, which means that the barrier to signing up for it is (intentionally and with good reason!) extremely low. Unfortunately, this will always result in a significant amount of people who sign up for the event, but do not actually attend. We try compensating for that by overbooking and offering cancellations; but sometimes it turns out to be hard to get people to cancel as well – especially if it’s hard to reach them.
Google Groups is not great for the purpose we’re using it for: first of all, attendees actually need to go and accept the invitation to join the group. Secondly, do you need a Google Account to join? I still don’t know and have seen conflicting information over the years. Anyway, it’s all a bit ad hoc and confusing.
So one of the goals for the new registration system (in addition to reducing work on our side) was to be able to track participant numbers better and improve communication. We wanted to work with an explicit confirmation that you’re attending the event; or with a downloadable ticket so that we could track how many people downloaded this ^{1}.
I looked into a few options (eventbrite, eventlama, and others…) but none of these ticked all the boxes. Aside from being free (since we have a limited budget), some features that I wanted were:
With these things in mind, I set out to solve this problem the same way I usually solve problems: write some Haskell code.
The ZuriHac Registration system (zureg) is a “serverless” application that runs on AWS. It was designed to fit almost entirely in the free tier of AWS; which is why I, for example, picked DynamoDB over a database that’s actually nice to use. We used Brendan Hay’s excellent and extensive amazonka libraries to talk to AWS.
The total cost of having this running for a year, including during ZuriHac itself, totaled up to 0.61 Swiss Francs so I would say that worked out well price wise!
There are two big parts to the application: a fat lambda ^{2} function that provides a number of different endpoints, and a bunch of command line utilities that talk to the different services directly.
All these parts, however, are part of one monolithic codebase which makes it very easy to share code and ensure all behaviour is consistent – globally coherent as some would call it. One big “library” that has welldefined module boundaries and multiple lightweight “executables” is how I like to design applications in Haskell (and other languages).
First, I’d like to go into how the project is built and compiled. It’s not something I’m proud of, but I do think it makes a good cookbook on how to do things the hard way.
The main hurdle is that we wanted to run our Haskell code on Lambda, since this is much cheaper than using an EC2 instance: the server load is very bursty, with long periods (days up to weeks) of complete inactivity.
I wrote a bunch of the zureg code before some Haskell-on-Lambda solutions popped up, so it is all done from scratch – and it’s surprisingly short. However, if I were to start a new project, I would probably use one of these frameworks:
Converting zureg to use one of these frameworks is something I would like to look into at some point, if I find the time. The advantage of doing things from scratch, however, is that it serves the educational purposes of this blogpost very well!
Our entire serverless framework is currently contained in a single 138-line file.
From a bird’s eye view:
We define a docker image that’s based on Amazon Linux – this ensures we’re using the same base operating system and system libraries as Lambda, so our binary will work there.
We compile our code inside a docker container and copy out the resulting executable to the host.
We zip this up together with a python script that just forwards requests to the Haskell process.
We upload this zip to S3 and our cloudformation takes care of setting up the rest of the infrastructure.
I think this current situation is still pretty manageable since the application is so small; but porting it to something nicer like Nix is definitely on the table.
The data model is not too complex. We’re using an event sourcing approach: this means that our source of truth is really an append-only series of events rather than a traditional row in a database that we update. These events are stored as plain JSON, and we can define them in pure Haskell:
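The actual zureg event types are not reproduced in this excerpt, but the shape of event sourcing is easy to sketch. The types and constructors below are hypothetical stand-ins, not zureg's real schema:

```haskell
import Data.List (foldl')

-- Hypothetical stand-ins for illustration, not zureg's actual schema.
data Event
    = Registered String  -- participant name
    | Confirmed
    | Cancelled
    deriving (Show, Eq)

data State = Absent | Waiting String | Attending String | Gone
    deriving (Show, Eq)

-- The current state is never stored directly; it is recomputed by
-- folding over the append-only event log.
apply :: State -> Event -> State
apply _           (Registered n) = Waiting n
apply (Waiting n) Confirmed      = Attending n
apply _           Cancelled      = Gone
apply s           _              = s

currentState :: [Event] -> State
currentState = foldl' apply Absent

main :: IO ()
main = print (currentState [Registered "Ada", Confirmed])
```

Because the log is the source of truth, debugging an invalid state amounts to replaying and inspecting the events that led to it, which is exactly the property mentioned below.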
And then we just have a few handwritten functions in the database module:
This gives us a few things for free; most importantly if something goes wrong we can go in and check what events led the user to get into this invalid state.
This code is backed by the eventful and eventfuldynamodb libraries, in addition to some custom queries.
While our admins can interact with the system using the CLI tooling, registrants interact with the system using the webapp. The web application is powered by a fat lambda.
Using this web app, registrants can do a few things:
In addition to these routes used by participants, there’s a route used for ticket scans – which we’ll talk about next.
Now that we have participant tickets, we need some way to process them at the event itself.
scanner.js is a small JavaScript tool that does this for us. It uses the device’s webcam to scan QR codes – which is nice because this means we can use either phones, tablets or a laptop to scan tickets at the event, the device just needs a modern browser version. It’s built on top of jsQR.
The scanner intentionally doesn’t do much processing – it just displays a fullscreen video of the webcam and searches for a QR code using an external library. Once we get a hit for a QR code, we poll the lambda again to retrieve some information (participant name, TShirt size) and overlay that on top of the video.
This is useful because now the people working at the registration desk can see, as demonstrated in the image above, that I registered too late and therefore should only pick up a TShirt on the second day.
There is a lot of room for improvement, but the fact that it had zero technical issues during registration or the event makes me very happy. Off the top of my head, here are some TODOs for next years:
Other than that, there are some nonfunctional TODOs:
Any contributions in these areas are of course welcome!
Lastly, there’s the question of whether or not it makes sense for other events to use this. I discussed this briefly with Franz Thoma, one of the organizers of Munihac, who expressed similar gripes about eventbrite.
As it currently stands, zureg is not an off-the-shelf solution and requires some customization for your event – meaning it only really makes sense for Haskell events. On the other hand, there are a few people who prefer doing this over mucking around in settings dashboards that are hugely complicated but still do not provide the necessary customization.
I realize this is a bit creepy, and fortunately it turned out not to be necessary since we could do the custom confirmation flow.↩︎
In serverless terminology, it seems to be common to refer to lambdas that deal with more than one specific endpoint or purpose as “fat lambdas”. I think this distracts from the issue a bit, since it’s more important to focus on how the code works and whether or not you can reuse it rather than how it is deployed – but coming from a functional programming perspective I very much enjoy the sound of “fat lambda”.↩︎
If I was going to build a game, I knew I wanted it to be web-based – there was no doubt in my mind about this:
There are of course some downsides to web-based games as well. For me, the main disadvantage is that the dominant language is still JavaScript (which I am not a big fan of, to put it mildly).
Fortunately there are a good number of languages that compile down to JavaScript these days. The two big contenders were Haskell (through GHCJS) and PureScript (I would go as far as calling PureScript a Haskell dialect, since they are so similar).
The big advantage of using GHCJS is that you’re able to run Haskell on the backend and on the frontend, so you can share common code.
However, I wanted to write a simple game without any sort of backend (which, of course, makes it significantly easier to host as well). PureScript produces vastly smaller JavaScript files, and I wanted to learn the language a bit to see how it compares with Haskell, so I decided to give that a try.
I did not consider Elm because it’s a bit further removed from Haskell, and my main focus was still building a game; not learning a new language. I have heard a lot of good things about it though, so maybe that’s what I should try next.
One of the last games I played was the remake of the masterpiece Katamari Damacy on the Nintendo Switch.
Inspired by Katamari Damacy, I wanted to make a 2D version that had a similar feeling to it. I decided relatively quickly that the core mechanic of the game would be to put different kinds of objects together in bizarre ways, hopefully amusing people along the way.
With that in mind, I immediately focused on this core mechanic since I wanted to know whether it could actually be fun or not.
I started by doing a simple exhaustive search over all the ways you can overlay two sprites, minimizing the average colour distance. This worked remarkably well, and I didn’t end up fine-tuning the results much more after that.
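The exhaustive search itself is only a few lines. This sketch works on tiny grayscale sprites (lists of rows) rather than the game's real RGBA sprites, uses mean squared distance as the score, and all the names are made up:

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

type Sprite = [[Double]]  -- rows of grayscale values

-- Mean squared distance over the region where sprite b, shifted by
-- (dx, dy), overlaps sprite a; no overlap scores infinitely badly.
overlapScore :: Sprite -> Sprite -> (Int, Int) -> Double
overlapScore a b (dx, dy)
    | null ds   = 1 / 0
    | otherwise = sum ds / fromIntegral (length ds)
  where
    ds = [ (p - q) * (p - q)
         | (y, row) <- zip [0 ..] a
         , (x, p)   <- zip [0 ..] row
         , Just q   <- [pixel b (x - dx) (y - dy)]
         ]
    pixel s px py
        | py < 0 || py >= length s         = Nothing
        | px < 0 || px >= length (s !! py) = Nothing
        | otherwise                        = Just (s !! py !! px)

-- Try every offset and keep the best one.
bestOffset :: Sprite -> Sprite -> (Int, Int)
bestOffset a b = minimumBy (comparing (overlapScore a b))
    [ (dx, dy) | dx <- [-w .. w], dy <- [-h .. h] ]
  where
    h = length a
    w = maximum (0 : map length a)

main :: IO ()
main = print (bestOffset [[0, 1], [2, 3]] [[0, 1], [2, 3]])
```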
It did lead to some performance issues for larger sprites, so I fixed that by mipmapping: for larger sprites, I first do an exhaustive search at a much lower resolution, then I use those results to do a local search in that neighbourhood at higher resolutions. This is not guaranteed to give the best results, but that doesn’t matter too much for this game: we just want a good enough result.
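The coarse-to-fine part relies on downsampling. A minimal sketch of one mipmap level (a 2×2 box filter) on the same list-of-rows representation used above; this is my illustration, not the game's actual code:

```haskell
-- Halve a grayscale sprite by averaging 2x2 blocks. Searching
-- exhaustively at this resolution and then refining locally around
-- (2 * dx, 2 * dy) at full resolution gives the coarse-to-fine search.
downsample :: [[Double]] -> [[Double]]
downsample rows =
    [ zipWith (\(a, b) (c, d) -> (a + b + c + d) / 4) (pairs r1) (pairs r2)
    | (r1, r2) <- pairs rows
    ]
  where
    -- Group consecutive elements two by two, dropping a leftover one.
    pairs (x : y : rest) = (x, y) : pairs rest
    pairs _              = []

main :: IO ()
main = print (downsample [[0, 2], [4, 6]])
```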
I wanted to also try an approach based on simulated annealing but didn’t get around to it. If someone wants to try this, you’re more than welcome to make a contribution!
At this point, I was getting amusing results, but I wasn’t sure how to make this into a game yet. I didn’t want to make it into an action game, and felt like a puzzle game would fit better. Then, I realized the comedic effect would be even better if I combined the names of the different sprites as well.
This automatically adds a sort of puzzle mechanic to the game as well, since you can now only merge certain objects.
This brought me to the next obstacle – I knew I would need a large number of consistent sprites to use as art in the game. I browsed around opengameart.org for a bit, but did not really find anything promising. I also did not want to pay an artist, because I wanted to keep this a free game, without advertisements and the like.
Then it dawned on me that there already is a great collection of consistent sprites that even come with the names attached to them – emoji! I found the free EmojiOne set and started with that. But when I looked into it a bit, I found this weird snippet in their free licensing info:
3.4 What can’t you do with the JoyPixels/EmojiOne Properties under this agreement?
…
(I) Include properties in open source projects.
…
What nonsense is this? I am allowed to use it in my noncommercial project if I give attribution, but not if I want to have the option to open source my game?
This pissed me off and I started looking for alternatives. At that point, however, I already knew emoji were a good direction so it was easier. I ended up switching to Google’s Noto font. I liked the sprites a little bit less but at least the license made sense.
At this point I built a demo that simply allowed you to drag around a bunch of different objects and merge them. It was certainly amusing, but it did not really feel like a “game” to me yet. However, I shared this demo with a couple of people and they all really liked it. This was very encouraging.
The next weekend, I tried to turn this into a Tetris- or 2048-like puzzle game, but this ended up being very confusing and not that much fun. Ironically, the non-game was more fun!
So, I decided to go back to that and just add a very simple economy on top of it (buying and selling things) to make it a bit more interesting. After I added that, I was quite happy with the flow of the game.
The rules were still a bit unclear to people I showed it to (what things can you merge together?), so I added the hints at the top of the cards and an interactive tutorial.
In retrospect, I am happy with PureScript as a language and would recommend it if you’re looking into putting a simple no-backend web-based game together, and you already know Haskell.
There were a few issues I ran into with the language:
I still prefer lazy languages, and this bit me a few times. In particular, I wrote a few monadic recursive functions without being aware of the tailrec package. This caused stack overflows in my code, but I only saw these on my phone, which made it extremely hard to debug.
The error messages that the compiler emits are horrible at times. I feel like this is an area where I could contribute a bunch of code myself, but I’m not sure if I’ll ever have time for that.
There are also a lot of things I like:
Working with the FFI to call JavaScript is seamless and easy.
Halogen is an amazing framework that made building the UI trivial.
Once you figure out how to, the resulting JavaScript is actually very easy to debug using Firefox’s or Chromium’s developer tools.
The story of this library began with last year’s ICFP contest. For this contest, the goal was to build a program that orchestrates a number of nanobots to build a specific Minecraft-like structure, as efficiently as possible. I was in Japan at the time, working remotely from the Tsuru Capital office, and a group of them decided to take part in this contest.
I had taken part in the 2017 ICFP contest with them, but this year I was not able to work on this at all since the ICFP contest took place in the same weekend as my girlfriend’s birthday. We went to Fujikawaguchiko instead – which I would recommend to anyone interested in visiting the Fuji region. I ended up liking it more than Hakone, where I was a year or two ago.
Anyway, after the contest we were discussing how it went and Alex thought a key missing piece for them was a specific algorithm called dynamic connectivity. Because this is not a trivial algorithm to put together, we ended up using a less optimal version which still contained some bugs. In the weeks after the contest ended Alex decided to continue looking into this problem and we ended up putting this library together.
The dynamic connectivity problem is very simply explained to anyone who is at least a little familiar with graphs. It comes down to building a datastructure that allows adding and removing edges to a graph, and being able to answer the question “are these two vertices (transitively) connected” at any point in time.
This might remind you of the union-find problem. Union-find, after all, is a good solution to incremental dynamic connectivity. In this context, incremental means that edges may only be added, not removed. A situation where edges may be added and removed is sometimes referred to as fully dynamic connectivity.
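To make the contrast concrete, here is a minimal pure union-find sketch. This is illustrative only, not this library’s API (names like sameSet are made up), and a real implementation would add path compression and union by rank, which is where mutability pays off:

```haskell
import qualified Data.Map as M

-- Each vertex maps to a parent; roots map to themselves.
type UnionFind v = M.Map v v

-- Walk parent pointers up to the root.
root :: Ord v => UnionFind v -> v -> v
root uf v = case M.lookup v uf of
    Just p | p /= v -> root uf p
    _               -> v

-- Adding an edge: point one root at the other.
link :: Ord v => v -> v -> UnionFind v -> UnionFind v
link a b uf = M.insert (root uf a) (root uf b) uf

-- Connectivity: do both vertices share a root?
sameSet :: Ord v => UnionFind v -> v -> v -> Bool
sameSet uf a b = root uf a == root uf b
```

Deleting an edge is nowhere to be seen: once two sets are merged, this structure has no cheap way to split them again, which is exactly the gap fully dynamic connectivity fills.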
Like union-find, there is unfortunately no known persistent version of this algorithm without sacrificing some performance. An attempt was made [to create a fast, persistent union find] but I don’t think we can consider this successful in the Haskell sense of purity, since the structure proposed in that paper is inherently not thread-safe – which is one of the reasons to pursue persistence in the first place.
Anyway, this is why the library currently only provides a mutable interface.
The library uses the PrimMonad typeclass from the primitive library to ensure you can use our code both in IO and ST, where the latter lets us reclaim purity.

Let’s walk through a simple example of using the library in plain IO.
import qualified Data.Graph.Dynamic.Levels as GD
import qualified Data.Tree as T
main :: IO ()
main = do
    graph <- GD.empty'
Let’s consider a fictional map of Hawaiian islands.
    mapM_ (GD.insert_ graph)
        ["Akanu", "Kanoa", "Kekoa", "Kaiwi", "Onakea"]
    GD.link_ graph "Akanu" "Kanoa"
    GD.link_ graph "Akanu" "Kaiwi"
    GD.link_ graph "Akanu" "Onakea"
    GD.link_ graph "Kaiwi" "Onakea"
    GD.link_ graph "Onakea" "Kanoa"
    GD.link_ graph "Kanoa" "Kekoa"
The way the algorithm works is by keeping a spanning forest at all times. That way we can quickly answer connectivity questions: if two vertices belong to the same tree (i.e., they share the same root), they are connected.
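As a toy illustration of that idea – not the library’s internals, which use cleverer structures to avoid scanning – a naive query against an explicit Data.Tree spanning forest could look like this (connectedNaive is a made-up name):

```haskell
import qualified Data.Tree as T

-- Two vertices are connected iff some spanning tree contains both.
-- This is an O(n) scan per query; the point of the real datastructure
-- is to answer the same question in logarithmic time.
connectedNaive :: Eq a => T.Forest a -> a -> a -> Bool
connectedNaive forest x y = any (\t -> x `elem` t && y `elem` t) forest
```

The `elem` calls work because Data.Tree’s Tree has a Foldable instance.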
For example, can we take ferries from Kaiwi to Kekoa? The following statement prints True.
GD.connected graph "Kaiwi" "Kekoa" >>= print
Such a question, however, could have been answered by a simpler algorithm such as union-find, which we mentioned before. Union-find is more than appropriate if edges can only be added to a graph, but it cannot handle cases where we want to delete edges. Let’s do just that:
GD.cut_ graph "Kaiwi" "Akanu"
In a case such as the one above, where the deleted edge is not part of the spanning forest, not much interesting happens, and the overall connectivity is not affected in any way.
However, it gets interesting when we delete an edge that is part of the spanning tree. When that happens, we kick off a search to find a “replacement edge” in the graph that can restore the spanning tree.
GD.cut_ graph "Onakea" "Akanu"
In our example, we can replace the deleted Akanu – Onakea edge with the Kanoa – Onakea edge. Finding a replacement edge is unsurprisingly the hardest part of the problem, and a sufficiently efficient algorithm was only described in 1998 by Holm, de Lichtenberg and Thorup in this paper.
The algorithm is a little complex, but the paper is well-written, so I’ll just stick with a very informal and hand-wavy explanation here:
If an edge is cut from the spanning forest, then this turns one spanning tree in the forest into two components.
The algorithm must consider all edges between these two components to find a replacement edge. This can be done by looking at all the edges adjacent to the smaller of the two components.
Reasonable amortized complexity, O(log² n), is achieved by “punishing” edges that are considered but not taken, so we will consider them less frequently in subsequent calls.
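In its most naive form, and ignoring the level structure that gives the amortized bound, that search is just a scan for an edge with exactly one endpoint in the smaller component. A hypothetical sketch (findReplacement is made up, not part of the library):

```haskell
import qualified Data.Set as S
import Data.List (find)

-- A replacement edge is any edge crossing between the two components:
-- exactly one of its endpoints lies in the smaller side.
findReplacement :: Ord v => S.Set v -> [(v, v)] -> Maybe (v, v)
findReplacement smaller =
    find (\(a, b) -> (a `S.member` smaller) /= (b `S.member` smaller))
```

The real algorithm never rescans the same candidate edges naively like this; the level-based “punishment” above is what turns this into the O(log² n) amortized bound.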
Back to our example. When we go on to delete the Onakea – Kanoa edge, we cannot find a replacement edge, and we are left with a spanning forest with two components.
GD.cut_ graph "Onakea" "Kanoa"
We can confirm this by asking the library for the spanning forest and then using the very handy drawForest from Data.Tree to visualize it:
GD.spanningForest graph >>= putStr . T.drawForest
This prints:
Kanoa
|
+- Akanu
|
`- Kekoa

Onakea
|
`- Kaiwi
Let’s restore connectivity to leave things in proper working order for the residents of our fictional island group, before closing the blogpost.
GD.link_ graph "Akanu" "Kaiwi"
As finishing words, what are some future directions for this library? One of the authors of the original paper, M. Thorup, wrote a follow-up that improves the theoretical space and time complexity a little. This seems to punish us with bad constant factors in terms of time performance – but it is probably still worth finishing because it uses significantly less memory. Contributions, as always, are welcome. :)
Update: I gave a talk about this blogpost at the Haskell eXchange 2018 on the 11th of October 2018. You can watch the video here. Note that you will need to create an account on the Skills Matter website in order to watch the recording.
This post makes a bit of a departure from the “practical Haskell” I usually try to write about, although – believe it or not – this blogpost actually has a very practical origin ^{1}.
This blogpost is a literate Haskell file, which means you can just download it here and load it into GHCi to play around with it. In this case, you can also verify the properties we will be talking about (yes, GHC as a proof checker). Since we are dipping our toes into dependent types territory here, we will need to enable some extensions that are definitely a bit more on the advanced side.
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
Since the goal of this blogpost is mainly educational, we will only use a few standard modules and generally define things ourselves. This also helps us to show that there is no magic going on behind the scenes: all term-level functions in this file are total and compile fine with -Wall.
import Data.List (intercalate, minimumBy)
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NonEmpty
import Data.Ord (comparing)
I assume most readers will be at least somewhat familiar with the standard length-indexed list:
data Nat = Zero  Succ Nat deriving (Show)
data Vec (n :: Nat) a where
    VNull :: Vec 'Zero a
    VCons :: a -> Vec n a -> Vec ('Succ n) a
These vectors carry their length in their types. In GHCi:
*Main> :t VCons "Hello" (VCons "World" VNull)
VCons "Hello" (VCons "World" VNull) :: Vec ('Succ ('Succ 'Zero)) [Char]
This blogpost defines a similar way to deal with binomial heaps. Binomial heaps are one of my favorite data structures because of their simple elegance and the fascinating way their structure corresponds to binary numbers.
We will combine the idea of Peano number-indexed lists with the idea that binomial heaps correspond to binary numbers to lift binary numbers to the type level. This is great because we get O(log(n)) size and time in places where we would see O(n) for the Peano numbers defined above (in addition to being insanely cool). In GHCi:
*Main> :t pushHeap 'a' $ pushHeap 'b' $ pushHeap 'c' $
pushHeap 'd' $ pushHeap 'e' emptyHeap
  :: Heap ('B1 ('B0 ('B1 'BEnd))) Char
Where 101 ^{2} is, of course, the binary representation of the number 5.
Conveniently, 101 also represents the basics of a subject. So the title of this blogpost works on two levels, and we present an introductory-level explanation of a nontrivial (and again, insanely cool) example of dependent Haskell programming.
If I perform an appropriate amount of handwaving and squinting, I feel like there are two ways to work with these stronger-than-usual types in Haskell. We can either make sure things are correct by construction, or we can come up with a proof that they are in fact correct.
The former is the simpler approach we saw in the Vec snippet: by using the constructors provided by the GADT, our constraints are always satisfied. The latter builds on the singletons approach introduced by Richard Eisenberg and Stephanie Weirich.
We need both approaches for this blogpost. We assume that the reader is somewhat familiar with the first one, and in this section we will give a brief introduction to the second one. It is in no way intended to be a full tutorial; we just want to give enough context to understand the code in the blogpost.
If we consider a closed type family for addition of natural numbers (we are using an N prefix since we will later use B for addition of binary numbers):
type family NAdd (x :: Nat) (y :: Nat) :: Nat where
NAdd ('Succ x) y = 'Succ (NAdd x y)
NAdd 'Zero y = y
We can trivially define the following function:
data Proxy a = Proxy
cast01 :: Proxy (NAdd 'Zero x) -> Proxy x
cast01 = id
NAdd 'Zero x is easily reduced to x by GHC since it is simply a clause of the type family, so it accepts the definition id. However, if we try to write
cast02 :: Proxy (NAdd x 'Zero) -> Proxy x
cast02 = id
We run into trouble, and GHC will tell us:
Couldn't match type ‘x’ with ‘NAdd x 'Zero’
We will need to prove to GHC that these two types are equal – commutativity doesn’t come for free! This can be done by providing evidence for the equality by way of a GADT constructor ^{3}.
data EqualityProof (a :: k) (b :: k) where
QED :: EqualityProof a a
type a :~: b = EqualityProof a b
Take a minute to think about the implications this GADT has – if we can construct a QED value, we are actually providing evidence that the two types are equal. We assume that the two types (a and b) have the same kind k ^{4}.
The QED constructor lives on the term level though, not on the type level. We must synthesize this constructor using a term-level computation. This means we need a term-level representation of our natural numbers as well. This is the idea behind singletons and again, a much better explanation is available in said paper and some talks, but I wanted to at least provide some intuition here.
The singleton for Nat is called SNat and it’s easy to see that each Nat has a unique SNat and the other way around:
data SNat (n :: Nat) where
    SZero :: SNat 'Zero
    SSucc :: SNat n -> SNat ('Succ n)
We can use such an SNat to define a proof for what we are trying to accomplish. Since this proof can be passed any n in the form of an SNat, it must be correct for all n.
lemma1 :: SNat n -> NAdd n 'Zero :~: n
GHC can figure out the base case on its own by reducing NAdd 'Zero 'Zero to 'Zero:
lemma1 SZero = QED
And we can use induction to complete the proof. The important trick here is that in the body of the pattern match on EqualityProof a b, GHC knows that a is equal to b.
lemma1 (SSucc n) = case lemma1 n of QED -> QED
This can be used to write cast02:
cast02 :: SNat x -> Proxy (NAdd x 'Zero) -> Proxy x
cast02 snat = case lemma1 snat of QED -> id
cast02 takes an extra parameter and there are several ways to synthesize this value. The common one is a typeclass that can give us an SNat x from a Proxy x. In this blogpost however, we keep things simple and make sure we always have the right singletons on hand by passing them around in a few places. In other words: don’t worry about this for now.
A binomial heap consists of zero or more binomial trees. I will quote the text from the Wikipedia article here since I think it is quite striking how straightforward the definition translates to GADTs that enforce the structure:
data Tree (k :: Nat) a where
    Tree :: a -> Children k a -> Tree k a

data Children (k :: Nat) a where
    CZero :: Children 'Zero a
    CCons :: Tree k a -> Children k a -> Children ('Succ k) a
Some illustrations to make this a bit more clear:
This is definitely a very good example of the correctness by construction approach I talked about earlier: it is simply impossible to create a tree that does not have the right shape.
Empty trees do not exist according to this definition. A singleton tree is easy to create:
singletonTree :: a -> Tree 'Zero a
singletonTree x = Tree x CZero
We only need to define one operation on trees, namely merging two trees.
A tree of order k has 2ᵏ elements, so it makes sense that merging two trees of order k creates a tree of order k+1. We can see this in the type signature as well:
mergeTree :: Ord a => Tree k a -> Tree k a -> Tree ('Succ k) a
Concretely, we construct the new tree by taking either the left or the right tree and attaching it as new child to the other tree. Since we are building a heap to use as a priority queue, we want to keep the smallest element in the root of the new tree.
mergeTree l@(Tree lroot lchildren) r@(Tree rroot rchildren)
    | lroot <= rroot = Tree lroot (CCons r lchildren)
    | otherwise      = Tree rroot (CCons l rchildren)
With these trees defined, we can move on to binomial heaps.
While binomial trees are interesting on their own, they can really only represent collections that have a number of elements that are exactly a power of two.
Binomial heaps solve this in a surprisingly simple way. A binomial heap is a collection of binomial trees where we may only have at most one tree for every order.
This is where the correspondence with binary numbers originates. If we have a binomial heap with 5 elements, the only way to do this is to have binomial trees of orders 2 and 0 (2² + 2⁰ = 5).
We start out by defining a simple datatype that will be lifted to the kind level, just as we did with Nat:
data Binary
    = B0 Binary
    | B1 Binary
    | BEnd
    deriving (Show)
It’s important to note that we will represent binary numbers in a right-to-left order since this turns out to match up more naturally with the way we will be defining heaps.
For example, the type:
'B0 ('B1 ('B1 'BEnd))
represents the number 6 (conventionally written 110).
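To double-check the convention, here is a term-level decoder on a value-level copy of the datatype (this mirror is a sketch for illustration, not part of the post’s code):

```haskell
-- Value-level mirror of the lifted Binary type.
data Binary = B0 Binary | B1 Binary | BEnd deriving (Show, Eq)

-- Right-to-left: the outermost constructor is the least significant bit.
binToInt :: Binary -> Int
binToInt BEnd   = 0
binToInt (B0 b) = 2 * binToInt b
binToInt (B1 b) = 1 + 2 * binToInt b
```

For example, binToInt (B0 (B1 (B1 BEnd))) evaluates to 6.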
I think it is fairly common in Haskell for a developer to play around with different ways of representing a certain thing until they converge on an elegant representation. This is many, many times more important when we are dealing with dependently-typed Haskell.
Inelegant and awkward data representations can make termlevel programming clunky. Inelegant and awkward type representations can make typelevel programming downright infeasible due to the sheer amount of lemmas that need to be proven.
Consider the relative elegance of defining a type family for incrementing a binary number that is read from the right to the left:
type family BInc (binary :: Binary) :: Binary where
BInc 'BEnd = 'B1 'BEnd
BInc ('B0 binary) = 'B1 binary
BInc ('B1 binary) = 'B0 (BInc binary)
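The elegance shows at the term level too: a value-level mirror of BInc (assumed here for illustration, not part of the post’s module) follows the same three clauses, with the last clause performing the carry:

```haskell
data Binary = B0 Binary | B1 Binary | BEnd deriving (Show, Eq)

-- Mirrors the BInc type family clause by clause.
binc :: Binary -> Binary
binc BEnd     = B1 BEnd       -- 0 + 1 = 1
binc (B0 bin) = B1 bin        -- even + 1: flip the lowest bit
binc (B1 bin) = B0 (binc bin) -- odd + 1: carry into the higher bits
```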
Appendix 3 contains an (unused) implementation of incrementing left-to-right binary numbers. Getting things like this to work is not too much of a stretch these days (even though GHC’s error messages can be very cryptic). However, due to the large amount of type families involved, proving things about it presumably requires ritually sacrificing an inappropriate amount of Agda programmers while chanting Richard Eisenberg’s writings.
To that end, it is almost always worth spending time finding alternate representations that work out more elegantly. This can lead to some arbitrary-looking choices – we will see this in full effect when trying to define CutTree further below.
Addition is not too hard to define:
type family BAdd (x :: Binary) (y :: Binary) :: Binary where
BAdd 'BEnd y = y
BAdd x 'BEnd = x
BAdd ('B0 x) ('B0 y) = 'B0 (BAdd x y)
BAdd ('B1 x) ('B0 y) = 'B1 (BAdd x y)
BAdd ('B0 x) ('B1 y) = 'B1 (BAdd x y)
BAdd ('B1 x) ('B1 y) = 'B0 (BInc (BAdd x y))
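Likewise, a value-level mirror of BAdd (again an illustrative sketch, reusing the mirrored increment for the carry in the final clause) shows the same six cases:

```haskell
data Binary = B0 Binary | B1 Binary | BEnd deriving (Show, Eq)

-- Value-level increment, as in BInc.
binc :: Binary -> Binary
binc BEnd     = B1 BEnd
binc (B0 bin) = B1 bin
binc (B1 bin) = B0 (binc bin)

-- Mirrors BAdd: 1 + 1 = 0, carry into the higher bits.
badd :: Binary -> Binary -> Binary
badd BEnd y = y
badd x BEnd = x
badd (B0 x) (B0 y) = B0 (badd x y)
badd (B1 x) (B0 y) = B1 (badd x y)
badd (B0 x) (B1 y) = B1 (badd x y)
badd (B1 x) (B1 y) = B0 (binc (badd x y))
```

Adding the mirrors of 4 and 5 yields the mirror of 9, matching the :kind! output below.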
Let’s quickly define a number of examples:
type BZero = 'B0 'BEnd
type BOne = BInc BZero
type BTwo = BInc BOne
type BThree = BInc BTwo
type BFour = BInc BThree
type BFive = BInc BFour
This allows us to play around with it in GHCi:
*Main> :set -XDataKinds
*Main> :kind! BAdd BFour BFive
BAdd BFour BFive :: Binary
= 'B1 ('B0 ('B0 ('B1 'BEnd)))
Finally, we define a corresponding singleton to use later on:
data SBin (b :: Binary) where
    SBEnd :: SBin 'BEnd
    SB0   :: SBin b -> SBin ('B0 b)
    SB1   :: SBin b -> SBin ('B1 b)
Our heap will be a relatively simple wrapper around a recursive type called Forest. This datastructure follows the definition of the binary numbers fairly closely, which makes the code in this section surprisingly easy, and we end up requiring no lemmas or proofs whatsoever.
A Forest k b refers to a number of trees starting with (possibly) a tree of order k. The b is the binary number that indicates the shape of the forest – i.e., whether we have a tree of a given order or not.
Using a handwavy but convenient notation, this means that Forest 3 101 refers to binomial trees of order 3 and 5 (and no tree of order 4).
data Forest (k :: Nat) (b :: Binary) a where
    FEnd :: Forest k 'BEnd a
    F0   :: Forest ('Succ k) b a -> Forest k ('B0 b) a
    F1   :: Tree k a -> Forest ('Succ k) b a -> Forest k ('B1 b) a
Note that we list the trees in increasing order here, which contrasts with Children, where we listed them in decreasing order. You can see this in the way we are removing layers of 'Succ as we add more constructors. This is the opposite of what happens in Children.
The empty forest is easily defined:
emptyForest :: Forest k 'BEnd a
emptyForest = FEnd
insertTree inserts a new tree into the forest. This might require merging two trees together – roughly corresponding to carrying in the binary increment operation.
insertTree :: Ord a
           => Tree k a -> Forest k b a
           -> Forest k (BInc b) a
insertTree s FEnd     = F1 s FEnd
insertTree s (F0 f)   = F1 s f
insertTree s (F1 t f) = F0 (insertTree (mergeTree s t) f)
Similarly, merging two forests together corresponds to adding two binary numbers together:
mergeForests :: Ord a
             => Forest k lb a -> Forest k rb a
             -> Forest k (BAdd lb rb) a
mergeForests FEnd rf = rf
mergeForests lf FEnd = lf
mergeForests (F0 lf) (F0 rf)   = F0 (mergeForests lf rf)
mergeForests (F1 l lf) (F0 rf) = F1 l (mergeForests lf rf)
mergeForests (F0 lf) (F1 r rf) = F1 r (mergeForests lf rf)
mergeForests (F1 l lf) (F1 r rf) =
    F0 (insertTree (mergeTree l r) (mergeForests lf rf))
It’s worth seeing how the different branches in insertTree and mergeForests match up almost 1:1 with the different clauses in the definition of the type families BInc and BAdd. If we overlay them visually:
That is the intuitive explanation as to why no additional proofs or type-level trickery are required here.
Here is an informal illustration of what happens when we don’t need to merge any trees. The singleton Forest on the left is simply put in the empty F0 spot on the right.
When there is already a tree there, we merge the trees using mergeTree and carry that, in a very similar way to how carrying works in the addition of binary numbers:
The Forest structure is the main workhorse and Heap is just a simple wrapper on top of that, where we start out with a tree of order 0:
newtype Heap (b :: Binary) a = Heap {unHeap :: Forest 'Zero b a}
The operations on Heap are also simple wrappers around the previously defined functions:
emptyHeap :: Heap 'BEnd a
emptyHeap = Heap emptyForest

pushHeap :: Ord a => a -> Heap b a -> Heap (BInc b) a
pushHeap x (Heap forest) = Heap (insertTree (singletonTree x) forest)

mergeHeap :: Ord a => Heap lb a -> Heap rb a -> Heap (BAdd lb rb) a
mergeHeap (Heap lf) (Heap rf) = Heap (mergeForests lf rf)
We are now ready to show this off in GHCi again:
*Main> :t pushHeap 'a' $ pushHeap 'b' $ pushHeap 'c' $
pushHeap 'd' $ pushHeap 'e' emptyHeap
  :: Heap ('B1 ('B0 ('B1 'BEnd))) Char
We can also take a look at the internals of the datastructure using a custom Show instance provided in appendix 2:
*Main> pushHeap 'a' $ pushHeap 'b' $ pushHeap 'c' $
pushHeap 'd' $ pushHeap 'e' emptyHeap
(tree of order 0)
'a'
(no tree of order 1)
(tree of order 2)
'b'
'd'
'e'
'c'
Neat!
I think it’s interesting that we have implemented an append-only heap without even requiring any lemmas so far. It is perhaps a good illustration of how append-only datastructures are conceptually much simpler.
Things get significantly more complicated when we try to implement popping the smallest element from the queue. For reference, I implemented the current heap in a couple of hours, whereas I worked on the rest of the code on and off for about a week.
Let’s look at a quick illustration of how popping works.
We first select the tree with the smallest root and remove it from the heap:
We break up the tree we selected into its root (which will be the element that is “popped”) and its children, which we turn into a new heap.
We merge the remainder heap from step 1 together with the new heap we made out of the children of the removed tree:
The above merge requires carrying twice.
We will start by implementing step 2 of the algorithm above since it is a bit easier. In this step, we are taking all children from a tree and turning that into a new heap.
We need to keep all our invariants intact, and in this case this means tracking them in the type system. A tree of order k has 2ᵏ elements. If we remove the root, we have k children trees with 2ᵏ - 1 elements in total. Every child becomes a tree in the new heap. This means that the heap contains k full trees, and its shape will be written as k “1”s. This matches our math: if you write k “1”s, you get the binary notation of 2ᵏ - 1.
Visually:
We introduce a type family for computing n “1”s:
type family Ones (n :: Nat) :: Binary where
Ones 'Zero = 'BEnd
Ones ('Succ n) = 'B1 (Ones n)
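The arithmetic claim is easy to spot-check with value-level mirrors (illustrative sketches, not part of the post’s module):

```haskell
data Binary = B0 Binary | B1 Binary | BEnd deriving (Show, Eq)

-- k trailing 1-bits: the shape of a forest of k full trees.
ones :: Int -> Binary
ones 0 = BEnd
ones k = B1 (ones (k - 1))

-- Right-to-left decoder: the outermost constructor is the lowest bit.
binToInt :: Binary -> Int
binToInt BEnd   = 0
binToInt (B0 b) = 2 * binToInt b
binToInt (B1 b) = 1 + 2 * binToInt b
```

binToInt (ones k) gives 2ᵏ - 1 for any k, e.g. ones 4 decodes to 15.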
We will use a helper function childrenToForest_go to maintain some invariants. The wrapper childrenToForest is trivially defined but its type tells us a whole deal:
childrenToForest :: Children n a
                 -> Forest 'Zero (Ones n) a
childrenToForest children =
    childrenToForest_go SZero (childrenSingleton children) FEnd children
We use childrenSingleton to obtain a singleton for n.
childrenSingleton :: Children n a -> SNat n
childrenSingleton CZero       = SZero
childrenSingleton (CCons _ c) = SSucc (childrenSingleton c)
The tricky bit is that the list of trees in Children has them in descending order, and we want them in ascending order in Forest. This means we will have to reverse the list.
We can reverse a list easily using an accumulator in Haskell. In order to maintain the type invariants at every step, we will increase the size of the accumulator as we decrease the size of the children. This can be captured by requiring that their sum remains equal (m ~ NAdd x n).
childrenToForest_go :: m ~ NAdd x n
                    => SNat x
                    -> SNat n
                    -> Forest n (Ones x) a
                    -> Children n a
                    -> Forest 'Zero (Ones m) a
childrenToForest_go xnat _snat@SZero acc CZero =
I will not always go into detail on how the lemmas apply but let’s do it here nonetheless.
For the base case, we simply want to return our accumulator. However, our accumulator has the type Forest n (Ones x) and we expect something of the type Forest n (Ones m). Furthermore, we know that:

n ~ 'Zero, m ~ NAdd x n
⊢ m ~ NAdd x 'Zero

We need to prove that x ~ m in order to do the cast from Forest n (Ones x) to Forest n (Ones m).
We can do so by applying lemma1 to x (the latter represented here by xnat). This gives us lemma1 xnat :: NAdd x 'Zero :~: x. Combining this with what we already knew:

m ~ NAdd x 'Zero, NAdd x 'Zero ~ x
⊢ m ~ x

…which is what we needed to know.
    case lemma1 xnat of QED -> acc
The inductive case is a bit harder and requires us to prove that:

m ~ NAdd x n, n ~ 'Succ k
⊢ Ones m ~ 'B1 (Ones (NAdd x k))
GHC does a great job and ends up with something like:
Ones (NAdd x (Succ k)) ~ 'B1 (Ones (NAdd x k))
Which only requires us to prove commutativity on NAdd. You can see that proof in lemma2 a bit further below. This case also illustrates well how we carry around the singletons as inputs for our lemmas and call on them when required.
childrenToForest_go xnat (SSucc nnat) acc (CCons tree children) =
    case lemma2 xnat nnat of
        QED -> childrenToForest_go
            (SSucc xnat)
            nnat
            (F1 tree acc)
            children
Proving lemma2 is trivial… once you figure out what you need to prove and how all of this works.
It took me a good amount of time to put the different pieces together in my
head. It is not only a matter of proving the lemma: restructuring the code in
childrenToForest_go
leads to different lemmas you can attempt to prove, and
figuring out which ones are feasible is a big part of writing code like this.
lemma2 :: SNat n -> SNat m -> NAdd n ('Succ m) :~: 'Succ (NAdd n m)
lemma2 SZero     _ = QED
lemma2 (SSucc n) m = case lemma2 n m of QED -> QED
These are some minor auxiliary functions we need to implement on Vec. We mention them here because we’ll also need two type families dealing with non-zeroness.
First, we need some sort of map, and we can do this by implementing the Functor typeclass.
instance Functor (Vec n) where
fmap _ VNull = VNull
fmap f (VCons x v) = VCons (f x) (fmap f v)
Secondly, we need a very simple function to convert a Vec to a list. Note that this erases the information we have about the size of the list.
vecToList :: Vec n a -> [a]
vecToList VNull       = []
vecToList (VCons x v) = x : vecToList v
Using vecToList, we can build a function to convert a non-empty Vec to a NonEmpty list. This uses an additional NNonZero type family.
vecToNonEmpty :: NNonZero n ~ 'True => Vec n a -> NonEmpty a
vecToNonEmpty (VCons x v) = x :| vecToList v
type family NNonZero (n :: Nat) :: Bool where
NNonZero 'Zero = 'False
NNonZero ('Succ _) = 'True
Nonzeroness can be defined on binary numbers as well:
type family BNonZero (b :: Binary) :: Bool where
BNonZero 'BEnd = 'False
BNonZero ('B1 b) = 'True
BNonZero ('B0 b) = BNonZero b
You might be asking why we cannot use a simpler type, such as:
vecToNonEmpty :: Vec ('Succ n) a -> NonEmpty a
If we use this, we run into trouble when trying to prove that a Vec is not empty later on. We would have to construct a singleton for n, and we only have something that looks a bit like ∃n. 'Succ n. Trying to get the n out of that requires some form of non-zeroness constraint… which would be exactly what we are trying to avoid by using the simpler type. ^{5}
The minimal element will always be the root of one of our trees. That means we have as many choices for our minimal element as there are trees in our heap. We need some way to write down this number as a type.
Since we have a tree for every 1 in our binary number, we can define the number of trees as the popcount of the binary number.
In a weird twist of fate, you can also pretend this stands for “the count of trees which we can pop”, which is exactly what we will be using it for.
type family Popcount (b :: Binary) :: Nat where
Popcount 'BEnd = 'Zero
Popcount ('B1 b) = 'Succ (Popcount b)
Popcount ('B0 b) = Popcount b
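Again, a value-level mirror (illustrative only) makes the intent plain – for the five-element heap from earlier, shape 'B1 ('B0 ('B1 'BEnd)), we get two poppable trees:

```haskell
data Binary = B0 Binary | B1 Binary | BEnd deriving (Show, Eq)

-- Mirrors the Popcount type family: count the 1-bits,
-- i.e. the number of trees in a heap of this shape.
popcount :: Binary -> Int
popcount BEnd   = 0
popcount (B1 b) = 1 + popcount b
popcount (B0 b) = popcount b
```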
Popcount can be used to relate the non-zeroness of a natural number and the non-zeroness of a binary number.
lemma3 :: BNonZero b ~ 'True
       => SBin b
       -> NNonZero (Popcount b) :~: 'True
lemma3 (SB1 _) = QED
lemma3 (SB0 b) = case lemma3 b of QED -> QED
In addition to caring about the popcount of a binary number, we are sometimes interested in its width (number of bits). This is also easily captured in a type family:
type family Width (binary :: Binary) :: Nat where
Width 'BEnd = 'Zero
Width ('B0 binary) = 'Succ (Width binary)
Width ('B1 binary) = 'Succ (Width binary)
That is a fair amount of type families so far. To make things a bit more clear, here is an informal visual overview of all the type families we have defined, including BDec (binary decrement, defined further below).
Now, popping the smallest element from the heap first involves cutting a single tree from the forest inside the heap. We take the root of that tree and merge the children of the tree back together with the original heap.
However, just selecting (and removing) a single tree turns out to be quite an endeavour on its own. We define an auxiliary GADT which holds the tree, the remainder of the heap, and most importantly, a lot of invariants.
Feel free to scroll down to the datatype from here if you are willing to assume the specific constraint and types are there for a reason.
The first two fields are simply evidence singletons that we carry about. k stands for the same concept as in Forest; it means we are starting with an order of k. x stands for the index of the tree that was selected.
This means the tree that was selected has an order of NAdd k x, as we can see in the third field. If the remainder of the heap is Forest k b a, its shape is denoted by b and we can reason about the shape of the original heap.
The children of the tree (Tree (NAdd k x) a) that was selected will convert to a heap of shape Ones x. We work backwards from that to try and write down the type for the original heap. The tree (Tree (NAdd k x) a) would form a singleton heap of shape BInc (Ones x). The remainder (i.e., the forest with this tree removed) had shape b, so we can deduce that the original shape of the forest must have been BAdd b (BInc (Ones x)).
Finally, we restructure the type in that result to BInc (BAdd b (Ones x)). The restructuring is trivially allowed by GHC since it just requires applying the necessary type families. The restructured type turns out to be more easily usable in the places where we case-analyse CutTree further down in this blogpost.
We also carry a constraint here that seems very arbitrary and relates the widths of two binary numbers. It is easier to understand from an intuitive point of view: the new (merged) heap has the same width as the original heap. Why is it here?
Well, it turns out we will need this fact further down in a function definition. If we can conclude it here by construction in the GADT, we avoid having to prove it further down.
Of course, I know that I will need this further down because I already have the code compiling. When writing this, there is often a very, very painful dialogue in between different functions and datatypes, where you try to mediate by making the requested and expected types match by bringing them closer together step by step. In the end, you get a monstrosity like:
data CutTree (k :: Nat) (b :: Binary) a where
    CutTree
        :: Width (BAdd b (Ones x)) ~ Width (BInc (BAdd b (Ones x)))
        => SNat x
        -> SNat k
        -> Tree (NAdd k x) a
        -> Forest k b a
        -> CutTree k (BInc (BAdd b (Ones x))) a
Fortunately, this type is internal only and doesn’t need to be exported.
lumberjack_go is the worker function that takes all possible trees out of a heap. For every 1 in the shape of the heap, we have a tree: therefore it should not be a surprise that the length of the resulting vector is Popcount b.
lumberjack_go :: forall k b a.
       SNat k
    -> Forest k b a
    -> Vec (Popcount b) (CutTree k b a)
The definition is recursive and a good example of how recursion corresponds with
inductive proofs (we’re using lemma1
and lemma2
here). We don’t go into
much detail with our explanation here – this code is often hard to write, but
surprisingly easy to read.
lumberjack_go _ FEnd = VNull
lumberjack_go nnat0 (F0 forest0) = fmap
    (\cutTree -> case cutTree of
        CutTree xnat (SSucc nnat) t1 forest1 -> CutTree
            (SSucc xnat)
            nnat
            (case lemma2 nnat xnat of QED -> t1)
            (F0 forest1))
    (lumberjack_go (SSucc nnat0) forest0)
lumberjack_go nnat0 (F1 tree0 forest0) = VCons
    (CutTree
        SZero
        nnat0
        (case lemma1 nnat0 of QED -> tree0)
        (F0 forest0))
    (fmap
        (\cutTree -> case cutTree of
            CutTree xnat (SSucc nnat) t1 forest1 -> CutTree
                (SSucc xnat)
                nnat
                (case lemma2 nnat xnat of QED -> t1)
                (F1 tree0 forest1))
        (lumberjack_go (SSucc nnat0) forest0))
Now that we can select Popcount b trees, it’s time to convert this to something more convenient to work with. We will use a NonEmpty to represent our list of candidates to select from.
lumberjack :: forall b a. BNonZero b ~ 'True
           => Forest 'Zero b a
           -> NonEmpty.NonEmpty (CutTree 'Zero b a)
First, we select the Popcount b trees:
lumberjack trees =
    let cutTrees :: Vec (Popcount b) (CutTree 'Zero b a)
        cutTrees = lumberjack_go SZero trees in
Then we convert it to a NonEmpty. This requires us to call lemma3 (the proof that relates non-zeroness of a binary number with non-zeroness of a natural number through popcount). We need an appropriate SBin to call lemma3, and the auxiliary function forestSingleton defined just below does that for us.
    case lemma3 (forestSingleton trees :: SBin b) of
        QED -> vecToNonEmpty cutTrees
This function is similar to childrenSingleton – it constructs an appropriate singleton we can use in proofs.
forestSingleton :: Forest k b a -> SBin b
forestSingleton FEnd = SBEnd
forestSingleton (F0 t) = SB0 (forestSingleton t)
forestSingleton (F1 _ t) = SB1 (forestSingleton t)
We can now find all trees in the heap that may be cut. They are returned in a CutTree datatype. If we assume that we are taking a specific CutTree, we can take the root from the tree inside this datatype, and we can construct a new heap from its children using childrenToForest. Then, we merge it back together with the original heap.

The new heap has one less element – hence we use BDec (binary decrement, defined just a bit below).
popForest
    :: forall a b. Ord a
    => CutTree 'Zero b a
    -> (a, Forest 'Zero (BDec b) a)
We deconstruct the CutTree to get the root (x) of the selected tree, the children of the selected tree (children), and the remaining trees in the heap (forest).
popForest (CutTree _xnat _nnat
        (Tree x (children :: Children r a))
        (forest :: Forest 'Zero l a)) =
We construct a new forest from the children.
let cforest = childrenToForest children
We merge it with the remainder of the heap:
    merged :: Forest 'Zero (BAdd l (Ones r)) a
    merged = mergeForests forest cforest
The illustration from above applies here:
Now, we cast it to the result using a new lemma4 with a singleton that we construct from the trees:
    evidence :: SBin (BAdd l (Ones r))
    evidence = forestSingleton merged in
    (x, case lemma4 evidence of QED -> merged)
This is the type family for binary decrement. It is partial, as expected – you cannot decrement zero. This is a bit unfortunate but necessary. Having the BNonZero type family and using it as a constraint will solve that, though.
type family BDec (binary :: Binary) :: Binary where
BDec ('B1 b) = 'B0 b
BDec ('B0 b) = 'B1 (BDec b)
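At the term level, the same decrement can be sketched on plain lists of bits, least significant bit first (the Bit type and bdec here are illustrative only, not part of the heap code). The borrow travels in exactly the same direction as BDec’s recursion:

```haskell
data Bit = O | I deriving (Show, Eq)

-- Decrement a binary number stored least-significant-bit first.
-- Like BDec, this is partial: decrementing zero is an error.
bdec :: [Bit] -> [Bit]
bdec (I : b) = O : b          -- ...1 becomes ...0, and we are done
bdec (O : b) = I : bdec b     -- borrow from the more significant digits
bdec []      = error "cannot decrement zero"
```

For example, bdec [O, O, I] (four) evaluates to [I, I, O] (three).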
The weirdly specific lemma4 helps us prove that we can take a number, increment it and then decrement it, and get the same number back, provided incrementing doesn’t change its width. This ends up matching perfectly with the width constraint generated by the CutTree, where the number that we increment is a number of “1”s smaller than the shape of the total heap (intuitively). Using another constraint in CutTree with another proof here should also be possible. I found it hard to reason about why this constraint is necessary, but once I understood it, it wasn’t too abnormal. The proof itself is easy, though.
lemma4
    :: (Width x ~ Width (BInc x))
    => SBin x
    -> BDec (BInc x) :~: x
lemma4 (SB0 _) = QED
lemma4 (SB1 b) = case lemma4 b of QED -> QED
We don’t need to define a clause for SBEnd since Width 'BEnd ~ Width (BInc 'BEnd) does not hold.
Tying all of this together makes for a relatively easily readable popHeap:
popHeap
    :: (BNonZero b ~ 'True, Ord a)
    => Heap b a -> (a, Heap (BDec b) a)
popHeap (Heap forest0) =
Out of the different candidates, select the one with the minimal root (minimumBy is total on NonEmpty):
    let cutTrees = lumberjack forest0
        selected = minimumBy (comparing cutTreeRoot) cutTrees in
Pop that tree using popForest:
    case popForest selected of
        (x, forest1) -> (x, Heap forest1)
Helper to compare candidates by root:
  where
    cutTreeRoot :: CutTree k b a -> a
    cutTreeRoot (CutTree _ _ (Tree x _) _) = x
In GHCi:
*Main> let heap = pushHeap 'a' $ pushHeap 'b' $ pushHeap 'c' $
pushHeap 'd' $ pushHeap 'e' emptyHeap
*Main> :t heap
heap :: Heap ('B1 ('B0 ('B1 'BEnd))) Char
*Main> :t popHeap heap
popHeap heap :: (Char, Heap ('B0 ('B0 ('B1 'BEnd))) Char)
*Main> fst $ popHeap heap
'a'
*Main> snd $ popHeap heap
(no tree of order 0)
(no tree of order 1)
(tree of order 2)
'b'
'd'
'e'
'c'
Beautiful! Our final interface to deal with the heap looks like this:
emptyHeap :: Heap ('B0 'BEnd) a
pushHeap :: Ord a
    => a -> Heap b a -> Heap (BInc b) a
mergeHeap :: Ord a
    => Heap lb a -> Heap rb a -> Heap (BAdd lb rb) a
popHeap :: (BNonZero b ~ 'True, Ord a)
    => Heap b a -> (a, Heap (BDec b) a)
I would like to thank Alex Lang for many discussions about this and for proofreading, Akio Takano and Fumiaki Kinoshita for some whiteboarding, and Titouan Vervack and Becki Lee for many additional corrections.
I am by no means an expert in dependent types so while GHC can guarantee that my logic is sound, I cannot guarantee that my code is the most elegant or that my explanations are waterproof. In particular, I am a bit worried about the fact that binary numbers do not have unique representations – even though it does seem to make the code a bit simpler. If you have any ideas for improvements, however, feel free to reach out!
Update: Lars Brünjes contacted me and showed me a similar implementation he did for leftist heaps. You can see it in this repository. He uses a similar but unique representation of binary numbers, along the lines of:
data Binary = Zero | StrictlyPositive Positive
data Positive = B1End | B0 Positive | B1 Positive
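To see why this representation is unique, note that a Positive number always terminates in B1End, its most significant 1, so no number can carry redundant leading zeros. A hypothetical conversion to Int (my own sketch, not code from the repository) makes this reading explicit; the datatypes are repeated for completeness:

```haskell
data Binary = Zero | StrictlyPositive Positive
data Positive = B1End | B0 Positive | B1 Positive

-- Digits are stored least significant first; B1End is the final,
-- most significant 1, ruling out redundant leading zeros.
fromBinary :: Binary -> Int
fromBinary Zero                 = 0
fromBinary (StrictlyPositive p) = go p
  where
    go B1End  = 1
    go (B0 q) = 2 * go q
    go (B1 q) = 1 + 2 * go q
```

For example, fromBinary (StrictlyPositive (B0 (B1 B1End))) evaluates to 6.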
I think this is actually more elegant than the representation I used. The only disadvantage is that it is a bit less concise (which is somewhat relevant for a blogpost), requiring two functions and two datatypes for most cases (e.g. a Forest k Binary and a PForest k Positive, with mergeForests and mergePForests, and so on). But if you wanted to use this idea in a real implementation, I encourage you to check that out.
Since we represent the proofs at runtime, we incur an overhead in two ways: the proofs require extra computation when we evaluate them, and we pay a memory cost for storing evidence such as the QED constructor. It should be possible to remove these at runtime once the code has been typechecked, possibly using some sort of GHC core or source plugin (or CPP in a darker universe).
Another existing issue is that the spine of the heap is never “cleaned up”: we never remove trailing F0 constructors. This means that if you fill a heap with eight elements and remove all of them again, you will end up with a heap with zero elements that has the shape 'B0 ('B0 ('B0 ('B0 'BEnd))) rather than 'B0 'BEnd. However, this sufficed for my use case. It should be possible to add and prove a cleanup step, but it’s a bit outside the scope of this blogpost.
instance forall a b. Show a => Show (Heap b a) where
    show = intercalate "\n" . goTrees 0 . unHeap
      where
        goTrees :: forall m c. Show a => Int -> Forest m c a -> [String]
        goTrees _ FEnd = []
        goTrees order (F0 trees) =
            ("(no tree of order " ++ show order ++ ")") :
            goTrees (order + 1) trees
        goTrees order (F1 tree trees) =
            ("(tree of order " ++ show order ++ ")") :
            goTree " " tree ++
            goTrees (order + 1) trees

        goTree :: forall m. String -> Tree m a -> [String]
        goTree indentation (Tree x children) =
            (indentation ++ show x) :
            goChildren (' ' : indentation) children

        goChildren :: forall m. String -> Children m a -> [String]
        goChildren _ CZero = []
        goChildren indentation (CCons x xs) =
            goTree indentation x ++ goChildren indentation xs
Increment gets tricky mainly because we need some way to communicate the carry back in a right-to-left direction. We can do this with a type-level Either and some utility functions. It’s not too far from what we would write on the term level, but again, a bit more clunky. We avoid this kind of clunkiness where we can, since having significantly more code obviously requires significantly more proving.
type family BIncLTR (b :: Binary) :: Binary where
BIncLTR b = FromRight 'B1 (Carry b)
type family Carry (b :: Binary) :: Either Binary Binary where
Carry ('B1 'BEnd) = 'Left ('B0 'BEnd)
Carry ('B0 'BEnd) = 'Right ('B1 'BEnd)
Carry ('B0 b) = 'Right (UnEither 'B1 'B0 (Carry b))
Carry ('B1 b) = MapEither 'B0 'B1 (Carry b)
type family MapEither
        (f :: a -> c) (g :: b -> d) (e :: Either a b) :: Either c d where
    MapEither f _ ('Left x) = 'Left (f x)
    MapEither _ g ('Right y) = 'Right (g y)
type family UnEither
        (f :: a -> c) (g :: b -> c) (e :: Either a b) :: c where
    UnEither f _ ('Left x) = f x
    UnEither _ g ('Right y) = g y

type family FromRight (f :: a -> b) (e :: Either a b) :: b where
    FromRight f ('Left x) = f x
    FromRight _ ('Right y) = y
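A term-level sketch of the same idea may help (the helper names are made up for illustration; bits are Bools with True meaning 1, and I read the outermost digit as the most significant one here): the recursive call returns an Either, and a Left result signals that a carry still has to be absorbed by the caller.

```haskell
-- carry returns Right r if the increment finished inside the number,
-- or Left r if a carry still needs to be prepended by the caller.
carry :: [Bool] -> Either [Bool] [Bool]
carry [True]      = Left [False]
carry [False]     = Right [True]
carry (False : b) = Right (either (True :) (False :) (carry b))
carry (True : b)  = either (Left . (False :)) (Right . (True :)) (carry b)
carry []          = Right [True]   -- incrementing the empty number gives one

-- Resolve a leftover carry by prepending a new most significant 1.
bincLTR :: [Bool] -> [Bool]
bincLTR = either (True :) id . carry
```

For example, bincLTR [True, True] (three) evaluates to [True, False, False] (four), with the carry propagating back out through both Left results.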
For work, I recently put together an interpreter for a lambda calculus that was way faster than I expected it to be – around 30 times as fast. I suspected this meant that something was broken, so in order to convince myself of its correctness, I wrote a well-typed version of it in the style of Francesco’s well-typed suspension calculus blogpost. It used a standard length-indexed list which had the unfortunate side effect of pushing me into O(n) territory for random access. I started looking for an asymptotically faster way to do this, which is how I ended up looking at heaps. In this blogpost, I am using the binomial heap as a priority queue rather than a bastardized random access skip list since that is what readers are presumably more familiar with.↩︎
For reasons that will become clear later on, the binary numbers that pop up on the type level should be read right-to-left. A palindrome was chosen as example here to avoid having to explain that at this point.↩︎
This type and related utilities are found in Data.Type.Equality, but redefined here for educational purposes.↩︎
The datatype in Data.Type.Equality allows equality between heterogeneous kinds as well, but we don’t need that here. This saves us from having to toggle on the “scary” {-# LANGUAGE TypeInType #-}.↩︎
I’m not sure if it is actually impossible to use this simpler type, but I did not succeed in finding a proof that uses this simpler type.↩︎