jaspervdj - All posts

Lazy Layout

Jasper Van der Jeugt — Sat, 22 Jul 2023 00:00:00 UT

Prelude

This blogpost is written in reproducible Literate Haskell, so we need some imports first.


Haskell Puzzles

Jasper Van der Jeugt — Mon, 19 Jun 2023 00:00:00 UT

At ZuriHac 2023, I worked on some Haskell Puzzles together with Alex, Francesco, Jussi and Patrick. You are given some Haskell tokens and a goal term, and you need to rearrange the tokens into an expression that produces the goal.

Here is a simple warmup exercise just to get an idea of how it works:

Let’s continue with iterate but do a real puzzle next. What Monad are we looking for?

How is e defined again?

let allows us to reuse code. This is very useful if you don’t have enough tokens to make a sensible program.

Here is the final puzzle – but how can we produce a string from a bunch of numbers?

Haskell evaluation powered by tryhaskell.org. UI powered by some messy JavaScript.

We playtested these puzzles with simple pieces of paper during the event. At the final presentation we played a web-based multiplayer version where each player controls only one token.

We actually found the single player to be more fun, and since we already had some client code I decided to clean it up a bit and make a single player version available here.

Thanks for playing!

Lazy Sort: Counting Comparisons

Jasper Van der Jeugt — Thu, 17 Sep 2020 00:00:00 UT

Introduction

Haskell’s laziness allows you to do many cool things. I’ve talked about searching an infinite graph before. Another commonly mentioned example is finding the smallest N items in a list.

Because programmers are lazy as well, this is often defined as:

smallestN_lazy :: Ord a => Int -> [a] -> [a]
smallestN_lazy n = take n . sort

This happens regardless of the language of choice if we’re confident that the list will not be too large. It’s more important to be correct than it is to be fast.

However, in strict languages we’re really sorting the entire list before taking the first N items. We can implement this in Haskell by forcing the length of the sorted list.

smallestN_strict :: Ord a => Int -> [a] -> [a]
smallestN_strict n l0 = let l1 = sort l0 in length l1 `seq` take n l1

If you’re at least somewhat familiar with the concept of laziness, you may intuitively realize that the lazy version of smallestN is much better since it’ll only sort as far as it needs.

But how much better does it actually do, with Haskell’s default sort?

A better algorithm?

For the sake of the comparison, we can introduce a third algorithm, which does a slightly smarter thing by keeping a heap of the smallest elements it has seen so far. This code is far more complex than smallestN_lazy, so if it performs better, we should still ask ourselves if the additional complexity is worth it.

smallestN_smart :: Ord a => Int -> [a] -> [a]
smallestN_smart maxSize list = do
    (item, n) <- Map.toList heap
    replicate n item
  where
    -- A heap is a map of the item to how many times it occurs in
    -- the heap, like a frequency counter.  We also keep the current
    -- total count of the heap.
    heap = fst $ foldl' (\acc x -> insert x acc) (Map.empty, 0) list
    insert x (heap0, count)
        | count < maxSize = (Map.insertWith (+) x 1 heap0, count + 1)
        | otherwise = case Map.maxViewWithKey heap0 of
            Nothing -> (Map.insertWith (+) x 1 heap0, count + 1)
            Just ((y, yn), _) -> case compare x y of
                EQ -> (heap0, count)
                GT -> (heap0, count)
                LT ->
                    let heap1 = Map.insertWith (+) x 1 heap0 in
                    if yn > 1
                        then (Map.insert y (yn - 1) heap1, count)
                        else (Map.delete y heap1, count)

So, we get to the main trick I wanted to talk about: how do we benchmark this, and can we add unit tests to confirm these benchmark results in CI? Benchmark execution times are very fickle. Instruction counting is awesome but perhaps a little overkill.

Instead, we can just count the number of comparisons.

Counting comparisons

We can use a new type that holds a value and a number of ticks. We can increase the number of ticks, and also read the ticks that have occurred.

data Ticks a = Ticks {ref :: !(IORef Int), unTicks :: !a}

mkTicks :: a -> IO (Ticks a)
mkTicks x = Ticks <$> IORef.newIORef 0 <*> pure x

tick :: Ticks a -> IO ()
tick t = IORef.atomicModifyIORef' (ref t) $ \i -> (i + 1, ())

ticks :: Ticks a -> IO Int
ticks = IORef.readIORef . ref

smallestN has an Ord constraint, so if we want to count the number of comparisons we’ll want to do that for both == and compare.

instance Eq a => Eq (Ticks a) where
    (==) = tick2 (==)

instance Ord a => Ord (Ticks a) where
    compare = tick2 compare

The actual ticking code goes in tick2, which applies a binary operation and increases the counters of both arguments. We need unsafePerformIO for that but it’s fine since this lives only in our testing code and not our actual smallestN implementation.

tick2 :: (a -> a -> b) -> Ticks a -> Ticks a -> b
tick2 f t1 t2 = unsafePerformIO $ do
    tick t1
    tick t2
    pure $ f (unTicks t1) (unTicks t2)
{-# NOINLINE tick2 #-}

Results

Let’s add some benchmarking that prints an ad-hoc CSV:

main :: IO ()
main = do
    let listSize = 100000
        impls = [smallestN_strict, smallestN_lazy, smallestN_smart]
    forM_ [50, 100 .. 2000] $ \sampleSize -> do
        l <- replicateM listSize randomIO :: IO [Int]
        (nticks, results) <- fmap unzip $ forM impls $ \f -> do
            l1 <- traverse mkTicks l
            let !r1 = sum . map unTicks $ f sampleSize l1
            t1 <- sum <$> traverse ticks l1
            pure (t1, r1)
        unless (equal results) . fail $
            "Different results: " ++ show results
        putStrLn . intercalate "," . map show $ sampleSize : nticks

Plug that CSV into a spreadsheet and we get this graph. What conclusions can we draw?

Clearly, both the lazy version as well as the “smart” version are able to avoid a large number of comparisons. Let’s remove the strict version so we can zoom in.

What does this mean?

If the sampleSize is small, the heap implementation does fewer comparisons. This makes sense: even if we treat sort as a black box, and don’t look at its implementation, we can assume that it is not optimally lazy; so it will always sort “a bit too much”.
As sampleSize gets bigger, the insertion into the bigger and bigger heap starts to matter more and more and eventually the naive lazy implementation is faster!
Laziness is awesome and take N . sort is absolutely the first implementation you should write, even if you replace it with a more efficient version later.
Code where you count a number of calls is very easy to do in a test suite. It doesn’t pollute the application code if we can patch in counting through a typeclass (Ord in this case).
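
The last point can be sketched as a check that a CI test suite could run. This is a minimal sketch, not the post's actual test code: it repeats the Ticks idea so it stands alone, and the names countTicks, testInput, lazyImpl, strictImpl and lazyBeatsStrict are hypothetical. The input comes from a simple linear congruential generator so the run is reproducible without extra packages.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import Data.List (sort)
import System.IO.Unsafe (unsafePerformIO)

-- Same shape as the Ticks type in the post, repeated so the sketch stands alone.
data Ticks a = Ticks {ref :: !(IORef Int), unTicks :: !a}

instance Eq a => Eq (Ticks a) where
    (==) = tick2 (==)

instance Ord a => Ord (Ticks a) where
    compare = tick2 compare

-- Apply a binary operation and bump the counters of both arguments.
tick2 :: (a -> a -> b) -> Ticks a -> Ticks a -> b
tick2 f t1 t2 = unsafePerformIO $ do
    bump t1
    bump t2
    pure $ f (unTicks t1) (unTicks t2)
  where
    bump t = atomicModifyIORef' (ref t) $ \i -> (i + 1, ())
{-# NOINLINE tick2 #-}

-- Hypothetical helper: total number of ticks an implementation causes.
countTicks :: ([Ticks Int] -> [Ticks Int]) -> [Int] -> IO Int
countTicks impl xs = do
    ts <- traverse (\x -> Ticks <$> newIORef 0 <*> pure x) xs
    -- Force the whole result so every comparison the result needs has happened.
    _ <- evaluate (sum (map unTicks (impl ts)))
    sum <$> traverse (readIORef . ref) ts

-- Deterministic pseudo-random input (an assumption of this sketch).
testInput :: [Int]
testInput =
    take 1000 (iterate (\x -> (x * 1103515245 + 12345) `mod` 2147483648) 42)

lazyImpl, strictImpl :: Ord a => Int -> [a] -> [a]
lazyImpl n = take n . sort
strictImpl n l0 = let l1 = sort l0 in length l1 `seq` take n l1

-- The property a test could assert: laziness saves comparisons.
lazyBeatsStrict :: IO Bool
lazyBeatsStrict = do
    lazyTicks   <- countTicks (lazyImpl 10) testInput
    strictTicks <- countTicks (strictImpl 10) testInput
    pure (lazyTicks < strictTicks)
```

Asserting a relation between two counts, rather than an exact number, keeps such a test from breaking whenever the internals of sort change.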

Can we say something about the complexity?

The complexity of smallestN_smart is basically inserting into a heap listSize times. This gives us O(listSize * log(sampleSize)).

That is of course the worst case complexity, which only occurs in the special case where we need to insert into the heap at each step. That’s only true when the list is sorted, so for a random list the average complexity will be a lot better.
The complexity of smallestN_lazy is far harder to reason about. Intuitively, and with the information that Data.List.sort is a merge sort, I came to something like O(listSize * max(sampleSize, log(listSize))). I’m not sure if this is correct, and the case with a random list seems to be faster.

I would be very interested in knowing the actual complexity of the lazy version, so if you have any insights, be sure to let me know!

Update: Edward Kmett corrected me: the complexity of smallestN_lazy is actually O(listSize * min(sampleSize, listSize)), with O(listSize * min(sampleSize, log(listSize))) in expectation for a random list.

Thanks to Huw Campbell for pointing out a bug in the implementation of smallestN_smart – this is now fixed in the code above.

Appendix

Helper function: check if all elements in a list are equal.

equal :: Eq a => [a] -> Bool
equal (x : y : zs) = x == y && equal (y : zs)
equal _            = True

Photoessay: Pilatus

Jasper Van der Jeugt — Wed, 19 Aug 2020 00:00:00 UT

Now that we’re in a global pandemic, I’ve been doing significantly more hiking in Switzerland. Around a month ago we climbed the Pilatus mountain near Lucerne.

Unfortunately, there was a thick layer of clouds so we didn’t get the amazing panoramic views from the top. On the flip side, this really works well with black-and-white photography: the clouds add a lot of drama and character. This is a selection of six photographs.

This is the view from where we got off the gondola, looking towards the top. As you can see, it’s possible to take gondolas all the way up as well.

I did the climb together with three friends. We actually went there with a larger group but split up near the start as some people wanted to do a longer hike and decided to get off the gondola earlier.

The first part was mostly easy and led us through some woods.

After that the ascent was very steep. We took a lunch break near the Klimsenkapelle chapel, visible in the background.

This is another view looking up towards the top: after the lunch break, there was only one part of the climb remaining, but it looked very intimidating.

This final photograph is looking back to the Klimsenkapelle. It was taken shortly before we made it to the peak.

Visual Arrow Syntax

Jasper Van der Jeugt — Thu, 12 Mar 2020 00:00:00 UT

Not to be taken seriously.

Haskell is great at building DSLs – which are perhaps the ultimate form of slacking off at work. Rather than actually doing the work your manager tells you to, you can build DSLs to delegate this back to your manager so you can focus on finally writing up that GHC proposal for MultilinePostfixTypeOperators (which could have come in useful for this blogpost).

So, we’ll build a visual DSL that’s so simple even your manager can use it! This blogpost is a literate Haskell file so you can run it directly in GHCi. Note that some code is located in a second module because of compilation stage restrictions.

Let’s get started. We’ll need a few language extensions – not too much, just enough to guarantee job security for the foreseeable future.

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
module Visual where

And then some imports, not much going on here.

import qualified Codec.Picture as JP
import qualified Codec.Picture.Types as JP
import Control.Arrow
import Control.Category
import Control.Monad.ST (runST)
import Data.Char (isUpper)
import Data.Foldable (for_)
import Data.List (sort, partition)
import qualified Language.Haskell.TH as TH
import Prelude hiding (id, (.))

All Haskell tutorials that use some form of dependent typing seem to start with the HList type. So I suppose we’ll do that as well.

data HList (things :: [*]) where
  Nil  :: HList '[]
  Cons :: x -> HList xs -> HList (x ': xs)

I think HList is short for hype list. There’s a lot of hype around this because it allows you to put even more types in your types.

We’ll require two auxiliary functions for our hype list. Because of all the hype, they each require a type family in order for us to even express their types. The first one just takes the last element from a list.

hlast :: HList (thing ': things) -> Last (thing ': things)
hlast (Cons x Nil)         = x
hlast (Cons _ (Cons y zs)) = hlast (Cons y zs)

type family Last (l :: [*]) :: * where
  Last (x ': '[]) = x
  Last (x ': xs)  = Last xs

Readers may wonder if this is safe, since last is usually a partial function. Well, it turns out that partial functions are safe if you type them using partial type families. So one takeaway is that partial functions can just be fixed by adding more partial stuff on top. This explains things like Prelude.

Anyway, the second auxiliary function drops the last element from a list.

hinit :: HList (thing ': things) -> HList (Init (thing ': things))
hinit (Cons _ Nil)         = Nil
hinit (Cons x (Cons y zs)) = Cons x (hinit (Cons y zs))

type family Init (l :: [*]) :: [*] where
  Init (_ ': '[])     = '[]
  Init (x ': y ': zs) = x ': Init (y ': zs)
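
To see the two helpers in action, here is a small self-contained sketch. It repeats the definitions above with two cosmetic assumptions: Type from Data.Kind instead of *, and a named variable instead of the wildcard in Init. The values sample, lastOfSample and initOfSample are made up for illustration.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
import Data.Kind (Type)

data HList (things :: [Type]) where
  Nil  :: HList '[]
  Cons :: x -> HList xs -> HList (x ': xs)

type family Last (l :: [Type]) :: Type where
  Last (x ': '[]) = x
  Last (x ': xs)  = Last xs

type family Init (l :: [Type]) :: [Type] where
  Init (x ': '[])     = '[]
  Init (x ': y ': zs) = x ': Init (y ': zs)

hlast :: HList (thing ': things) -> Last (thing ': things)
hlast (Cons x Nil)         = x
hlast (Cons _ (Cons y zs)) = hlast (Cons y zs)

hinit :: HList (thing ': things) -> HList (Init (thing ': things))
hinit (Cons _ Nil)         = Nil
hinit (Cons x (Cons y zs)) = Cons x (hinit (Cons y zs))

-- A three-element hype list with three different element types.
sample :: HList '[Int, Bool, String]
sample = Cons 1 (Cons True (Cons "hype" Nil))

-- Last '[Int, Bool, String] reduces to String.
lastOfSample :: String
lastOfSample = hlast sample

-- Init '[Int, Bool, String] reduces to '[Int, Bool], so this match is total.
initOfSample :: (Int, Bool)
initOfSample = case hinit sample of
  Cons n (Cons b Nil) -> (n, b)
```

Calling hlast or hinit on Nil is rejected at compile time, because the argument type requires at least one element.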

And that’s enough boilerplate! Let’s get right to it.

It’s always good to pretend that your DSL is built on solid foundations. As I alluded to in the title, we’ll pick Arrows. One reason for that is that they’re easier to explain to your manager than Applicative (stuff goes in, other stuff comes out, see? They’re like the coffee machine in the hallway). Secondly, they are less powerful than Monads and we prefer to keep that good stuff to ourselves.

Unfortunately, it seems like the Arrow module was contributed by an operator fetishism cult, and anyone who’s ever done non-trivial work with Arrows now has a weekly therapy session to talk about how &&& and *** hurt them.

This is not syntax we want anyone to use. Instead, we’ll, erm, slightly bend Haskell’s syntax to get something that is “much nicer” and “definitely not an abomination”.

We’ll build something that appeals to both Category Theorists (for street cred) and Corporate Managers (for our bonus). These two groups have many things in common. Apart from talking a lot about abstract nonsense and getting paid for it, both love drawing boxes and arrows.

Yeah, so I guess we can call this visual DSL a Diagram. The main drawback of arrows is that they can only have a single input and output. This leads to a lot of tuple abuse.
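
For readers who have been spared so far, this is roughly what the tuple abuse looks like with the plain function arrow. It is a small sketch using only Control.Arrow from base. The names mean and range are made up for illustration.

```haskell
import Control.Arrow (second, (&&&), (>>>))

-- Fan the input out into a tuple, patch up one half, then collapse it again.
mean :: [Double] -> Double
mean = (sum &&& length) >>> second fromIntegral >>> uncurry (/)

-- Same trick: two results travel together as a tuple until they are combined.
range :: [Int] -> Int
range = (maximum &&& minimum) >>> uncurry (-)
```

Here mean [1, 2, 3, 4] evaluates to 2.5 and range [3, 9, 4] to 6. Every intermediate value that needs to survive a step has to be carried along in a tuple.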

We’ll “fix” that by having extra ins and outs. We are wrapping an arbitrary Arrow, referred to as f in the signature:

data Diagram (ins :: [*]) (outs :: [*]) f a b where

We can create a diagram from a normal arrow, that’s easy.

  Diagram :: f a b -> Diagram '[] '[] f a b

And we can add another normal function at the back. No biggie.

  Then
    :: Diagram ins outs f a b -> f b c
    -> Diagram ins outs f a c

Of course, we need to be able to use our extra input and outputs. Output wraps an existing Diagram and redirects the second element of a tuple to the outs; and Input does it the other way around.

  Output
    :: Diagram ins outs f a (b, o)
    -> Diagram ins (o ': outs) f a b

  Input
    :: Diagram ins outs f a b
    -> Diagram (i ': ins) outs f a (b, i)

The hardest part is connecting two existing diagrams. This is really where the magic happens:

  Below
    :: Diagram ins1 outs1 f a b
    -> Diagram (Init (b ': outs1)) outs2 f (Last (b ': outs1)) c
    -> Diagram ins1 outs2 f a c

Is this correct? What does it even mean? The answer to both questions is: “I don’t know”. It typechecks, which is what really matters when you’re doing Haskell. And there’s something about ins matching outs in there, yeah.

Concerned readers of this blog may at this point be wondering why we used reasonable names for the constructors of Diagram rather than just operators.

Well, it’s only because it’s a GADT which makes this impossible. But fear not, we can claim our operators back. Shout out to Unicode’s Box-drawing characters: they provide various characters with thick and thin lines. This lets us do an, uhm, super intuitive syntax where tuples are taken apart as extra inputs/outputs, or reified back into tuples.

(━►)   = Then
l ┭► r = Output l ━► r
l ┳► r = (l ━► arr (\x -> (x, x))) ┭► r
l ┶► r = Input l ━► r
l ╆► r = Output (Input l ━► arr (\x -> (x, x))) ━► r
l ┳ c  = l ┳► arr (const c)
l ┓ r  = Below l r
l ┧ r  = Input l ┓ r
l ┃ r  = Input l ━► arr snd ┓ r
infixl 5 ━►, ┳►, ┭►, ┶►, ╆►, ┳
infixr 4 ┓, ┧, ┃

Finally, while we’re at it, we’ll also include an operator to clearly indicate to our manager how our valuation will change if we adopt this DSL.

(📈) = Diagram

This lets us do the basics. If we start from regular Arrow syntax:

horribleExample01 =
  partition isUpper >>> reverse *** sort >>> uncurry mappend

We can now turn this into:

amazingExample01 =
 (📈) (partition isUpper)┭►reverse┓
 (📈)                   sort      ┶►(uncurry mappend)

The trick to decrypting these diagrams is that each line in the source code consists of an arrow where values flow from the left to the right; with possible extra inputs and outputs in between. These lines are then composed using a few operators that use Below such as ┓ and ┧.

To improve readability even further, it should also be possible to add right-to-left and top-to-bottom operators. I asked my manager if they wanted these extra operators but they’ve been ignoring all my Slack messages since I showed them my original prototype. Probably just busy?

Anyway, there are other simple improvements we can make to the visual DSL first. Most Haskellers prefer nicely aligning things over producing working code, so it would be nice if we could draw longer lines like ━━━━┳━► rather than just ┳►. And any Haskeller worth their salt will tell you that this is where Template Haskell comes in.

Template Haskell gets a bad rep, but that’s only because it is mostly misused. Originally, it was designed to avoid copying and pasting a lot of code, which is exactly what we’ll do here. Nothing to be grossed out about.

extensions :: Maybe Char -> String -> Maybe Char -> [String]
extensions mbLeft operator mbRight =
  [operator] >>= maybe pure goR mbRight >>= maybe pure goL mbLeft
 where
  goL l op = [replicate n l ++ op | n <- [1 .. 19]]
  goR r op = [init op ++ replicate n r ++ [last op] | n <- [1 .. 19]]

industryStandardBoilerplate
  :: Maybe Char -> TH.Name -> Maybe Char -> TH.Q [TH.Dec]
industryStandardBoilerplate l name r = do
  sig <- TH.reify name >>= \case
    TH.VarI _ sig _ -> pure sig
    _               -> fail "no info"
  fixity <- TH.reifyFixity name >>= maybe (fail "no fixity") pure
  pure
    [ decl
    | name' <- fmap TH.mkName $ extensions l (TH.nameBase name) r
    , decl  <-
        [ TH.SigD name' sig
        , TH.FunD name' [TH.Clause [] (TH.NormalB (TH.VarE name)) []]
        , TH.InfixD fixity name'
        ]
    ]

We can then invoke this industry standard boilerplate to extend and copy/paste an operator like this:

$(industryStandardBoilerplate (Just '━') '(┭►) (Just '─'))
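
To get a feel for what such a splice generates, the extensions helper can be run on its own, without any Template Haskell. The sketch below copies it verbatim. The name generated is made up for illustration.

```haskell
extensions :: Maybe Char -> String -> Maybe Char -> [String]
extensions mbLeft operator mbRight =
  [operator] >>= maybe pure goR mbRight >>= maybe pure goL mbLeft
 where
  -- Prepend 1 to 19 copies of the left filler character.
  goL l op = [replicate n l ++ op | n <- [1 .. 19]]
  -- Insert 1 to 19 copies of the right filler before the final character.
  goR r op = [init op ++ replicate n r ++ [last op] | n <- [1 .. 19]]

-- The operator names generated for the invocation shown in the post.
generated :: [String]
generated = extensions (Just '━') "┭►" (Just '─')
```

With both fillers given, this yields 19 × 19 = 361 names, starting with ━┭─►. Every generated name has at least one filler character on each side, so the bare ┭► is not in the list and keeps its original definition.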

We’re now equipped to silence even the harshest syntax critics:

example02 =
  (📈) (partition isUpper)━┭─►(reverse)━┓
  (📈)                   (sort)─────────┶━►(uncurry mappend)

Beautiful! If you’ve ever wondered what people mean when they say functional programs “compose elegantly”, well, this is what they mean.

example03 =
  (📈) (+1)━┳━►(+1)━┓
  (📈)      (+1)━━━━╆━►add━┓
  (📈)              add────┶━►add
 where
  add = uncurry (+)

Type inference is excellent and running is easy. In GHCi:

*Main> :t example03
example03 :: Diagram '[] '[] (->) Integer Integer
*Main> run example03 1
12

Let’s look at a more complicated example.

lambda =
  (📈)  (id)━┭─►(subtract 0.5)━┳━━━━━━━━►(< 0)━━━━━━━━━━┓
  (📈)    (subtract 0.5)───────╆━►(add)━►(abs)━►(< 0.1)─┶━━━━━━━►(and)━━━━━━━┓
  (📈)                      (swap)━┭─►(* pi)━━►(sin)┳()                      ┃
  (📈)                           (* 2)──────────────┶━►(sub)━►(abs)━►(< 0.2)─┧
  (📈)                                                                      (or)━►(bool bg fg)
 where
  add = uncurry (+)
  sub = uncurry (-)
  and = uncurry (&&)
  or  = uncurry (||)
  fg  = JP.PixelRGB8 69  58  98
  bg  = JP.PixelRGB8 255 255 255

This renders everyone’s favorite greek letter:

Amazing! Math!

While the example diagrams in this post all use the pure function arrow ->, it is my duty as a Haskeller to note that it is really parametric in f or something. What this means is that thanks to this famous guy called Kleisli, you can immediately start using this with IO in production. Thanks for reading!
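
For the curious, this is what a Kleisli arrow looks like without any diagrams around it. It is a minimal sketch using only Control.Arrow from base, and it uses Maybe rather than IO so that the result is easy to inspect. The names safeRecip and pipeline are made up, and the Diagram type from this post is deliberately left out.

```haskell
import Control.Arrow (Kleisli (..), arr, (>>>))

-- A monadic function wrapped up as an arrow: fails on division by zero.
safeRecip :: Kleisli Maybe Double Double
safeRecip = Kleisli $ \x -> if x == 0 then Nothing else Just (1 / x)

-- Pure steps are lifted with arr and compose with the effectful one.
pipeline :: Kleisli Maybe Double Double
pipeline = arr (subtract 1) >>> safeRecip >>> arr (* 2)
```

runKleisli pipeline 3 gives Just 1.0, while runKleisli pipeline 1 gives Nothing. Since Kleisli m has an Arrow instance for any Monad m, the same composition should work for Kleisli IO.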

Update: CarlHedgren pointed out to me that a similar DSL is provided by Control.Arrow.Needle. However, that package uses Template Haskell to just parse the diagram. In this blogpost, the point of the exercise is to bend Haskell’s syntax and type system to achieve the notation.

Appendix 1: run implementation

The implementation of run uses a helper function that lets us convert a diagram back to a normal Arrow that uses HList to pass extra inputs and outputs:

fromDiagram
  :: Arrow f => Diagram ins outs f a b
  -> f (a, HList ins) (b, HList outs)

We can then have a specialized version for when there’s zero extra inputs and outputs. This greatly simplifies the type signatures and gives us a “normal” f a b:

run :: Arrow f => Diagram '[] '[] f a b -> f a b
run d = id &&& (arr (const Nil)) >>> fromDiagram d >>> arr fst

The definition for fromDiagram is as follows:

fromDiagram (Diagram f) = f *** arr (const Nil)
fromDiagram (Then l r) = fromDiagram l >>> first r
fromDiagram (Output l) =
  fromDiagram l >>> arr (\((x, y), things) -> (x, Cons y things))
fromDiagram (Input l) =
  arr (\(x, Cons a things) -> ((x, things), a)) >>>
  first (fromDiagram l) >>>
  arr (\((y, outs), a) -> ((y, a), outs))
fromDiagram (Below l r) =
  fromDiagram l >>>
  arr (\(x, outs) -> (hlast (Cons x outs), hinit (Cons x outs))) >>>
  fromDiagram r

Appendix 2: some type signatures

We wouldn’t want these to get in our way in the middle of the prose, but GHC complains if we don’t put them somewhere.

(┳►) :: Arrow f => Diagram ins outs f a b -> f b c
     -> Diagram ins (b ': outs) f a c
(┭►) :: Arrow f => Diagram ins outs f a (b, o) -> f b c
     -> Diagram ins (o ': outs) f a c
(┶►) :: Diagram ins outs f a b -> f (b, i) c
     -> Diagram (i ': ins) outs f a c
(╆►) :: Arrow f => Diagram ins outs f a b -> f (b, u) c
     -> Diagram (u ': ins) ((b, u) ': outs) f a c
(┧)  :: Diagram ins1 outs1 f a b
     -> Diagram (Init ((b, u) ': outs1)) outs2 f (Last ((b, u) ': outs1)) c
     -> Diagram (u ': ins1) outs2 f a c

Appendix 3: image rendering boilerplate

This uses a user-supplied Diagram to render an image.

image
  :: Int -> Int
  -> Diagram '[] '[] (->) (Double, Double) JP.PixelRGB8
  -> JP.Image JP.PixelRGB8
image w h diagram = runST $ do
  img <- JP.newMutableImage w h
  for_ [0 .. h - 1] $ \y ->
    for_ [0 .. w - 1] $ \x ->
      let x' = fromIntegral x / fromIntegral (w - 1)
          y' = fromIntegral y / fromIntegral (h - 1) in
      JP.writePixel img x y $ run diagram (x', y')
  JP.freezeImage img

Zero-config MiniDLNA/ReadyMedia

Jasper Van der Jeugt — Wed, 26 Feb 2020 00:00:00 UT

TL;DR:

You can use ReadyMedia without configuring it as a daemon. Just cd into any directory that has media files, and run this script:

#!/bin/bash
set -o nounset -o errexit -o pipefail

# Create temporary locations for the configuration and data directories.
CONFIG="$(mktemp)"
DATADIR="$(mktemp -d)"

# Write the configuration to the temporary location.
echo "media_dir=$PWD" >>"$CONFIG"
echo "db_dir=$DATADIR" >>"$CONFIG"
echo "log_dir=$DATADIR" >>"$CONFIG"
echo 'force_sort_criteria=+upnp:class,+upnp:originalTrackNumber,+dc:title' >>"$CONFIG"
cat "$CONFIG"

# Make sure everything is cleaned up when this process is killed.
function cleanup {
  rm -r "$DATADIR"
  rm "$CONFIG"
}
trap cleanup exit

# Run minidlnad with the following flags:
#
#  -  `-f "$CONFIG"`: use the configuration we wrote.
#  -  `-P "$PWD/minidlnad.pid"`: store the `.pid` in the current directory.
#  -  `-d`: don't daemonize, we'll kill this when we're done.  This also
#     enables "debug" mode; but I haven't seen any considerable slowdown
#     from this.
minidlnad -f "$CONFIG" -P "$PWD/minidlnad.pid" -d

Your television/phone/toaster should see the media server pop up within seconds.

Motivation

In this day and age, there are literally thousands of ways to get video on to your television screen, especially if you have a (somewhat) smart TV. If not, there is a plethora of devices that will let you stream from different sources.

For simply watching video files on my local disk, I used to just hook my laptop up to the television using a simple HDMI cable, which always worked – until the HDMI port on my television broke.

I don’t really want to get any of these devices, and I’m also not sure if I need a newer television that phones home.

In either case, most televisions that support any kind of networking will also support the DLNA protocol. For Linux, there’s ReadyMedia (formerly MiniDLNA), a relatively old project. But despite lacking some maintenance, it is pretty solid and reliable software.

By default, it runs as a daemon that stores a database of media. This makes it very cumbersome to use. The database gets out of sync easily when you move files around when it’s not running. The fact that it’s a daemon means that it could be running when you’re working from a coffee place. The daemon needs to be managed through a file in /etc/.

I don’t want to go through all that pain! I just want to be able to fire it up like you can get a quick HTTP server by just running python -m http.server in any directory. Then I can cd to whatever I want to watch and just run the thing and then kill it. I don’t care about keeping this media database, since scanning a single directory should be quick.

Well, it turns out you can do that fairly easily. Just drop the script I linked to at the top of this post into your $PATH and you’re good to go.

Mandelbrot & Lovejoy's Rain Fractals

Jasper Van der Jeugt — Sat, 04 Jan 2020 00:00:00 UT

Summary

At some point during ICFP2019 in Berlin, I came across a completely unrelated old paper by S. Lovejoy and B. B. Mandelbrot called “Fractal properties of rain, and a fractal model”.

While the model in the paper is primarily meant to model rainfall, the authors explain that it can also be used for rainclouds, since these two phenomena are naturally similarly-shaped. This means it can be used to generate pretty pictures!

While it looked cool at first, it turned out to be an extremely pointless and outdated way to generate pictures like this. But I wanted to write it up anyway since it is important to document failure as well as success: if you’ve found this blogpost searching for an implementation of this paper; well, you have found it, but it probably won’t help you. Here is the GitHub repository.

The good parts

I found this paper very intriguing because it promises a fractal model with a number of very attractive features:

is extremely simple
has easy to understand parameters
is truly self-similar at different scales
has great lacunarity (I must admit I didn’t know this word before going through this paper)

Most excitingly, it’s possible to do a dimension-generic implementation! The code has examples in 2D as well as 3D (xy, time), but can be used without modifications for 4D (xyz, time) and beyond. Haskell’s type system allows capturing the dimension in a type parameter so we don’t need to sacrifice any type safety in the process.

For example, here is the dimension-generic distance function I used with massiv:

distance :: M.Index ix => ix -> ix -> Distance
distance i j = Distance . sqrt .
    fromIntegral .  M.foldlIndex (+) 0 $
    M.liftIndex2 (\p s -> (p - s) * (p - s)) i j

Here is a 3D version:

The (really) bad parts

However, there must be a catch, right? If it has all these amazing properties, why is nobody using it? I didn’t see any existing implementations; and even though I had a very strong suspicion as to why that was the case, I set out to implement it during Munihac 2019.

As I was working on it, the answer quickly became apparent – the algorithm is so slow that its speed cannot even be considered a trade-off; its slowness really cancels out all advantages and then some! Bitcoin may even be a better use of compute resources. The 30 second video clip I embedded earlier took 8 hours to render on a 16-core machine.

This was a bit of a bummer on two fronts: the second one being that I wanted to use this as a vehicle to learn some GPU programming; and it turned out to be a bad fit for GPU programming as well.

At a very high-level, the algorithm repeats the following steps many, many times:

At random, pick a position in (or near) the image.
Pick a size for your circular shape in a way that the probability of the size being larger than P is P⁻¹.
Draw the circular shape onto the image.
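
Step 2 can be sketched with inverse-transform sampling. This is an assumption about how one might do it, not necessarily how the paper or my implementation does it: if u is uniform on (0, 1], then 1 / u is larger than P with probability P⁻¹ for any P ≥ 1. The names sizeFromUniform and tailFraction are made up. An evenly spaced grid stands in for random numbers so that the result is reproducible.

```haskell
-- Turn a uniform sample u from (0, 1] into a shape size.
-- P(size > p) = P(1 / u > p) = P(u < 1 / p) = 1 / p, for p >= 1.
sizeFromUniform :: Double -> Double
sizeFromUniform u = 1 / u

-- Fraction of an evenly spaced grid of u values whose size exceeds p.
tailFraction :: Double -> Double
tailFraction p = fromIntegral (length big) / fromIntegral (length us)
  where
    -- Midpoints of 1000 equal-width bins covering (0, 1).
    us  = [(k - 0.5) / 1000 | k <- [1 .. 1000]] :: [Double]
    big = filter (> p) (map sizeFromUniform us)
```

On this grid, tailFraction 2 is 0.5 and tailFraction 8 is 0.125, matching P⁻¹. It also shows why tiny shapes dominate: half of all samples have a size of at most 2.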

This sounds great for GPU programming; we could generate a large number of images and then just sum them together. However, the probability distribution from step 2 is problematic. Small (≤3x3) shapes are so common that it seems faster to use a CPU (or, you know, 16 CPUs) and just draw that specific region onto a single image.

The paper proposes 3 shapes (which it calls “pulses”). It starts out with just drawing plain opaque circles with a hard edge. This causes some interesting but generally bad-looking edges:

It then switches to using circles with smoothed edges; which looks much better, we’re getting properly puffy clouds here:

Finally, the paper discusses drawing smoothed-out annuli, which dramatically changes the shapes of the clouds:

It’s mildly interesting that the annuli become hollow spheres in 3D.

Thanks to Alexey for massiv and a massive list of suggestions on my implementation!

Partial application using flip

Jasper Van der Jeugt — Tue, 15 Oct 2019 00:00:00 UT

I have been writing Haskell for a reasonable time now – I believe I am coming up on ten years – so sadly the frequency with which I discover delightful things about the language has decreased.

However, I was talking with HVR about the Handle pattern, and the topic of argument order came up. This led me to a neat use case for flip that I hadn’t seen before.

This blogpost should be approachable for beginners, but when you’re completely new to Haskell and some terms are confusing, I would recommend looking at the Type Classes or Learn You a Haskell materials first.

A few extensions are required to show some intermediary results, but – spoiler alert – they turn out to be unnecessary in the end:

{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE FlexibleContexts      #-}

Currying and partial application

In Haskell, it is idiomatic to specify arguments that are unlikely to change in between function calls first.

For example, let’s look at the type of M.insertWith:

import qualified Data.Map as M

M.insertWith
  :: Ord k
  => (a -> a -> a)  -- ^ Merge values
  -> k              -- ^ Key to insert
  -> a              -- ^ Value to insert
  -> M.Map k a      -- ^ Map to insert into
  -> M.Map k a      -- ^ New map

This function allows us to insert an item into a map, or if it’s already there, merge it with an existing element. When we’re doing something related to counting items, we can “specialize” this function by partially applying it to obtain a function which adds a count:

increaseCount
  :: Ord k
  => k            -- ^ Key to increment
  -> Int          -- ^ Amount to increment
  -> M.Map k Int  -- ^ Current count
  -> M.Map k Int  -- ^ New count
increaseCount = M.insertWith (+)

And then we can do things like increaseCount "apples" 4 basket. The extremely succinct definition of increaseCount is only possible because functions in Haskell are always considered curried: every function takes just one argument.
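
A small self-contained sketch of that usage follows. The basket contents and the names basket, addApple and updated are made up for illustration.

```haskell
import qualified Data.Map as M

increaseCount :: Ord k => k -> Int -> M.Map k Int -> M.Map k Int
increaseCount = M.insertWith (+)

basket :: M.Map String Int
basket = M.fromList [("apples", 1), ("pears", 2)]

-- Partial application again: fix the key and the amount, leave the map open.
addApple :: M.Map String Int -> M.Map String Int
addApple = increaseCount "apples" 1

-- An existing key gets merged with (+), a new key is simply inserted.
updated :: [(String, Int)]
updated = M.toList (increaseCount "apples" 4 (increaseCount "kiwis" 1 basket))
```

Here updated is [("apples",5),("kiwis",1),("pears",2)]. Because the map comes last, addApple needs no explicit argument either.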

Sockets, Handles and more

However – there is a second idiomatic aspect of argument ordering. For imperative code, it is common to put the “object” or “handle” first. base itself is rife with examples, and packages like network hold many more:

-- From System.IO
hSetBuffering
  :: Handle -> BufferMode -> IO ()
hGetBuf
  :: Handle -> Ptr a -> Int -> IO Int

-- From Control.Concurrent.Chan
writeChan
  :: Chan a -> a -> IO ()

-- From Control.Concurrent.MVar
modifyMVar
  :: MVar a -> (a -> IO (a, b)) -> IO b

This allows us to easily partially apply functions to a specific “object”, which comes in useful in where clauses:

writeSomeStuff :: Chan String -> IO ()
writeSomeStuff c = do
  write "Tuca"
  write "Bertie"
  write "Speckle"
 where
  write = writeChan c

In addition to that, it allows us to replace the type by a record of functions – as I went over in the handle pattern explanation.
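To make the record-of-functions idea a bit more concrete, here is a minimal sketch. The names Sink, chanSink and collect are hypothetical and only serve as an illustration; they are not taken from the handle pattern post.

```haskell
import Control.Concurrent.Chan (Chan, writeChan)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- A hypothetical "handle" that is just a record of functions.
newtype Sink = Sink { write :: String -> IO () }

-- One implementation is backed by a real channel...
chanSink :: Chan String -> Sink
chanSink c = Sink { write = writeChan c }

-- The field accessor has type Sink -> String -> IO (), so the handle
-- still comes first and can be partially applied in a where clause.
writeSomeStuff :: Sink -> IO ()
writeSomeStuff sink = do
  put "Tuca"
  put "Bertie"
  put "Speckle"
 where
  put = write sink

-- ...and another one collects everything in memory, which is handy
-- for tests. It returns the written strings in order.
collect :: (Sink -> IO ()) -> IO [String]
collect action = do
  ref <- newIORef []
  action (Sink { write = \s -> modifyIORef' ref (s :) })
  reverse <$> readIORef ref
```

Code like writeSomeStuff no longer cares whether it is talking to a Chan or to an in-memory list.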

Specializing top-level handle functions

However, we end up in a bit of a bind when we want to write succinct top-level definitions, like we did with increaseCount. Imagine we have a Handle to our database:

data Handle = Handle

Some mock utility types:

data Tier = Free | Premium
type MemberId = String

And a top-level function to change a member’s plan:

changePlan
  :: Handle
  -> Tier       -- ^ New plan
  -> String     -- ^ Comment
  -> MemberId   -- ^ Member to upgrade
  -> IO ()
changePlan = undefined

If we want a specialized version of this, we need to explicitly name and bind h, which sometimes feels a bit awkward:

halloweenPromo1 :: Handle -> MemberId -> IO ()
halloweenPromo1 h = changePlan h Premium "Halloween 2018 promo"

We sometimes would like to be able to write succinct definitions, such as:

halloweenPromo2 :: Handle -> MemberId -> IO ()
halloweenPromo2 = specialize changePlan Premium "Halloween 2018 promo"

But is this possible? And what would specialize look like?

Since this is a feature that relates to the type system, it is probably unsurprising that, yes, this is possible in Haskell. The concept can be represented as changing a function f to a function g:

class Specialize f g where
  specialize :: f -> g

Of course, a function can be converted to itself:

instance Specialize (a -> b) (a -> b) where
  specialize = id

Furthermore, if a Handle (a below) is the first argument, we can skip that in the converted version and first supply the second argument, namely b. This leads us to the following definition:

instance Specialize (a -> c) f => Specialize (a -> b -> c) (b -> f) where
  specialize f = \b -> specialize (\a -> f a b)

This is a somewhat acceptable solution, but it’s not great:

type errors from incorrect usage of Specialize will be hard to read
AllowAmbiguousInstances may be required to defer instance resolution to the call site of specialize

Again, not show stoppers, but not pleasant either.

Flippin’ partial application

The unpleasantness around specialize is mainly caused by the fact that we need a typeclass to make this work for multiple arguments. Maybe using some sort of combinator can give us a simpler solution?

Because we’re lazy, let’s see if GHC has any ideas – we’ll use Typed holes to get a bit more info rather than doing the work ourselves:

halloweenPromo3 :: Handle -> MemberId -> IO ()
halloweenPromo3 =
  changePlan `_` Premium `_` "Halloween 2018 promo"

We get an error, and some suggestions:

posts/2019-10-15-flip-specialize.lhs:152:18: error:
 • Found hole:
     _ :: (Handle -> Tier -> String -> MemberId -> IO ()) -> Tier -> t0
   Where: ‘t0’ is an ambiguous type variable
 • In the expression: _
   In the first argument of ‘_’, namely ‘changePlan `_` Premium’
   In the expression:
     changePlan `_` Premium `_` "Halloween 2018 promo"
 • Relevant bindings include
     halloweenPromo3 :: Handle -> MemberId -> IO ()
       (bound at posts/2019-10-15-flip-specialize.lhs:151:3)
   Valid hole fits include
     flip :: forall a b c. (a -> b -> c) -> b -> a -> c
       with flip @Handle @Tier @(String -> MemberId -> IO ())
       (imported from ‘Prelude’ at posts/2019-10-15-flip-specialize.lhs:1:1
        (and originally defined in ‘GHC.Base’))
     seq :: forall a b. a -> b -> b
       with seq @(Handle -> Tier -> String -> MemberId -> IO ()) @Tier
       (imported from ‘Prelude’ at posts/2019-10-15-flip-specialize.lhs:1:1
        (and originally defined in ‘GHC.Prim’))
     const :: forall a b. a -> b -> a
       with const @(Handle -> Tier -> String -> MemberId -> IO ()) @Tier
       (imported from ‘Prelude’ at posts/2019-10-15-flip-specialize.lhs:1:1
        (and originally defined in ‘GHC.Base’))
 ...

Wait a minute! flip looks kind of like what we want: its type really converts a function to another function which “skips” the first argument. Is it possible that what we were looking for was really just… the basic function flip?

halloweenPromo4
  :: Handle -> MemberId -> IO ()
halloweenPromo4 =
  changePlan `flip` Premium `flip` "Halloween 2018 promo"

We can make the above pattern a bit cleaner by introducing a new operator:

(/$) :: (a -> b -> c) -> (b -> a -> c)
(/$) = flip

halloweenPromo5 :: Handle -> MemberId -> IO ()
halloweenPromo5 =
  changePlan /$ Premium /$ "Halloween 2018 promo"

Fascinating! I was aware of using flip in this way to skip a single argument (e.g. foldr (flip increaseCount 1)), but, in all the time I’ve been writing Haskell, I hadn’t realized this chained in a usable and nice way.
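For reference, the single-argument use mentioned above can be spelled out as follows. This is a small sketch; countOccurrences is a hypothetical name.

```haskell
import qualified Data.Map as M

increaseCount :: Ord k => k -> Int -> M.Map k Int -> M.Map k Int
increaseCount = M.insertWith (+)

-- flip increaseCount 1 :: Ord k => k -> M.Map k Int -> M.Map k Int
-- The amount (the second argument) is fixed to 1, and what remains
-- is exactly the shape that foldr expects.
countOccurrences :: Ord k => [k] -> M.Map k Int
countOccurrences = foldr (flip increaseCount 1) M.empty
```

For example, countOccurrences "abca" counts 'a' twice and 'b' and 'c' once each.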

In a way, it comes down to reading the type signature of flip in two ways:

```
flip :: (a -> b -> c) -> (b -> a -> c)
```
Convert a function to another function that has the two first arguments flipped. This is the way I am used to reading flip – and also what the name refers to.
```
flip :: (a -> b -> c) -> b -> (a -> c)
```
Partially apply a function to the second argument. After supplying a second argument, we can once again supply a second argument, and so on – yielding an intuitive explanation of the chaining.

It’s also possible to define sibling operators //$, ///$, etc., to “skip” the first N arguments rather than just the first one in a composable way.
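A possible definition of the first sibling could look like the sketch below. The definition of //$ and the label function are assumptions made for illustration, not taken from any package.

```haskell
-- Skip the first argument and supply the second one.
(/$) :: (a -> b -> c) -> b -> a -> c
(/$) = flip

-- Skip the first two arguments and supply the third one.
(//$) :: (a -> b -> c -> d) -> c -> a -> b -> d
(//$) f z x y = f x y z

-- A hypothetical three-argument function to experiment with.
label :: String -> Int -> Bool -> String
label name n loud = name ++ " x" ++ show n ++ (if loud then "!" else "")

-- Supply only the third argument.
loudLabel :: String -> Int -> String
loudLabel = label //$ True

-- Supply the second argument, and then what has become the new
-- second argument.
loudTen :: String -> String
loudTen = label /$ 10 /$ True
```

Both operators get the default fixity (infixl 9) here, which is what makes the chain in loudTen associate to the left.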

Update: Dan Dart pointed out to me that the sibling operators actually exist under the names of -$, --$, etc. in the composition-extra package.

Should I use this everywhere?

… probably not? While it is a mildly interesting trick, unless it becomes a real pain point for you, I see nothing wrong with just writing:

halloweenPromo6 :: Handle -> MemberId -> IO ()
halloweenPromo6 h = changePlan h Premium "Halloween 2018 promo"

The ZuriHac registration system

Jasper Van der Jeugt — Tue, 03 Sep 2019 00:00:00 UT

Introduction

I am one of the organizers of ZuriHac, and last year, we hand-rolled our own registration system for the event in Haskell. This blogpost explains why we decided to go this route, and we dip our toes into its design and implementation just a little bit.

I hope that the second part is especially useful to less experienced Haskellers, since it is a nice example of a small but useful standalone application. In fact, this was more or less an explicit side-purpose of the project: I worked on this together with Charles Till since he’s a nice human being and I like mentoring people in day-to-day practical Haskell code.

In theory, it should also be possible to reuse this system for other events – not too much of it is ZuriHac specific, and it’s all open source.

Why?

Before 2019, ZuriHac registration worked purely based on Google tools and manual labor:

Google Forms for the registration form
Google Groups to contact registrants
Google Sheets to manage the registrants, waitlist, T-Shirt numbers, …

Apart from the fact that the manual labor wasn’t scaling above roughly 300 people, there were a number of practical issues with these tools. The biggest issue was managing the waiting list and cancellations.

You see, ZuriHac is a free event, which means that the barrier to signing up for it is (intentionally and with good reason!) extremely low. Unfortunately, this will always result in a significant number of people who sign up for the event, but do not actually attend. We try compensating for that by overbooking and offering cancellations; but sometimes it turns out to be hard to get people to cancel as well – especially if it’s hard to reach them.

Google Groups is not great for the purpose we’re using it for: first of all, attendees actually need to go and accept the invitation to join the group. Secondly, do you need a Google Account to join? I still don’t know and have seen conflicting information over the years. Anyway, it’s all a bit ad-hoc and confusing.

So one of the goals for the new registration system (in addition to reducing work on our side) was to be able to track participant numbers better and improve communication. We wanted to work with an explicit confirmation that you’re attending the event; or with a downloadable ticket so that we could track how many people downloaded this ¹.

I looked into a few options (eventbrite, eventlama, and others…) but none of these ticked all the boxes. Aside from being free (since we have a limited budget), some features that I wanted were:

complete privacy for our attendees
a custom “confirmation” workflow, or just being able to customize the registration flow in general
and some sort of JSON or CSV export option

With these things in mind, I set out to solve this problem the same way I usually solve problems: write some Haskell code.

How?

The ZuriHac Registration system (zureg) is a “serverless” application that runs on AWS. It was designed to fit almost entirely in the free tier of AWS; which is why I, for example, picked DynamoDB over a database that’s actually nice to use. We used Brendan Hay’s excellent and extensive amazonka libraries to talk to AWS.

The total cost of running this for a year, including during ZuriHac itself, came to 0.61 Swiss Francs, so I would say that worked out well price-wise!

There are two big parts to the application: a fat lambda ² function that provides a number of different endpoints, and a bunch of command line utilities that talk to the different services directly.

All these parts, however, are part of one monolithic codebase which makes it very easy to share code and ensure all behaviour is consistent – globally coherent as some would call it. One big “library” that has well-defined module boundaries and multiple lightweight “executables” is how I like to design applications in Haskell (and other languages).

Building and deploying

First, I’d like to go into how the project is built and compiled. It’s not something I’m proud of, but I do think it makes a good cookbook on how to do things the hard way.

The main hurdle is that we wanted to run our Haskell code on Lambda, since this is much cheaper than using an EC2 instance: the server load is very bursty with long periods (days up to weeks) of complete inactivity.

I wrote a bunch of the zureg code before some Haskell-on-Lambda solutions popped up, so it is all done from scratch – and it’s surprisingly short. However, if I were to start a new project, I would probably use one of these frameworks:

Converting zureg to use one of these frameworks is something I would like to look into at some point, if I find the time. The advantage of doing things from scratch, however, is that it serves the educational purposes of this blogpost very well!

Our entire serverless framework is currently contained in a single 138-line file.

From a bird’s eye view:

We define a docker image that’s based on Amazon Linux – this ensures we’re using the same base operating system and system libraries as Lambda, so our binary will work there.
We compile our code inside a docker container and copy out the resulting executable to the host.
We zip this up together with a python script that just forwards requests to the Haskell process.
We upload this zip to S3 and our cloudformation takes care of setting up the rest of the infrastructure.

I think this current situation is still pretty manageable since the application is so small; but porting it to something nicer like Nix is definitely on the table.

The database

The data model is not too complex. We’re using an event sourcing approach: this means that our source of truth is really an append-only series of events rather than a traditional row in a database that we update. These events are stored as plain JSON, and we can define them in pure Haskell:

lib/Zureg/Model.hs

And then we just have a few handwritten functions in the database module:

lib/Zureg/Database.hs

This gives us a few things for free; most importantly if something goes wrong we can go in and check what events led the user to get into this invalid state.

This code is backed by the eventful and eventful-dynamodb libraries, in addition to some custom queries.
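To give a rough idea of what event sourcing looks like in pure Haskell, here is a minimal sketch. All types and names below are hypothetical simplifications and do not reflect zureg's actual model or the eventful API.

```haskell
import Data.List (foldl')

-- Hypothetical events; the real ones are defined in lib/Zureg/Model.hs.
data Event
  = Registered String  -- Name of the registrant
  | Confirmed
  | Cancelled
  deriving (Eq, Show)

data Status = Unknown | Pending | Attending | NotAttending
  deriving (Eq, Show)

data Registrant = Registrant
  { registrantName   :: Maybe String
  , registrantStatus :: Status
  } deriving (Eq, Show)

-- Apply a single event to the current state.
applyEvent :: Registrant -> Event -> Registrant
applyEvent r (Registered name) =
  r { registrantName = Just name, registrantStatus = Pending }
applyEvent r Confirmed = r { registrantStatus = Attending }
applyEvent r Cancelled = r { registrantStatus = NotAttending }

-- The current state is not stored anywhere: it is a left fold over
-- the append-only list of events.
project :: [Event] -> Registrant
project = foldl' applyEvent (Registrant Nothing Unknown)
```

Because the full list of events is kept, debugging an odd state comes down to replaying a prefix of that list and looking at the intermediate results.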

The lambda

While our admins can interact with the system using the CLI tooling, registrants interact with the system using the webapp. The web application is powered by a fat lambda.

Using this web app, registrants can do a few things:

Register for the event (powered by a huge web 1.0 form using digestive-functors);
View their ticket (including a QR code generated by qrcode);
Confirm their registration;
Cancel their registration.

In addition to these routes used by participants, there’s a route used for ticket scans – which we’ll talk about next.

The scanning

Now that we have participant tickets, we need some way to process them at the event itself.

scanner.js is a small JavaScript tool that does this for us. It uses the device’s webcam to scan QR codes – which is nice because this means we can use either phones, tablets or a laptop to scan tickets at the event; the device just needs a modern browser. It’s built on top of jsQR.

The scanner intentionally doesn’t do much processing – it just displays a full-screen video of the webcam and searches for a QR code using an external library. Once we get a hit for a QR code, we poll the lambda again to retrieve some information (participant name, T-Shirt size) and overlay that on top of the video.

This is useful because now the people working at the registration desk can see, as demonstrated in the image above, that I registered too late and therefore should only pick up a T-Shirt on the second day.

What is next?

There is a lot of room for improvement, but the fact that it had zero technical issues during registration or the event makes me very happy. Off the top of my head, here are some TODOs for next year:

We should have a CRON-style Lambda that handles the waiting list automation even further.
It should be easier for attendees to update their information.

Other than that, there are some non-functional TODOs:

Can we make the build/deploy a bit easier?
Should we port zureg to use one of the existing Haskell-on-Lambda frameworks?
I’m currently using somewhat fancy image scaling to get a sharp scaled up QR image, but this does not work if someone saves it on their phone – we should just do the scaling on the backend.

Any contributions in these areas are of course welcome!

Lastly, there’s the question of whether or not it makes sense for other events to use this. I discussed this briefly with Franz Thoma, one of the organizers of Munihac, who expressed similar gripes about eventbrite.

As it currently stands, zureg is not an off-the-shelf solution and requires some customization for your event – meaning it only really makes sense for Haskell events. On the other hand, there are a few people who prefer doing this over mucking around in settings dashboards that are hugely complicated but still do not provide the necessary customization.

¹ I realize this is a bit creepy, and fortunately it turned out not to be necessary since we could do the custom confirmation flow.
² In serverless terminology, it seems to be common to refer to lambdas that deal with more than one specific endpoint or purpose as “fat lambdas”. I think this distracts from the issue a bit, since it’s more important to focus on how the code works and whether or not you can re-use it rather than how it is deployed – but coming from a functional programming perspective I very much enjoy the sound of “fat lambda”.

Beeraffe

Jasper Van der Jeugt — Wed, 27 Feb 2019 00:00:00 UT

This weekend, I finished a silly little game in PureScript called Beeraffe. You can play it here and view the source code here. In this blogpost, I want to give some more background information on how this game came to be.

Why PureScript?

If I was going to build a game, I knew I wanted it to be web-based – there was no doubt in my mind about this:

It makes it much easier to show it to other people; you can just share a link and they can be playing it in seconds.
It is inherently cross-platform, and with a little bit of extra attention it easily works on smartphones as well.
It is a good sandbox; you don’t need to ask people to install arbitrary executables on their system.

There are of course some downsides to web-based games as well. For me, the main disadvantage is that the dominant language is still JavaScript (which I am not a big fan of, to put it mildly).

Fortunately there are a good number of languages that compile down to JavaScript these days. The two big contenders were Haskell (through GHCJS) and PureScript (I would go as far as calling PureScript a Haskell dialect, since they are so similar).

The big advantage of using GHCJS is that you’re able to run Haskell on the backend and on the frontend, so you can share common code.

However, I wanted to write a simple game without any sort of backend (which, of course, makes it significantly easier to host as well). PureScript produces vastly smaller JavaScript files, and I wanted to learn the language a bit to see how it compares with Haskell, so I decided to give that a try.

I did not consider Elm because it’s a bit further removed from Haskell, and my main focus was still building a game; not learning a new language. I have heard a lot of good things about it though, so maybe that’s what I should try next.

Original inspiration for the game

One of the last games I played was the remake of the masterpiece Katamari Damacy on the Nintendo Switch.

Inspired by Katamari Damacy, I wanted to make a 2D version that had a similar feeling to it. I decided relatively quickly that the core mechanic of the game would be to put different kinds of objects together in bizarre ways, hopefully amusing people along the way.

Putting sprites together

With that in mind, I immediately focused on this core mechanic since I wanted to know whether it could actually be fun or not.

I started by doing a simple exhaustive search over all the ways you can overlay two sprites, minimizing the average colour distance. This worked remarkably well, and I didn’t end up fine-tuning the results much more after that.

It did lead to some performance issues for larger sprites, so I fixed that by mipmapping: for larger sprites, I first do an exhaustive search at a much lower resolution, then I use these results to do a local search in that neighbourhood at higher resolutions. This is not guaranteed to give the best results; but that doesn’t matter too much for this game: we just want a good enough result.
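The game itself is written in PureScript, but the idea behind the search can be sketched in a few lines of Haskell. Everything below is an assumption made for illustration: the sparse sprite representation, the Euclidean colour distance, and the choice to only average over overlapping pixels are not necessarily what Beeraffe does.

```haskell
import           Data.List (minimumBy)
import qualified Data.Map  as M
import           Data.Ord  (comparing)

type Colour = (Double, Double, Double)

-- Hypothetical representation: only the opaque pixels are stored.
type Sprite = M.Map (Int, Int) Colour

-- Euclidean distance between two colours.
distance :: Colour -> Colour -> Double
distance (r1, g1, b1) (r2, g2, b2) =
  sqrt (sq (r1 - r2) + sq (g1 - g2) + sq (b1 - b2))
 where
  sq x = x * x

-- Move a sprite by the given offset.
shift :: (Int, Int) -> Sprite -> Sprite
shift (dx, dy) = M.mapKeys (\(x, y) -> (x + dx, y + dy))

-- Average colour distance over the overlapping pixels, or Nothing if
-- the two sprites do not overlap at all.
cost :: Sprite -> Sprite -> Maybe Double
cost a b
  | M.null overlap = Nothing
  | otherwise      = Just (sum overlap / fromIntegral (M.size overlap))
 where
  overlap = M.intersectionWith distance a b

-- Try every candidate offset for the second sprite, keep the cheapest.
bestOffset :: [(Int, Int)] -> Sprite -> Sprite -> Maybe ((Int, Int), Double)
bestOffset offsets a b =
  case [(o, c) | o <- offsets, Just c <- [cost a (shift o b)]] of
    []         -> Nothing
    candidates -> Just (minimumBy (comparing snd) candidates)
```

Passing the candidate offsets as a list is a deliberate choice in this sketch: an exhaustive search passes all of them, while a coarse-to-fine search would first run on downscaled sprites and then pass only the offsets in the neighbourhood of the coarse result.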

I wanted to also try an approach based on simulated annealing but didn’t get around to it. If someone wants to try this, you’re more than welcome to make a contribution!

At this point, I was getting amusing results, but I wasn’t sure how to make this into a game yet. I didn’t want to make it into an action game, and felt like a puzzle game would fit better. Then, I realized the comedic effect would be even better if I combined the names of the different sprites as well.

This automatically adds a sort of puzzle mechanic to the game as well, since you can now only merge certain objects.

Finding resources

This brought me to the next obstacle – I knew I would need a large number of consistent sprites to use as art in the game. I browsed around opengameart.org for a bit, but did not really find anything promising. I also did not want to pay an artist, because I wanted to keep this a free game, without advertisements and the like.

Then it dawned on me that there already is a great collection of consistent sprites that even come with the names attached to them – emoji! I found the free EmojiOne set and started with that. But when I looked into it a bit, I found this weird snippet in their free licensing info:

3.4 What can’t you do with the JoyPixels/EmojiOne Properties under this agreement?
…
(I) Include properties in open source projects.
…

What nonsense is this? I am allowed to use it in my non-commercial project if I give attribution, but not if I want to have the option to open source my game?

This pissed me off and I started looking for alternatives. At that point, however, I already knew emoji were a good direction so it was easier. I ended up switching to Google’s Noto font. I liked the sprites a little bit less but at least the license made sense.

Making it a real game

At this point I built a demo that simply allowed you to drag around a bunch of different objects and merge them. It was certainly amusing, but it did not really feel like a “game” to me yet. However, I shared this demo with a couple of people and they all really liked it. This was very encouraging.

The next weekend, I tried to turn this into a Tetris- or 2048-like puzzle game, but this ended up being very confusing and not that much fun. Ironically, the non-game was more fun!

So, I decided to go back to that and just add a very simple economy on top of it (buying and selling things) to make it a bit more interesting. After I added that, I was quite happy with the flow of the game.

The rules were still a bit unclear to people I showed it to (what things can you merge together?), so I added the hints at the top of the cards and an interactive tutorial.

Closing thoughts on PureScript

In retrospect, I am happy with PureScript as a language and would recommend it if you’re looking into putting a simple no-backend web-based game together, and you already know Haskell.

There were a few issues I ran into with the language:

I still prefer lazy languages, and this bit me a few times. In particular, I wrote a few monadic recursive functions without being aware of the tailrec package. This caused stack overflows in my code, but I only saw these on my phone, which made it extremely hard to debug.
The error messages that the compiler emits are horrible at times. I feel like this is an area where I could contribute a bunch of code myself, but I’m not sure if I’ll ever have time for that.

There are also a lot of things I like:

Working with the FFI to call JavaScript is seamless and easy.
Halogen is an amazing framework that made building the UI trivial.
Once you figure out how to, the resulting JavaScript is actually very easy to debug using Firefox’s or Chromium’s developer tools.