Published on March 30, 2011 under the tag haskell
What is this
I’ve recently released Hakyll 3, and it seems to have reached a certain form of stability now. The documentation is getting better, especially after Benedict Eastaugh ported the examples from Hakyll 2.x to 3.
However, recently it was brought to my attention that hacking on Hakyll is quite difficult – all source code is relatively clean and well-commented but there is no global overview available anywhere. This is what I hope to fix with this blogpost: I will attempt to give a whirlwind tour of the Hakyll internals, and a high-level overview of how it all works together.
The Hakyll module namespace is divided into two large groups: Hakyll.Core and Hakyll.Web.
Hakyll is essentially a compilation system (from this perspective, it looks a little like the Makefiles we all love and hate). On the other hand, it is mostly aimed at creating static websites. So, the distinction is pretty simple:
- Hakyll.Core: everything related to the compilation system and dependency management;
- Hakyll.Web: everything related to web sites.
An important constraint I imposed on myself is that a module located in Hakyll.Core can absolutely never depend on a module in Hakyll.Web.
This is what Hakyll.Core looks like from a high-level point of view (the arrows represent “using” relations):
Apart from having the coolest name, the Compiler module is probably also the central module in Hakyll (for future reference, when I say Module, I usually mean Hakyll.Core.Module). It exposes the Compiler a b arrow, which is, from a high-level point of view, composed out of two things:
- A (Kleisli) arrow from a to b: this is what will actually produce whatever you want. This could, for example, produce a website page from a markdown file.
- A function producing a dependency set, which lists all dependencies used in the arrow mentioned above.
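To make the two components concrete, here is a minimal sketch of a Compiler-like arrow. It is a simplified model and not Hakyll's actual definition: the Identifier synonym, the field names, the use of IO and the require helper are all assumptions made for illustration.

```haskell
import Prelude hiding (id, (.))
import Control.Arrow (Arrow (..), (>>>))
import Control.Category (Category (..))

-- Hypothetical stand-in for Hakyll's identifier type.
type Identifier = String

-- Simplified model: a dependency list plus a Kleisli arrow in IO.
data Compiler a b = Compiler
    { compilerDeps :: [Identifier]  -- known without running the job
    , compilerJob  :: a -> IO b     -- the actual work
    }

instance Category Compiler where
    id = Compiler [] return
    -- Composition concatenates dependencies and chains the jobs.
    Compiler d2 g . Compiler d1 f = Compiler (d1 ++ d2) (\x -> f x >>= g)

instance Arrow Compiler where
    -- Pure functions have no dependencies.
    arr f = Compiler [] (return . f)
    first (Compiler d f) = Compiler d $ \(x, y) -> do
        x' <- f x
        return (x', y)

-- Hypothetical primitive: declare a dependency and pretend to load it.
require :: Identifier -> Compiler a String
require i = Compiler [i] (\_ -> return ("<contents of " ++ i ++ ">"))

-- A small pipeline: its dependencies can be read off without running it.
example :: Compiler () Int
example = require "templates/default.html" >>> arr length
```

The point of this design is that compilerDeps example can be inspected before any job is run, which is what makes the dependency analysis possible.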
The Run module can be called “the runtime system” of Hakyll. It is the module which actually runs a Compiler. Running happens in two phases:
- We run the dependency functions for all compilers. From this, we can infer a dependency graph.
- Using this information, we now run the necessary compiler arrows (i.e. the items which are out-of-date).
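The two phases can be sketched roughly as follows. This is only an illustration of the idea, not Hakyll's dependency analyzer: the Graph representation, the outOfDate function and the example identifiers are all assumptions.

```haskell
import Data.List (nub)

-- Hypothetical stand-in for Hakyll's identifier type.
type Identifier = String

-- Result of phase one: each item paired with the items it depends on.
type Graph = [(Identifier, [Identifier])]

-- An item is out of date when it was modified itself, or when it
-- (transitively) depends on something that is out of date.
outOfDate :: Graph -> [Identifier] -> [Identifier]
outOfDate graph modified = go (nub modified)
  where
    go dirty
        | dirty' == dirty = dirty
        | otherwise       = go dirty'
      where
        dirty' = nub (dirty ++ [i | (i, deps) <- graph, any (`elem` dirty) deps])

exampleGraph :: Graph
exampleGraph =
    [ ("index.html", ["posts/a.md", "templates/default.html"])
    , ("posts/a.md", ["templates/post.html"])
    , ("about.md",   ["templates/default.html"])
    ]

-- Phase two would only run the compilers for these items.
stale :: [Identifier]
stale = outOfDate exampleGraph ["templates/post.html"]
```

In this example, changing templates/post.html marks posts/a.md and index.html as out of date, while about.md is left alone.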
Other modules in Hakyll.Core
Those are not the only modules in Hakyll.Core. A quick listing of some other modules:
- Hakyll.Core.Identifier exports the Identifier type which is used to globally identify different compilers;
- Hakyll.Core.DirectedGraph implements a data structure used for dependency analysis. However, in the future, this might move to Hakyll.Core.DependencyAnalyzer or even a separate package;
- Hakyll.Core.ResourceProvider provides an interface for basically reading files from the disk. But it is written as an abstract data type so one could add, for example, a backend which allows Hakyll to fetch data from a SQL database;
- Hakyll.Core.Rules exports a monad for the rules DSL. This DSL is quite simple, it’s basically a Writer monad which yields a list of compilers;
- Hakyll.Core.Routes contains the types and functions used for routing. This is a very simple module for now, although extensions have been suggested;
- Hakyll.Core.Configuration exports the global HakyllConfiguration. This is a relatively small configuration data type, which I think is a good thing;
- Hakyll.Core.Store implements a simple key/value store which allows you to save types instantiating
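The idea of a rules DSL that is basically a Writer monad yielding a list of compilers can be sketched with a hand-rolled Writer. Everything here is hypothetical and much simpler than the real Hakyll.Core.Rules: compilers are represented by their names only, and addCompiler is an invented primitive.

```haskell
-- Hypothetical stand-ins; the real types are richer.
type Identifier   = String
type CompilerName = String

-- A hand-rolled Writer over a list of (identifier, compiler) pairs.
newtype Rules a = Rules { runRules :: (a, [(Identifier, CompilerName)]) }

instance Functor Rules where
    fmap f (Rules (x, w)) = Rules (f x, w)

instance Applicative Rules where
    pure x = Rules (x, [])
    Rules (f, w1) <*> Rules (x, w2) = Rules (f x, w1 ++ w2)

instance Monad Rules where
    Rules (x, w1) >>= f =
        let Rules (y, w2) = f x
        in  Rules (y, w1 ++ w2)

-- Invented DSL primitive: register one compiler under an identifier.
addCompiler :: Identifier -> CompilerName -> Rules ()
addCompiler i c = Rules ((), [(i, c)])

-- What a site description could look like in this toy DSL.
siteRules :: Rules ()
siteRules = do
    addCompiler "index.markdown" "pageCompiler"
    addCompiler "css/default.css" "compressCssCompiler"

-- Running the rules simply collects the registered compilers, in order.
collected :: [(Identifier, CompilerName)]
collected = snd (runRules siteRules)
```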
The Hakyll.Web modules are more loosely coupled: they all provide some specific feature which helps the user in creating static websites. For example, the Hakyll.Web.CompressCss module provides CSS compression. The Hakyll.Web.Template modules are more tightly integrated and a little more tricky (more in the next section).
I think most hacking opportunities lie in Hakyll.Web: there’s probably a whole range of filter-like compilers I haven’t thought of yet.
The life of a page
I want to finish this blogpost by shedding some more light on the process of rendering a page (it’s probably the most commonly used feature of Hakyll).
Usually, a page is compiled using pageCompiler. This is nothing more than a “sane default”, with a pretty simple definition:

    pageCompiler :: Compiler Resource (Page String)
    pageCompiler = readPageCompiler >>> addDefaultFields >>> arr applySelf >>> pageRenderPandoc

The first step is readPageCompiler, an arrow defined as:

    readPageCompiler :: Compiler Resource (Page String)
    readPageCompiler = getResourceString >>> arr readPage

This makes sense: getResourceString simply gets the resource contents (i.e. the file contents) as a String, and readPage is a pure function which parses a String into a Page. If you want to have a certain text transformation on the entire file, you need to replace readPageCompiler by a custom arrow (which will probably look like getResourceString >>> custom >>> arr readPage).
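Here is a small self-contained sketch of such a custom arrow. It does not use Hakyll itself: Compiler is modelled as Kleisli IO, and Resource, Page, getResourceString and readPage are simplified stand-ins that only mimic the shape of the real ones. The custom step (uppercasing the whole file) is an arbitrary example.

```haskell
import Control.Arrow (Kleisli (..), arr, (>>>))
import Data.Char (toUpper)

-- Stand-ins assumed for this sketch only.
type Compiler = Kleisli IO

newtype Resource = Resource FilePath

data Page a = Page
    { pageMetadata :: [(String, String)]
    , pageBody     :: a
    }

-- Stand-in: fakes the file contents instead of reading from disk.
getResourceString :: Compiler Resource String
getResourceString = Kleisli (\(Resource path) -> return ("hello from " ++ path))

-- Stand-in: no metadata parsing, the whole string becomes the body.
readPage :: String -> Page String
readPage = Page []

-- The custom text transformation, applied to the entire file.
custom :: Compiler String String
custom = arr (map toUpper)

-- Same shape as readPageCompiler, with the custom step in the middle.
customReadPageCompiler :: Compiler Resource (Page String)
customReadPageCompiler = getResourceString >>> custom >>> arr readPage
```

Because custom sits before readPage, it sees the raw file contents, metadata section included.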
The second step is addDefaultFields. After parsing the Page, it knows all metadata fields specified in the actual file. But there’s other metadata we want available as well: the URL of the page ($$url$$), the source path ($$path$$), … all these fields are added here.
We’re going to fill up beautiful templates with these fields later, but we also want to be able to use, say, $$url$$ in the page itself. In order to accomplish this, we use the applySelf function, which applies a page as a template to itself.
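As a rough illustration of what applying a page to itself means, here is a toy version. It is not Hakyll's template engine: the Page record, the substitute helper and the handling of unknown keys are assumptions, and keys are written as $$key$$ as in the text above.

```haskell
import Data.Maybe (fromMaybe)

-- Simplified stand-in for Hakyll's Page type.
data Page a = Page
    { pageMetadata :: [(String, String)]
    , pageBody     :: a
    }

-- Split a string at the first "$$", if there is one.
breakDollars :: String -> Maybe (String, String)
breakDollars = go ""
  where
    go acc ('$' : '$' : rest) = Just (reverse acc, rest)
    go acc (c : rest)         = go (c : acc) rest
    go _   []                 = Nothing

-- Replace every $$key$$ by its value; unknown keys are left untouched.
substitute :: [(String, String)] -> String -> String
substitute env ('$' : '$' : rest) =
    case breakDollars rest of
        Just (key, after) ->
            fromMaybe ("$$" ++ key ++ "$$") (lookup key env)
                ++ substitute env after
        Nothing -> "$$" ++ rest
substitute env (c : rest) = c : substitute env rest
substitute _   []         = []

-- Toy applySelf: use the page's own metadata to fill in its own body.
applySelf :: Page String -> Page String
applySelf (Page meta body) = Page meta (substitute meta body)
```

For example, a page with the metadata field url set to /about.html and the body "Link: $$url$$" ends up with the body "Link: /about.html".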
After all this is done, we use pageRenderPandoc to render the page to HTML. pageRenderPandoc, much like pageCompiler, is a sane default. It could be defined as:

    pageRenderPandoc :: Compiler (Page String) (Page String)
    pageRenderPandoc = pageReadPandoc >>> pageWritePandoc

The actual definition is a little different, but certainly not harder. Again, the point is that it’s a simple pipeline of some other arrows. If you want to perform custom transformations on the pandoc document (this is pretty awesome, since you can edit documents easily using a proper language and not just regexes), it goes here: pageReadPandoc >>> custom >>> pageWritePandoc. For more information on defining this kind of pipeline, you should have a look at
I hope this gives some sort of idea on how to start hacking if you want to extend Hakyll with a certain feature. But then again, do not hesitate to poke me if you’re not sure, I’d be glad to help you get started.