Controlling Chromium in Haskell

Driving Chromium using its built-in WebSockets server
Posted in: haskell.

Introduction

Chromium logo

The chromium browser has a built-in WebSockets server which can be used to control it. I first heard about this through an issue with my WebSockets project.

Since I am currently rewriting the WebSockets library to have it use io-streams, I wanted to give this a try and made it into a blogpost. As a sidenote – the port is going great, io-streams is a very nice library and I managed to solve a whole lot of issues in the library (mostly exception handling stuff).

Note that this code uses a yet unreleased version of the WebSockets library – you can get it from the github repo though, if you want to: check out the io-streams branch.

This file is written in literate Haskell so we will have a few boilerplate declarations and imports first:

{-# LANGUAGE OverloadedStrings #-}
module Main
( main
) where
import Control.Applicative ((<$>))
import Control.Monad (forever, mzero)
import Control.Monad.Trans (liftIO)
import Data.Aeson (FromJSON (..), ToJSON (..), (.:), (.=))
import qualified Data.Aeson as A
import qualified Data.Map as M
import Data.Maybe (fromMaybe)
import qualified Data.Text.IO as T
import qualified Data.Vector as V
import qualified Network.HTTP.Conduit as Http
import qualified Network.URI as Uri
import qualified Network.WebSockets as WS

Locating the WebSockets server

To enable the WebSockets server, chrome must be launched with the --remote-debugging-port flag enabled:

chromium --remote-debugging-port=9160

Now, in order to connect to the built-in WebSockets server, we have to know its URI and this requires some extra code. We will first use cURL to demonstrate this:

$ curl localhost:9160/json
[ {
   "description": "",
   ...
   "webSocketDebuggerUrl": "ws://localhost:9160/devtools/page/8937C189-5CED-8E34-E26E-A389641FE8FF"
} ]

That webSocketDebuggerUrl is the one we want. Let us write some Haskell code to automate obtaining it.

We create a datatype to hold this info. Currently, we are only interested in a single field:

data ChromiumPageInfo = ChromiumPageInfo
{ chromiumDebuggerUrl :: String
} deriving (Show)

We will use aeson to parse the JSON. We need a FromJSON instance for our datatype:

instance FromJSON ChromiumPageInfo where
parseJSON (A.Object obj) =
ChromiumPageInfo <$> obj .: "webSocketDebuggerUrl"
parseJSON _ = mzero

The http-conduit library can be used to do what we just did using curl:

getChromiumPageInfo :: Int -> IO [ChromiumPageInfo]
getChromiumPageInfo port = do
response <- Http.withManager $ \manager -> Http.httpLbs request manager
case A.decode (Http.responseBody response) of
Nothing -> error "getChromiumPageInfo: Parse error"
Just ci -> return ci
where
request = Http.def
{ Http.host = "localhost"
, Http.port = port
, Http.path = "/json"
}

One remaining issue is that the JSON contains the WebSockets URL as a single string, and the WebSockets library expects a (host, port, path) triple. Luckily for us, the standard network library has a Network.URI module which makes this task pretty simple:

parseUri :: String -> (String, Int, String)
parseUri uri = fromMaybe (error "parseUri: Invalid URI") $ do
u <- Uri.parseURI uri
auth <- Uri.uriAuthority u
let port = case Uri.uriPort auth of (':' : str) -> read str; _ -> 80
return (Uri.uriRegName auth, port, Uri.uriPath u)

Once we are connected to Chromium, we will be sending commands to it. A simple Haskell datatype can be used to model these commands:

data Command = Command
{ commandId :: Int
, commandMethod :: String
, commandParams :: [(String, String)]
} deriving (Show)

We use the aeson library again here, to convert these commands into JSON data:

instance ToJSON Command where
toJSON cmd = A.object
[ "id" .= commandId cmd
, "method" .= commandMethod cmd
, "params" .= M.fromList (commandParams cmd)
]

What is left is a simple main function to tie it all together.

main :: IO ()
main = do
(ci : _) <- getChromiumPageInfo 9160
let (host, port, path) = parseUri (chromiumDebuggerUrl ci)
WS.runClient host port path $ \conn -> do
-- Send an example command
WS.sendTextData conn $ A.encode $ Command
{ commandId = 1
, commandMethod = "Page.navigate"
, commandParams = [("url", "http://haskell.org")]
}
-- Print output to the screen
forever $ do
msg <- WS.receiveData conn
liftIO $ T.putStrLn msg

Conclusion

This is a very simple example of what you can do with Haskell and Chromium, but I think there are some pretty interesting opportunities to be found here. For example, I wonder if it would be possible to create a simple Selenium-like framework for web application testing in Haskell.

Thanks to Gilles J. for a quick proofread and Ilya Grigorik for this inspiring blogpost!

ce0f13b2-4a83-4c1c-b2b9-b6d18f4ee6d2