Controlling Chromium in Haskell
Published on September 1, 2013 under the tag haskell
Introduction
The Chromium browser has a built-in WebSockets server which can be used to control it. I first heard about this through an issue with my WebSockets project.
Since I am currently rewriting the WebSockets library to have it use io-streams, I wanted to give this a try and made it into a blogpost. As a sidenote – the port is going great, io-streams is a very nice library and I managed to solve a whole lot of issues in the library (mostly exception handling stuff).
Note that this code uses an as-yet-unreleased version of the WebSockets library. You can get it from the GitHub repo, though, if you want to: check out the io-streams branch.
This file is written in literate Haskell so we will have a few boilerplate declarations and imports first:
{-# LANGUAGE OverloadedStrings #-}
module Main
    ( main
    ) where
import Control.Applicative ((<$>))
import Control.Monad (forever, mzero)
import Control.Monad.Trans (liftIO)
import Data.Aeson (FromJSON (..), ToJSON (..), (.:), (.=))
import qualified Data.Aeson as A
import qualified Data.Map as M
import Data.Maybe (fromMaybe)
import qualified Data.Text.IO as T
import qualified Data.Vector as V
import qualified Network.HTTP.Conduit as Http
import qualified Network.URI as Uri
import qualified Network.WebSockets as WS
Locating the WebSockets server
To enable the WebSockets server, Chromium must be launched with the --remote-debugging-port flag:
chromium --remote-debugging-port=9160
Now, in order to connect to the built-in WebSockets server, we have to know its URI and this requires some extra code. We will first use cURL to demonstrate this:
$ curl localhost:9160/json
[ {
   "description": "",
   ...
   "webSocketDebuggerUrl": "ws://localhost:9160/devtools/page/8937C189-5CED-8E34-E26E-A389641FE8FF"
} ]
That webSocketDebuggerUrl is the one we want. Let us write some Haskell code to automate obtaining it.
We create a datatype to hold this info. Currently, we are only interested in a single field:
data ChromiumPageInfo = ChromiumPageInfo
    { chromiumDebuggerUrl :: String
    } deriving (Show)
We will use aeson to parse the JSON. We need a FromJSON instance for our datatype:
instance FromJSON ChromiumPageInfo where
    parseJSON (A.Object obj) =
        ChromiumPageInfo <$> obj .: "webSocketDebuggerUrl"
    parseJSON _              = mzero
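To check the instance without a running browser, one can decode a hand-written sample. The following is a minimal sketch, meant to be compiled as a standalone file rather than as part of this post. The names sampleJson and decodePages are invented here, and the page ID in the URL is a placeholder. It uses eitherDecode instead of decode so that a parse failure comes with a message.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Control.Monad        (mzero)
import           Data.Aeson           (FromJSON (..), (.:))
import qualified Data.Aeson           as A
import qualified Data.ByteString.Lazy as BL

-- Same datatype and instance as in the post, repeated so that this
-- file compiles on its own.
data ChromiumPageInfo = ChromiumPageInfo
    { chromiumDebuggerUrl :: String
    } deriving (Show)

instance FromJSON ChromiumPageInfo where
    parseJSON (A.Object obj) =
        ChromiumPageInfo <$> obj .: "webSocketDebuggerUrl"
    parseJSON _              = mzero

-- Hypothetical sample, shaped like the curl output shown earlier.
-- Fields the instance does not mention are simply ignored.
sampleJson :: BL.ByteString
sampleJson =
    "[{\"description\": \"\", \"webSocketDebuggerUrl\": \"ws://localhost:9160/devtools/page/EXAMPLE\"}]"

-- Left carries aeson's error message, Right the decoded list of pages.
decodePages :: BL.ByteString -> Either String [ChromiumPageInfo]
decodePages = A.eitherDecode
```

Loading this in GHCi and evaluating decodePages sampleJson should give a Right with a single page; feeding it a JSON object instead of an array gives a Left.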
The http-conduit library can be used to do what we just did using curl:
getChromiumPageInfo :: Int -> IO [ChromiumPageInfo]
getChromiumPageInfo port = do
    response <- Http.withManager $ \manager -> Http.httpLbs request manager
    case A.decode (Http.responseBody response) of
        Nothing -> error "getChromiumPageInfo: Parse error"
        Just ci -> return ci
  where
    request = Http.def
        { Http.host = "localhost"
        , Http.port = port
        , Http.path = "/json"
        }
One remaining issue is that the JSON contains the WebSockets URL as a single string, and the WebSockets library expects a (host, port, path) triple. Luckily for us, the standard network library has a Network.URI module which makes this task pretty simple:
parseUri :: String -> (String, Int, String)
parseUri uri = fromMaybe (error "parseUri: Invalid URI") $ do
    u    <- Uri.parseURI uri
    auth <- Uri.uriAuthority u
    let port = case Uri.uriPort auth of (':' : str) -> read str; _ -> 80
    return (Uri.uriRegName auth, port, Uri.uriPath u)
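The version above calls error on an invalid URI, and read will also crash on a malformed port. If that is a concern, a total variant could look like the sketch below. The name parseUriMaybe is invented here; it keeps the same default of port 80 and uses readMaybe from base for the port.

```haskell
import qualified Network.URI as Uri
import           Text.Read   (readMaybe)

-- Like parseUri, but returns Nothing instead of throwing when the URI
-- is invalid, has no authority part, or has a non-numeric port.
parseUriMaybe :: String -> Maybe (String, Int, String)
parseUriMaybe uri = do
    u    <- Uri.parseURI uri
    auth <- Uri.uriAuthority u
    port <- case Uri.uriPort auth of
        ""          -> Just 80        -- no port given: same default as above
        (':' : str) -> readMaybe str  -- uriPort includes the leading colon
        _           -> Nothing
    return (Uri.uriRegName auth, port, Uri.uriPath u)
```

The caller then decides what to do with a Nothing, for example by skipping that page and trying the next one in the list.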
Once we are connected to Chromium, we will be sending commands to it. A simple Haskell datatype can be used to model these commands:
data Command = Command
    { commandId     :: Int
    , commandMethod :: String
    , commandParams :: [(String, String)]
    } deriving (Show)
We use the aeson library again here, to convert these commands into JSON data:
instance ToJSON Command where
    toJSON cmd = A.object
        [ "id"     .= commandId cmd
        , "method" .= commandMethod cmd
        , "params" .= M.fromList (commandParams cmd)
        ]
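To see what this instance produces, here is a small standalone sketch. The navigate helper is hypothetical (it is not part of the post or of any library); it just builds the Page.navigate command used in main below. The encoded result is a JSON object with the keys id, method and params, where params is itself an object holding the url. The order of the keys in the output is up to aeson and should not be relied on.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Aeson           (ToJSON (..), (.=))
import qualified Data.Aeson           as A
import qualified Data.ByteString.Lazy as BL
import qualified Data.Map             as M

-- Same datatype and instance as in the post, repeated so that this
-- file compiles on its own.
data Command = Command
    { commandId     :: Int
    , commandMethod :: String
    , commandParams :: [(String, String)]
    } deriving (Show)

instance ToJSON Command where
    toJSON cmd = A.object
        [ "id"     .= commandId cmd
        , "method" .= commandMethod cmd
        , "params" .= M.fromList (commandParams cmd)
        ]

-- Hypothetical helper: a Page.navigate command with the given id and URL.
navigate :: Int -> String -> Command
navigate ident url = Command
    { commandId     = ident
    , commandMethod = "Page.navigate"
    , commandParams = [("url", url)]
    }

-- The bytes that would go over the WebSocket for such a command.
encodeNavigate :: Int -> String -> BL.ByteString
encodeNavigate ident url = A.encode (navigate ident url)
```

Since commandParams is a list of string pairs, only string-valued parameters can be expressed this way; commands that need numbers or nested objects would require a richer type, such as an aeson Value.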
What is left is a simple main function to tie it all together.
main :: IO ()
main = do
    (ci : _) <- getChromiumPageInfo 9160
    let (host, port, path) = parseUri (chromiumDebuggerUrl ci)
    WS.runClient host port path $ \conn -> do
        -- Send an example command
        WS.sendTextData conn $ A.encode $ Command
            { commandId     = 1
            , commandMethod = "Page.navigate"
            , commandParams = [("url", "http://haskell.org")]
            }

        -- Print output to the screen
        forever $ do
            msg <- WS.receiveData conn
            liftIO $ T.putStrLn msg
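The loop above prints every message as raw text. As a possible next step, the messages could be decoded instead. The sketch below is a standalone file and rests on an assumption about the debugging protocol that this post does not cover: that a reply to a command carries the id of that command, while an unsolicited event carries a method name and no id. The Incoming type, the two sample messages and all function names are invented here.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Control.Applicative  ((<|>))
import           Data.Aeson           ((.:))
import qualified Data.Aeson           as A
import qualified Data.Aeson.Types     as A (Parser, parseMaybe)
import qualified Data.ByteString.Lazy as BL

-- Assumed shape of incoming messages (see the lead-in above).
data Incoming
    = Reply Int     -- answer to the command with this id
    | Event String  -- notification, identified by its method name
    deriving (Eq, Show)

-- Try "id" first; if that field is missing, fall back to "method".
parseIncoming :: A.Value -> A.Parser Incoming
parseIncoming = A.withObject "Incoming" $ \obj ->
    (Reply <$> obj .: "id") <|> (Event <$> obj .: "method")

-- Nothing means the bytes were not JSON or matched neither shape.
decodeIncoming :: BL.ByteString -> Maybe Incoming
decodeIncoming bytes = A.decode bytes >>= A.parseMaybe parseIncoming

-- Hand-written samples for experimenting in GHCi.
sampleReply, sampleEvent :: BL.ByteString
sampleReply = "{\"id\": 1, \"result\": {}}"
sampleEvent = "{\"method\": \"Page.frameNavigated\", \"params\": {}}"
```

Inside the forever loop, one would then receive the message as a lazy ByteString rather than as Text and pass it to decodeIncoming, for example to wait until the reply with id 1 has arrived.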
Conclusion
This is a very simple example of what you can do with Haskell and Chromium, but I think there are some pretty interesting opportunities to be found here. For example, I wonder if it would be possible to create a simple Selenium-like framework for web application testing in Haskell.
Thanks to Gilles J. for a quick proofread and Ilya Grigorik for this inspiring blogpost!