Deploying Clojure websites

On my server I'm running one Java process, which handles four of my websites on four different domains. These are all running on Clojure + Compojure. Some people asked for details of how to do this, so here's a rough outline. For the sake of brevity I'm only going to talk about two domains here, though it scales up to however many you want pretty easily.

This is surely not the only way to do this, and probably not the best way, but it's what I've arrived at after a year of goofing off.

Summary: Emacs + SLIME + Clojure running in GNU Screen; all requests are handled by Apache and mod_proxy sends them to the appropriate Jetty instance / servlet.

Directory structure

First you have to decide on a directory structure. There's a sort of Clojure convention to have a src and deps directory, though Clojure is so young that this convention may or may not stick. Behold my ASCII line-art directory tree:

|
|----src/
|    |
|    |----common/
|    |
|    |----net/
|         |
|         |----briancarper/
|         |    |
|         |    |----<lots of .clj files>
|         |
|         |----ffclassic/
|              |
|              |----<lots of .clj files>
|
|----deps/
|    |
|    |----<lots of .jar files>
|
|----<directory for the first domain>/
|    |
|    |----public/
|    |
|    |----db/
|    |
|    |----apache/
|
|----<directory for the second domain>/
     |
     |----public/
     |
     |----apache/

src holds the Clojure source code for all of my sites. Java (therefore Clojure) forces your directory structure to match your namespaces, and traditionally people use domain names in reverse order for both. Whether you like or dislike this convention, you must admit that if you're writing code that's actually going to be used to host websites on domains, the convention probably makes sense. There are some utility libraries I wrote that I like to share between sites, so I put those in some common namespace in their own directory.

deps holds a bunch of .jar files. This includes clojure.jar, clojure-contrib.jar, compojure.jar, swank-clojure.jar (for SLIME), and all the Java libraries I use, like Jetty, a bunch of Apache Commons libraries, Tokyo Cabinet, Rhino and so on. This means that all of my sites are sharing libraries, and therefore must be running the same version of Clojure and all other libraries. So when I upgrade a library that one site is using, I must upgrade all the other sites. This is sometimes a hassle, but it forces me not to be lazy.

Then I have one more directory for each of my domains, for static files and such. These directories have different subdirectories depending on how I'm running them. For the sites that use Tokyo Cabinet, I have a db directory to hold the database files. On some sites I want to run a PHP or CGI script, so I have a directory for Apache to use. All of them have a public folder, from which Clojure serves images, JS and CSS files etc.

Phil Hagelberg's recent Leiningen project can possibly help with some of this. I don't use Lein but I may switch someday.

Emacs + SLIME + CLASSPATH setup

Building or installing Clojure and Emacs and SLIME and friends is beyond the scope of this blog post. swank-clojure has come a long way in being auto-installable via ELPA nowadays, if you like.

The most important thing is to set up CLASSPATH correctly. src should be on the CLASSPATH, as well as every .jar file in deps. My .emacs contains this, among other things:

(setq swank-clojure-classpath (list "." "./src" "./deps/*"))
(setq swank-clojure-library-paths (list "/usr/local/lib"))

Java will take a glob and expand it to include every jar file in a directory, thankfully. But this seems to happen right when you start Java, so to add a new .jar to your CLASSPATH you generally need to restart the JVM. This is annoying, but oh well.
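If you're not sure whether a given .jar actually made it onto the CLASSPATH, you can ask the running JVM from the REPL. This is only a sketch: classpath-entries and jars-on-classpath are names I made up for illustration, and it assumes your JVM was launched in a way that populates the standard java.class.path system property.

```clojure
;; Sketch only: inspect the classpath of the running JVM from the REPL.
;; The function names here are hypothetical helpers, not part of any library.

(defn classpath-entries
  "Splits a classpath string into a seq of entries. With no argument,
  reads the java.class.path system property of the running JVM."
  ([] (classpath-entries (System/getProperty "java.class.path")))
  ([cp]
     ;; The separator is ':' on Unix and ';' on Windows; quote it because
     ;; String.split takes a regular expression.
     (let [sep (java.util.regex.Pattern/quote java.io.File/pathSeparator)]
       (remove empty? (seq (.split cp sep))))))

(defn jars-on-classpath
  "Returns only the classpath entries that look like .jar files."
  []
  (filter #(.endsWith % ".jar") (classpath-entries)))
```

Calling (jars-on-classpath) after dropping a new .jar into deps is a quick way to confirm whether a JVM restart is still needed.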

If you AOT-compile your Clojure code you might also have a classes folder on your classpath, but I don't bother. You could also compile your whole site into its own .jar file and use that. I don't do this because I want to be able to edit my Clojure source files on the server if I need to.

swank-clojure-library-paths is necessary for me because Tokyo Cabinet uses some C libraries. You might not need it.

Then when I start Emacs, I M-x cd to the base folder and start SLIME there. It'll find Clojure in deps and give me a REPL. If you can get this working, you're well on your way.

Clojure code

Each domain has its own server.clj file, which is in charge of setting up and starting / stopping the servlets. So in src/net/briancarper/server.clj I have something like:

(ns net.briancarper.server
  (:use (compojure.http request servlet session routes)
        (compojure.server jetty)
        (compojure control)
        (net.briancarper ...)))  ;; lots of files containing the guts of the site

(defroutes some-routes ...)
(defroutes other-routes ...)

(defroutes all-routes
  some-routes
  other-routes)  ;; combine the route groups above

(defserver blog-server
  {:port 8080 :host "localhost"}
  "/*" (servlet all-routes))

src/net/ffclassic/server.clj is similar, except with different routes (obviously) and a different port for the servlet. Note that we bind to localhost here, so that people can't connect directly to port 8080 and bypass Apache.

One of the routes should be set up to point to the proper public folder. Generally some catch-all route near the end of your list of routes, like:

(GET "/*" (static-file (params :*)))

where this static-file knows which public directory to serve files from, and the server.clj files for the other domains point to different directories.

Compojure has the compojure.http.helpers/serve-file function you can use to serve static files too. I use my own function for various reasons. Note that Compojure generally doesn't set any HTTP headers on your responses. You should probably set some kind of cache control headers unless you want your users to re-download all of your image files and stylesheets every time they hit a new page. You might also have to fiddle with Content-Type sometimes, though Compojure is usually pretty good about guessing them.
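As a rough sketch of what setting a cache header might look like, here's a tiny helper that adds Cache-Control to a response. It assumes responses are plain maps with a :headers map inside (the Ring-style shape), which may or may not match what your version of Compojure expects; with-cache-control is a made-up name, not a Compojure function.

```clojure
;; Sketch only. Assumes a response is a plain map such as
;; {:status 200 :headers {...} :body ...}; that shape is an assumption.

(defn with-cache-control
  "Returns response with a Cache-Control header allowing caches to keep
  the body for max-age-seconds."
  [response max-age-seconds]
  (assoc-in response
            [:headers "Cache-Control"]
            (str "public, max-age=" max-age-seconds)))
```

For example, (with-cache-control {:status 200 :headers {} :body "..."} 86400) would let browsers cache the response for a day.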

Note that Compojure's hands-off approach has its upsides too. For example, I have a bunch of CSS files for my blog. In Clojure, I read and concatenate all of those CSS files into one blob of text and have Clojure serve that as combined.css, even though a combined.css file doesn't actually exist on disk. I do the same for JS. This speeds up user requests quite a bit, without having to merge all the files on the filesystem. I could further compactify them if I cared.
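The concatenation itself is about as simple as it sounds. This is a minimal sketch rather than the code my blog actually runs; combine-files is a hypothetical name and the paths you pass in are up to you.

```clojure
;; Sketch only: read several text files and join them into one string.

(defn combine-files
  "Reads each file in paths (in order) and joins the contents with
  newlines, so the last rule of one file can't run into the next."
  [paths]
  (apply str (interpose "\n" (map slurp paths))))
```

A route for the combined stylesheet can then return that string as its body. Wrapping the call in memoize or a delay would avoid re-reading the files on every request, at the cost of needing a reload when a stylesheet changes.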

If your Clojure app is handling arbitrary request URIs, you should be careful you aren't serving up ../../../../etc/passwd or something. Pretty sure Apache protects you against this by default, but Compojure uses compojure.http.helpers/safe-path? to test for it too if you use the built-in serve-file.
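If you write your own static file function like I did, you need some check of your own. Here's a sketch of one way to do it using java.io.File canonical paths. To be clear, this is not how Compojure's safe-path? is implemented; inside-root? is a name I invented for illustration.

```clojure
;; Sketch only: refuse any requested path that resolves outside the
;; public directory. Not Compojure's implementation.

(import 'java.io.File)

(defn inside-root?
  "True if requested, resolved relative to root, still lives in root
  after things like ../ have been collapsed."
  [root requested]
  (let [root-path (.getCanonicalPath (File. root))
        target    (.getCanonicalPath (File. root requested))]
    (or (= target root-path)
        ;; Append the separator so /site/public-other does not count
        ;; as being inside /site/public.
        (.startsWith target (str root-path File/separator)))))
```

The idea is to call this before touching the filesystem and return a 404 when it comes back false.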

Finally, for convenience, in the base directory I have a file called server.clj which contains this:

(ns server
  (:require [net.briancarper server]
            [net.ffclassic server]))

(defn go []
  (.start net.briancarper.server/blog-server)
  (.start net.ffclassic.server/ffclassic-server))

Then, to start everything, from a REPL:

user> (require 'server)
2010-01-04 18:10:50.046::INFO:  Logging to STDERR via org.mortbay.log.StdErrLog
user> (server/go)
2010-01-04 18:11:02.260::INFO:  jetty-6.1.15
2010-01-04 18:11:02.358::INFO:  Started SocketConnector@localhost:8080
2010-01-04 18:11:02.707::INFO:  jetty-6.1.15
2010-01-04 18:11:02.789::INFO:  Started SocketConnector@localhost:8087

On the server (Apache)

To deploy this to my server I just rsync it all over SSH. On the server, I have Emacs and Java installed, as well as Apache and GNU Screen.

Why Apache? In case I want to run something that isn't Clojure, like webmail or some CGI script. Plus it's easy to run Clojure as a non-privileged user on a non-privileged port, and have Apache forward requests to the proper servlet based on the domain in the request. Apache is also nice for doing HTTPS.

My Apache setup is a pretty standard Debian-ish setup. You need mod_proxy installed. My Apache config for one domain looks a bit like this:

<VirtualHost *:80>

        <Directory /home/user/clj/>
           AllowOverride All
           Order Allow,Deny
           Allow from all
        </Directory>

        DocumentRoot /home/user/clj/

        ProxyPass /foo !
        ProxyPass / http://localhost:8087/
        ProxyPassReverse / http://localhost:8087/

</VirtualHost>

Yeah, that's about it. By default, every request will be passed to Clojure. The ports specified here should obviously match the ports you specify in your Clojure code for your servlets.

ProxyPass /foo ! tells Apache to handle requests to /foo itself rather than pass them to Clojure. I override URIs on a case-by-case basis like this, to tell Apache to run whatever CGI or PHP scripts I need.

One oddity of running a setup like this is that, thanks to mod_proxy, the "remote address" for every request is always going to be the proxy's own address (127.0.0.1 here, since Apache and Jetty are on the same machine) by the time it gets to Clojure. So if you want to see the IP address of your users, you have to do something like:

(defn- ip [request]
  (or ((:headers request) "x-forwarded-for")
      (:remote-addr request)))
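One wrinkle: if the incoming request already carried an X-Forwarded-For header, the value that reaches Clojure is a comma-separated list, and the entry added by your own Apache is the last one. Here's a slightly more defensive sketch that handles that. It assumes the request is a map with lower-cased string header names, as in the function above, and client-ip is just a name I picked.

```clojure
;; Sketch only: pick the address appended by the nearest proxy.
;; Assumes exactly one trusted proxy (Apache) sits in front of Clojure.

(defn client-ip
  "Returns the last non-empty entry of x-forwarded-for, falling back
  to :remote-addr when the header is missing or blank."
  [request]
  (let [forwarded (get (:headers request) "x-forwarded-for")
        entries   (when forwarded
                    (remove empty?
                            (map #(.trim %) (.split forwarded ","))))]
    (or (last entries)
        (:remote-addr request))))
```

Earlier entries in the list come from the client and can be faked, so don't use them for anything that matters.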

Another oddity is that sometimes if Apache is running, and then I start Clojure, I have to restart Apache before it "finds" Clojure and starts forwarding traffic. No big deal though.

This is otherwise transparent from the Clojure end.

Starting it up

You probably want to SSH to your server, start a REPL, start your code running, and then log off the server. There are various ways you can keep Emacs running in the background. You could run Emacs in server mode and attach to it with emacsclient. I use GNU Screen because it's easy and I have other things running in Screen instances that I like to switch between. So I start Screen, start Emacs, start SLIME, and start my site from there. Then I detach from Screen when I'm done. It's easier than it sounds.

Why SLIME and not a normal commandline REPL? Because in Emacs I can patch the code easily on a function-by-function basis, among other things. You can do this from a commandline REPL but not nearly as easily. Emacs is a much nicer place to type than any command line. When I need to make an update to the site or fix a bug, my workflow is typically:

  1. Fix it locally on my test machine, make sure it works
  2. rsync the changed files to the server
  3. SSH to the server
  4. screen -r to bring up Emacs
  5. Open the .clj file(s) I changed
  6. C-c C-c to recompile and reload the individual function(s) I changed, or C-c C-k to recompile and load a whole .clj file, or whatever.
  7. C-a d to detach from Screen

I can do any database inspecting or fiddling I need to do from the REPL. I can also do some live testing of the site or whatever else I want. It's good for debugging.

I've had Emacs running for months this way without having to restart it, and it seems to work OK.

That's it

It's not that hard. It's clearly more steps than dumping some PHP files into a directory and having them work. But with Clojure, you get the benefit of a persistently running environment with a nice REPL interface, all in a nice fast multi-threaded JVM. Plus you get to write all your code in Clojure instead of PHP. In terms of fun and long-term maintenance costs, Clojure comes out way ahead in the end, for me anyways.

January 04, 2010 @ 12:42 PM PST
Category: Programming


Quoth Shashy on January 04, 2010 @ 10:05 PM PST

Thanks for this write up - it is very helpful.

Quoth Aaron on January 05, 2010 @ 1:33 AM PST

I was just wondering how most people deploy Compojure, and I'm still surprised that just creating a .war deployment isn't the community standard. Did you choose this approach because you wanted a REPL on the server? Have you deployed more "traditional" Java web applications before?

Quoth Phil on January 05, 2010 @ 1:55 AM PST

If you rename the "deps" directory to "lib" then you can use M-x swank-clojure-project to manage your classpath.

Quoth Ryan Zezeski on January 05, 2010 @ 3:38 AM PST

Nice writeup Brian. FYI, in the future you could try the "tree" command to generate that ASCII art of yours.

@Aaron: I think, as you hint at, his reasoning for not doing a traditional J2EE deployment is because of the on-the-fly debugging you get with the straight Clojure/Emacs/Slime environment. I mean seriously, that's really figgen cool, and useful!

However, I wonder if you could just embed a server type REPL in the webapp itself and connect to that remotely? That would be even better, IMO, than having to ssh and use Emacs/Slime via Screen. Kind of like using tramp mode to remotely edit a file.

Quoth Brian on January 05, 2010 @ 6:22 AM PST

@Aaron: I don't know much about pure Java web apps or .war files or such things, that's the main reason I did it this way. It may or may not be better to use the .war option. Does it require writing an XML file or something? I strongly dislike XML configuration file magic. Plus having access to a REPL is a necessity.

@Phil: Thanks, that'll be helpful.

@Ryan: Darn it, yeah tree would've helped. Must remember that next time.

I wouldn't want to implement a REPL in a webapp or try to make it accessible remotely. I know how to secure SSH to a large degree. I like things to be hidden behind that, and encrypted traffic etc. I could probably open an SSH tunnel and use Emacs'/SLIME's socket-connection features, but that requires me to have Emacs installed on every computer I use, and from work (or from my cell phone) I like that I only need SSH to do everything I need to do.

Quoth Chris Bailey on January 07, 2010 @ 7:02 AM PST

Brian, do you have a feel for the basic memory footprint needed to run a server setup like this? Or, maybe more likely it'd be what is needed for a simple first app setup, and then how much more memory may be needed for each additional app? I realize a lot may depend on the particular app(s)'s needs, as well as the traffic (e.g. if very low traffic, and not a lot of simultaneous requests to multiple apps, etc.).

I haven't done Java stuff in a long time, and have gotten so used to the Ruby/Rails world (Nginx+Passenger, etc.) and how to compute that memory. My recollection was that Java server apps tended to need a fair bit more RAM. Mostly I just care so I can figure out how big of a VPS server I need to get from some hosting company...

Quoth Brian on January 08, 2010 @ 5:43 AM PST

Not really, other than what I listed from top in this post. Seems to take up about 150 MB of RAM for four sites. I do a lot of caching in RAM on the server. I think my VPS has 540 MB of RAM on it and I haven't come close to that as far as I'm aware.

My sites aren't very big, your mileage may vary.

Quoth on June 02, 2011 @ 8:30 AM PDT


Quoth on June 02, 2011 @ 8:48 AM PDT

No not booo! Sorry, testing cow-blog locally and I typed in the wrong browser tab :)
