9 Posts Tagged 'Compojure'
Well, my new blog is up and running. Sorry for the temporary lack of cows in my layout. I'm dogfood-testing the blog engine in a fairly vanilla state until I work out some of the bugs. This layout is based upon barecity, a minimalist Wordpress theme that I adapted easily enough to my blog.
As a bonus, I applied a dirty hack to my RSS feed that I think should help avoid screwing up people's RSS readers with duplicate entries.
I'll write again soon with some info about the blog engine and some things I learned writing it.
(As mentioned previously, here's the code.)
I apologize in advance to everyone who subscribes to my blog's RSS feed, but this week your RSS reader is probably going to suddenly find 25 "new" posts from me.
My blog currently uses /blog/title as the URL scheme, with similar URLs for categories, tags, etc. Soon I'm probably going to change it to /blog/123/title, as part of the impending release of version 0.2 of my blog engine. (The code-in-progress is in a branch on github, for the daring and foolish among you.)
This way, I can change the title of a post without breaking everything. I have heretofore lacked this ability. It's easy to code: you just tell Compojure to ignore everything after the number in a route. Something like this:
```clojure
(defroutes foo
  (GET ["/blog/:id:etc" :id #"\d+" :etc #"(/[^/]*)?"] [id]
    (pages/post-page id)))
```
It's only a few lines of code to change, but the ramifications are widespread. It'll instantly break every link to my blog, for example. At least it's pretty easy to set up a bunch of redirects in Compojure to avoid that. I think this'll work:
```clojure
(require '(blog [db :as db]
                [link :as link])
         '(oyako [core :as oyako])
         '(ring.util [response :as response]))

(defn redirect-post [name]
  (when-let [post (oyako/fetch-one db/posts :url name)]
    (response/redirect (link/url post))))

(defroutes redirect-routes
  (GET ["/blog/:name" :name #"[^/]+$"] [name]
    (redirect-post name)))
```
(Oyako here is the experimental ORM-like library I'm using to interface with PostgreSQL nowadays, having ditched Tokyo Cabinet.)
Changing my URL scheme is also going to mess up RSS though, because I (foolishly) used post URLs as the GUIDs in my RSS feed up to this point. This problem I don't know how to avoid. I might reduce the number of posts included in my feed temporarily, to limit the damage.
On my server I'm running one Java process, which handles four of my websites on four different domains. These are all running on Clojure + Compojure. Some people asked for details of how to do this, so here's a rough outline. For the sake of brevity I'm only going to talk about two domains here, though it scales up to however many you want pretty easily.
This is surely not the only way to do this, and probably not the best way, but it's what I've arrived at after a year of goofing off.
Summary: Emacs + SLIME + Clojure running in GNU Screen; all requests are handled by Apache, and mod_proxy sends them to the appropriate Jetty instance / servlet.
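The Apache side of this can be sketched roughly like so. The domain names and port numbers below are made up for illustration; the real config obviously uses the actual domains and whatever ports the Jetty instances listen on:

```apache
# Hypothetical vhosts: one per domain, each proxying to its own
# local Jetty instance. Requires mod_proxy and mod_proxy_http.
<VirtualHost *:80>
    ServerName example-one.com
    ProxyPass        / http://127.0.0.1:8080/
    ProxyPassReverse / http://127.0.0.1:8080/
</VirtualHost>

<VirtualHost *:80>
    ServerName example-two.com
    ProxyPass        / http://127.0.0.1:8081/
    ProxyPassReverse / http://127.0.0.1:8081/
</VirtualHost>
```

Apache picks the vhost by Host header, so all the domains can share port 80 while each app stays a separate handler inside the one JVM.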
I haven't posted here much recently because I've been hacking on another recently-sort-of-completed website. One of my favorite hobbies is old 8-bit video games. The first thing I ever programmed was a website about Final Fantasy for the old NES, and I've fiddled with it for the past 10 years or so.
A while back I decided to rewrite the whole thing using Clojure + Compojure with the data in MySQL. This went really well. I know lines of code isn't that great a metric, but it can give a rough estimate: this whole website is done in 3,400 lines of Clojure, which includes all of the HTML "templates" and the DB layer I had to write.
I suspect the target audience of this blog and the target audience of that website don't overlap much, but I figured someone might be interested in some of the details of how it's implemented. A few things I learned...
This code still isn't polished enough for someone to drop it on a server and fire it up, but maybe it'll give someone some ideas. I think the new code is cleaner and it'll be easier for me to add features now.
Beware bugs, I'm positive I introduced some.
EDIT: A word about the CRUD library... persisting data to disk is hard when the data may be mutated by many threads at once and the destination for your data is an SQL database that may or may not even be running. I have more respect for people who've written libraries that actually do this kind of thing and work right. Granted I only spent 3 days on mine but still, it's tricky.
I gave up for a while and tried clj-record, but it was prohibitively slow. It has the old N+1 queries problem when trying to select an object which has N sub-objects. In real life you'd write SQL joins to avoid such things. Ruby on Rails, on the other hand, gets around this via some nasty eager-loading magic.
I get around it by having all my data in a Clojure ref in RAM already so it doesn't matter. And by using hooks so each object keeps a list of its sub-objects and the list is always up-to-date (updates of sub-objects propagate to their parents). But the crap I have to do to get this to just barely work is pretty painful.
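The idea can be shown in a toy sketch. None of these names come from the real CRUD library; this is just the shape of it: everything lives in one ref, and writing a child also updates the parent's cached list inside the same transaction, so reads never need a join:

```clojure
;; Hypothetical data layout and function names, for illustration only.
(def data (ref {:posts    {1 {:id 1 :title "Hello" :comments []}}
                :comments {}}))

(defn add-comment! [post-id comment]
  (dosync
    ;; store the comment itself
    (alter data assoc-in [:comments (:id comment)] comment)
    ;; the "hook": propagate the new child into its parent's cached list
    (alter data update-in [:posts post-id :comments] conj comment)))

(add-comment! 1 {:id 100 :post-id 1 :text "First!"})
;; after this, (get-in @data [:posts 1 :comments]) includes the new comment
```

The in-memory part is the easy half; the hard half, as the post says, is keeping this consistent with an SQL database that may not even be running.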
By popular demand, I've released the source code for my blog. Hope someone finds it useful.
Feedback and bug reports welcome, email me or post them somewhere on my blog and I'll find them.
So far my silly anti-spam measures are working. Since last week I've had 1,861 spam comment attempts, of which 0 were successful. 1,857 of them didn't even alter the text in my CAPTCHA text field at all. Four of them inexplicably HTML-escaped the < into a &lt;.
One feature I didn't implement from Wordpress is subscribing to comments via email. Sending an email from Java is possible but a little bit painful to implement. The JavaMail API is a monster.
I do think it's useful to be able to know when someone responds to a comment you left, but is spamming your inbox really the best way? I have to think there's a better way.
I did implement an RSS feed for each individual post's comments. And separate RSS feeds for all the tags on my blog, and all the categories. When RSS feeds are generated dynamically, why not? This is all of the code for the tag feeds:
```clojure
(defn tag-rss [tagname]
  (if-let [tag (get-tag tagname)]
    (rss (str "briancarper.net Tag: " (:name tag))
         (str "http://briancarper.net/" (:url tag))
         "briancarper.net"
         (map rss-item (take 25 (all-posts-with-tag tag))))
    (error-404)))
```
Plus the routing code:
```clojure
(GET "/feed/tag/:name" (tag-rss (route :name)))
```
But I haven't uploaded the comment-feed feature because I don't know if it's overkill. Personally I'm liberal with my RSS feeds: I just pop them into my Akregator and off I go. But I don't know if other people take their feeds more seriously, or what. RSS feeds can be a bit heavyweight. Maybe I should make a feed for all of my comments across all posts.
After I implemented that silly CAPTCHA yesterday, the spam was stopped. There's also a honeypot form field (it's hidden via CSS so humans don't know it's there, and if any bot POSTs text for that field, the data is rejected automatically). It's silly and easily defeated, yet it stopped all 262 spam attempts since yesterday. It looks like all the spam is for one site, but it's coming from a huge range of IPs. So it's probably a botnet. Thanks, MS Windows!
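The honeypot check itself is nearly a one-liner. A sketch, assuming the hidden field is named "website" (the real field name isn't given in the post):

```clojure
;; Hypothetical field name; any input hidden via CSS works the same way.
;; Humans never see the field, so any non-empty value means a bot filled
;; in every input it found.
(defn spam? [params]
  (not (empty? (get params "website" ""))))

(defn handle-comment [params]
  (if (spam? params)
    {:status 403 :body "Rejected."}
    (save-comment params)))   ; save-comment: the real handler, elsewhere
```

Trivially defeated by anyone who looks at the markup, but as the numbers above show, the bots don't look.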
I rewrote my whole CRUD layer so that I could use it for more than one database at once, and then rewrote my gallery code to take advantage, and now two hours later I have my origami gallery back up and running. Both sites are running from the same JVM. I wonder how many sites I can have going at once before the server melts into a puddle of Java-inflicted goo.
```
  PID PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
11338 16   0  512m 128m  12m S    0  0.3  0:28.33 java
```
Good thing I have plenty of RAM on the server. From looking at before and after shots of the memory usage, 66 MB is the JVM itself, and 40MB more is Jetty and Compojure and my code and all the dependencies. Then the last ~20 MB or so is my database slurped into RAM. So I can probably fit another few tens of thousands of posts and comments in here before I have to worry much. The real test will be letting this thing run for a couple weeks and see how hard it leaks.
One fun thing about playing with Compojure is that it doesn't do much with HTTP headers for you, which is a good learning opportunity. RFC 2616 is rather helpful here.
For example, I learned that if you don't set an Expires header, your browser will happily re-fetch files over and over, which is a bit of a performance hit. Static files that don't change often, like images, can be given a further-out Expires value so they're cached.
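In Ring terms this can be a tiny middleware. A sketch (the function name and lifetime are mine, not from the blog's code); Expires wants an RFC 1123 date in GMT, hence the formatter:

```clojure
(import 'java.text.SimpleDateFormat 'java.util.Date 'java.util.TimeZone)

(defn http-date
  "Format a Date as an HTTP date, e.g. \"Sun, 06 Nov 1994 08:49:37 GMT\"."
  [^Date d]
  (let [fmt (SimpleDateFormat. "EEE, dd MMM yyyy HH:mm:ss zzz")]
    (.setTimeZone fmt (TimeZone/getTimeZone "GMT"))
    (.format fmt d)))

(defn wrap-expires
  "Ring middleware (sketch): add an Expires header n seconds in the future."
  [handler seconds]
  (fn [request]
    (when-let [response (handler request)]
      (assoc-in response [:headers "Expires"]
                (http-date (Date. (+ (System/currentTimeMillis)
                                     (* 1000 seconds))))))))
```

Wrap only the static-file routes with it, e.g. `(wrap-expires static-routes 86400)` for a one-day lifetime.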
Another thing to keep in mind (note to self) is that using mod_proxy to forward traffic to a local Jetty server means that the "remote IP" you get from (.getRemoteAddr request) will always be 127.0.0.1. If you want the user's real remote IP, you have to look in the X-Forwarded-For header (easily accessed as (:x-forwarded-for headers) in Compojure). Given that Identicons are generated from a hash of an IP address, this has resulted in some screwed-up (wrongly identical) avatars for a bunch of people in posts for the past couple of days. Oops. Not much I can do to fix that now.
In other non-news, I just set up spam logging for the blog so I can see the kinds of things bots are doing to get around my feeble anti-spam measures. Sadly, the spam seems to have stopped entirely right after I set this up. How annoying.