Markdown

How do you write a parser in a functional language like Clojure? (That's a rhetorical question.) There are parser libraries for Haskell I could use as a reference, but they're still a bit over my head at this point.

The original Markdown.pl parser in Perl uses global hashes and regex-mangles strings directly. I could actually duplicate this exactly in Clojure, because Clojure isn't purely functional. But I'm trying to do it in a more functional way, and so far it's working out OK.
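
To give a rough idea of what "more functional" means here, this is a minimal sketch of reference-style link handling in which the table of link definitions is returned from one function and passed into the next, instead of living in a global hash. It is written in Python only to show the shape. The function names and the simplified regexes are hypothetical, it does no HTML escaping, and it is not the actual parser or a faithful copy of Markdown.pl's rules.

```python
import re

# Simplified patterns, assumed for illustration only.
LINK_DEF = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$", re.MULTILINE)
LINK_REF = re.compile(r"\[([^\]]+)\]\[([^\]]*)\]")


def collect_link_defs(text):
    """Return (text without definition lines, {id: url}); no globals touched."""
    urls = {m.group(1).lower(): m.group(2) for m in LINK_DEF.finditer(text)}
    stripped = LINK_DEF.sub("", text)
    return stripped, urls


def resolve_links(text, urls):
    """Replace [label][id] with an anchor tag, using the table passed in."""
    def render(m):
        label = m.group(1)
        ref = (m.group(2) or label).lower()  # [label][] falls back to the label
        url = urls.get(ref)
        if url is None:
            return m.group(0)  # unknown reference: leave the text untouched
        return '<a href="%s">%s</a>' % (url, label)
    return LINK_REF.sub(render, text)


def render_links(text):
    """Compose the two passes; the link table is threaded through explicitly."""
    stripped, urls = collect_link_defs(text)
    return resolve_links(stripped, urls)
```

Each pass takes its input and returns its output, so any pass can be tested on its own, and the same idea carries over to Clojure with an immutable map in place of the dict.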

One of the bad things about Markdown is that the implementations in various other languages all give slightly different results, perhaps because the original was written in Perl as a bunch of regex replacements on a string and not as a real parser with a proper grammar. The differences are big enough that someone wrote a website just to compare implementations of Markdown against each other. So in writing a parser in Clojure, I have to decide which behavior I want to duplicate. Markdown.pl handles some things less than ideally, but I think I have to err on the side of replicating the original. One implementation, Pandoc, claims to be "more accurate" than Markdown.pl, but Pandoc seems to purposely break from things that are laid out explicitly in the specification, which is bad.

January 31, 2009 @ 3:27 AM PST
Category: Programming

3 Comments

Foob
Quoth Foob on February 01, 2009 @ 1:26 PM PST

There are still a few longstanding problems with the original Markdown that are keeping it from greater success. For one thing, there are no tables or definition lists. For the others, you might search the archives of the Markdown.pl mailing list.

These issues are why some MD implementations have taken it upon themselves to add syntax for the missing pieces. Pandoc's additions are particularly sensible, and the project itself seems to be first rate.

Ivar Refsdal
Quoth Ivar Refsdal on February 02, 2009 @ 1:28 AM PST

I don't know much about Java+Clojure, but you could give http://www.antlr.org/ a go.

Brian
Quoth Brian on February 02, 2009 @ 4:31 AM PST

Foob: I just read through the mailing list. Yeah, that looks like a horrible mess. Pandoc might be the way to go.

Ivar: That actually looks promising, thanks.
