5 Posts Tagged 'Markdown'

Footnotes

Did you ever notice how footnotes make your writing seem more important[1] somehow?

Maybe one reason is that "real" books use footnotes. At a glance, it looks like I have references[2] backing up everything I say. In reality, I don't, but the connotation carries through somehow[3]. Now my blog seems scholarly and authoritative.

And if you're like me, you can't resist clicking footnotes to see what they refer to[4]. According to my estimates, by utilizing footnotes, in one fell swoop I have decreased my readers' average reading efficiency by 73%.

In any case, I've added experimental, rudimentary support for footnotes to cow-blog.

I'm loosely copying the syntax from Markdown Extra for this. Markdown is great, except when it isn't. The standard doesn't have support for some useful extensions. I use Showdown for Markdown support, and I'm probably going to work on adding more features of Markdown Extra to Showdown in the near future.

I just dread actually doing it. Showdown (like Markdown itself) is implemented as a series of hackish regex transformations of blobs of text. It's not a proper grammar. Implementing more of Markdown Extra means more regex blobbing. It's brittle and fragile and even getting incomplete support for footnotes was less than enjoyable. But at the same time I find myself wanting to do things that Markdown can't do, so I may have to bite the bullet.
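To give a flavor of what "regex blobbing" looks like, here is a minimal JavaScript sketch of a footnote pass in the Markdown Extra style (`[^label]` references, `[^label]: text` definitions). It is not Showdown's code or cow-blog's: the function name `renderFootnotes` and the generated markup are assumptions for illustration, and it ignores multi-line definitions, HTML escaping of the note text, and footnote syntax inside code blocks.

```javascript
// Hypothetical sketch of a regex-driven footnote pass; not Showdown's actual code.
function renderFootnotes(text) {
  const notes = new Map(); // label -> footnote text

  // Pass 1: pull out one-line definitions such as "[^1]: Some note."
  // and remove them from the body.
  let body = text.replace(/^\[\^([^\]]+)\]:[ \t]*(.+)$/gm, (match, label, note) => {
    notes.set(label, note.trim());
    return "";
  });

  // Pass 2: replace each reference "[^1]" with a numbered superscript link.
  const order = []; // labels in order of first reference
  body = body.replace(/\[\^([^\]]+)\]/g, (match, label) => {
    if (!notes.has(label)) return match; // leave unknown references alone
    if (!order.includes(label)) order.push(label);
    const n = order.indexOf(label) + 1;
    return `<sup><a href="#fn-${n}">${n}</a></sup>`;
  });

  // Pass 3: append the referenced notes as an ordered list.
  if (order.length === 0) return body.trim();
  const items = order
    .map((label, i) => `<li id="fn-${i + 1}">${notes.get(label)}</li>`)
    .join("\n");
  return `${body.trim()}\n<ol class="footnotes">\n${items}\n</ol>`;
}
```

Even this toy version needs three separate passes over the text, and each pass can be fooled by input the regexes did not anticipate, which is the brittleness described above.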

(If there's a Showdown Extra out there already, drop me a URL. It'd be most appreciated. But I couldn't find one.)

  1. In reality nothing I say is important.

  2. Does my inner dialog count as a reference?

  3. Via telepathy.

  4. See?

July 27, 2010 @ 12:30 PM PDT
Category: Programming

Vim vs. Emacs: Indenting text before copying

I use Markdown on my blog for posts and comments, and I post at other sites that use Markdown (e.g. Stack Overflow). In Markdown, text indented four spaces is displayed as code, in pre tags.

I find myself often writing code in Vim or Emacs and needing to copy/paste it into a browser in a Markdown-suitable way. That means indenting it four spaces, copying it, and then un-indenting it back. This is easy to do in Vim and Emacs, only a few keystrokes. But "a few" is still greater than "one", so the heck with that. Let's script it.
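Stripped of any editor, the round trip is just two line-anchored substitutions. The sketch below shows them in JavaScript; the function names are made up for illustration, and the clipboard step in between is left out because it depends on the environment.

```javascript
// Indent every line by four spaces so Markdown renders the text as code.
// Note: a trailing newline yields a final line consisting only of the indent.
function indentForMarkdown(text) {
  return text.replace(/^/gm, "    ");
}

// Reverse the operation: strip exactly four leading spaces where present.
function unindentFromMarkdown(text) {
  return text.replace(/^ {4}/gm, "");
}
```

For example, `indentForMarkdown("a\n  b")` produces `"    a\n      b"`, and passing that result to `unindentFromMarkdown` gives back the original text.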

Vim version

This keymapping in Vim will do it all for me:

vmap <Leader>y :s/^/    /<CR>gv"+ygv:s/^    //<CR>

One clumsy thing about Vim is needing to restore the previous visual selection after each regex-replacement. I could use the marks '< and '> as ranges to :s instead, but that's more typing than simply doing gv in the mapping. Copying to the system clipboard is easy because Vim has a register "+ for that purpose.

This took me maybe 45 seconds to write, probably due to being pretty familiar with Vim already. But in Vim, mappings are easy. You just type the characters that you'd type if you were doing it manually.

Emacs version

Trying to do the same in Emacs was painful. My Emacs-fu is sorely inadequate, compared to my Vim-jitsu. This seems to work, but ugh:

;; adapted from http://www.emacswiki.org/emacs/.emacs-ChristianRovner.el
(defun expand-region-linewise ()
  (interactive)
  (let ((start (region-beginning))
        (end (region-end)))
   (goto-char start)
   (beginning-of-line)
   (set-mark (point))
   (goto-char end)
   (unless (bolp) (end-of-line))))

(defun markdown-copy ()
  (interactive)
  (save-window-excursion
   (save-excursion
     (save-restriction
       (expand-region-linewise)
       (narrow-to-region (region-beginning) (region-end))
       (goto-char (point-min))
       (replace-regexp "^" "    ")
       (clipboard-kill-ring-save (point-min) (point-max))
       (goto-char (point-min))
       (replace-regexp "^    " "")))))

Writing this involved a long journey through the Emacs documentation.

One difficulty was getting Emacs to play friendly with my fat-fingered region-marking. I don't always highlight from the beginning of the first line to the end of the last. That's why Vim's visual-line mode is awesome; the cursor can be anywhere on the line and it still selects the whole line. The handy function above (found on the Emacs wiki) takes care of that. I don't know how long it would've taken me to come up with that on my own.

Then it was a matter of rooting through millions of Emacs functions until I found the ones that move the point around and copy text to the clipboard. Along the way I discovered the wonders of "narrowing", which limits Emacs to work on some region of text, and all those macros to undo the messes I make while moving around.

Maybe I could've done this with an Emacs keyboard macro, and then called apply-macro-to-region-lines. And maybe I could use append-next-kill to build up the indented text one line at a time. But my efforts to do this or anything like it failed horribly.

In any case I thought it was an interesting comparison. Improvements to either version are welcome.

EDIT: This works too (thanks Holger Durer):

(defun markdown-copy ()
  (interactive)
  (save-excursion
    (expand-region-linewise)
    (indent-rigidly (region-beginning) (region-end) 4)
    (clipboard-kill-ring-save (region-beginning) (region-end))
    (indent-rigidly (region-beginning) (region-end) -4)))

May 13, 2010 @ 4:21 PM PDT
Category: Programming

Clojure and Markdown (and Javascript and Java and...)

Writing up a blog replacement for Wordpress (in Clojure) is coming along nicely. Clojure + Compojure are awesome. Most fun I've had making a website in a long while.

One problem I've run across is that I want to use Markdown for both post content and visitor commenting. I like Stack Overflow's live Javascript previews, so you can type text in Markdown and see what it'll look like as HTML, as you type it.

As I mentioned before, Markdown is very nice because it (partly) solves one longstanding issue I've had with programming blogs (including my own), namely the proper escaping of HTML and the proper formatting of source code. In Markdown you just put code in backquotes or indent it four spaces and there you go, properly escaped. Markdown is also easy to type and read, which is a plus. I hate hate hate writing HTML by hand.
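What "properly escaped" amounts to can be sketched in a few lines of JavaScript. This is an illustration of the idea rather than any particular Markdown library's code, and the function names are assumptions.

```javascript
// Escape the characters that HTML treats specially. "&" must go first so
// the entities produced by the later replacements are not escaped again.
function escapeHtml(code) {
  return code
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap an already-extracted code block in pre/code tags.
function renderCodeBlock(code) {
  return `<pre><code>${escapeHtml(code)}</code></pre>`;
}
```

Given the input `if (a < b && c > d) {}`, `renderCodeBlock` returns `<pre><code>if (a &lt; b &amp;&amp; c &gt; d) {}</code></pre>`, which a browser displays as the original source text.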

Anyways, there is no Markdown parser for Clojure, so I was going to write one. (There is MarkdownJ but it has unresolved issues.) The problem with writing my own Markdown parser in Clojure is that Markdown is not a well-specified language. There is no "official" grammar, just an informal "Here's how it works" description and a really ugly reference implementation in Perl. Most implementations (with the exception of peg-markdown and pandoc and friends) are implemented as a bunch of global regex-replacements passed repeatedly over some text.

The result is that there are a lot of Markdown parsers in a lot of languages, and they all give slightly different results in a lot of corner cases (and a lot of not-so-corner cases). The best I could do in Clojure is pick one implementation and try my best to match it.

Now, for each blog post, the server needs to store both the Markdown text and the HTML text. It needs the Markdown because if someone wants to edit content later, they need to edit the raw Markdown. It needs the HTML so that it can be cached and served to people viewing the website, obviously.

But a consequence of the above mess is that if you use a Javascript Markdown library (e.g. Showdown) to show a live preview, and then use a different Markdown library (my own or any other) to do server-side parsing of the text after it's POSTed, there's a good chance that the preview isn't going to match the real output.

One non-solution to this is to do all the parsing client-side, and POST both the Markdown and the post-Markdown HTML to the server so both can be stored, so no server-side parsing is necessary. Aside from being a horrid idea, it's a huge security risk because it leaves open the possibility of someone POSTing some clean Markdown along with some evil, un-matching HTML.
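The safe alternative is for the server to treat the submitted Markdown as the only input and derive the HTML itself. A minimal JavaScript sketch of that rule follows; `buildPostRecord`, the `form` field names, and the `render` parameter are all hypothetical stand-ins, not part of any real framework.

```javascript
// Hypothetical server-side helper. `form` is the parsed POST body and
// `render` stands in for whatever Markdown converter the server runs.
function buildPostRecord(form, render) {
  const markdown = String(form.markdown || "");
  // Deliberately ignore form.html: the stored HTML is always derived from
  // the Markdown on the server, so a client cannot smuggle in unmatched markup.
  return { markdown, html: render(markdown) };
}
```

Whatever HTML the client sends along is simply discarded, so the stored pair can never disagree.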

Another non-solution is to do all the parsing server-side and use AJAX to send the preview back to the client. That wouldn't be nearly as smooth or responsive as I want; on Stack Overflow for example the preview updates instantly after every keyup event in the textarea.

The ideal solution is to use the same Javascript library client-side for previews, and server-side for parsing the text. Then the preview and content have a very high chance of matching. This requires some way to run Javascript on the server. Thanks to Clojure and Java and Rhino, this turns out to be trivial.

(ns bcc.markdown
  (:import (org.mozilla.javascript Context ScriptableObject)))

(defn markdown-to-html [txt]
  ;; Enter a Rhino context on this thread; it is exited in the finally below.
  (let [cx (Context/enter)
        scope (.initStandardObjects cx)
        input (Context/javaToJS txt scope)
        ;; Load Showdown, then convert whatever is in the JS variable `input`.
        script (str (slurp "showdown.js")
                    "new Showdown.converter().makeHtml(input);")]
    (try
     ;; Expose the Markdown text to the script as `input`.
     (ScriptableObject/putProperty scope "input" input)
     (let [result (.evaluateString cx scope script "<cmd>" 1 nil)]
       (Context/toString result))
     (finally (Context/exit)))))

This also saves me from having to write a Markdown parser in Clojure, for which I am thankful.

Once again I'm also thankful we live in times when CPU cycles are cheap and abundant. I'm running a Markdown parser, in Javascript, in Java, via Clojure, and this still runs essentially instantly even for very large input strings. If I had any chance of my blog becoming famous and getting a million hits a day, it might matter, but in real life I'm set.

February 22, 2009 @ 2:12 PM PST
Category: Programming

Markdown

How do you write a parser in a functional language like Clojure? (That's a rhetorical question.) There are parser libraries for Haskell I could use as reference but they're still a bit over my head at this point.

The original Markdown.pl parser in Perl uses global hashes and regex-mangles strings directly. I could actually duplicate this exactly in Clojure, because Clojure isn't purely functional. But I'm trying to do it in a more functional way, and so far it's working out OK.

One of the bad things about Markdown is that all of the implementations in various other languages give slightly different results, perhaps because it was originally implemented in Perl as a bunch of regex-replacements on a string and not as a real parser with a proper grammar. So much so that someone wrote a website just to compare different implementations of Markdown against each other. So now, writing a parser in Clojure, I face the difficulty of deciding which behavior I want to duplicate. Markdown does some things less than ideally, but I think I have to err on the side of replicating the original. One implementation, Pandoc, claims to be "more accurate" than Markdown.pl, but Pandoc seems to purposefully break from things that are laid out explicitly in the specification, which is bad.

January 31, 2009 @ 11:27 AM PST
Category: Programming

Blog replacement fun

So I'm still thinking how I want to replace this blog. I still plan to write something from scratch, for fun's sake.

One thing I'm sure of is that I don't want to write HTML by hand, at all, under any circumstances. HTML and XML are not human-writable or human-readable languages. They rely too much on things human beings suck at, namely consistency and repetition. Forget a closing tag? Typo a tag name? Now your document is malformed. Undefined behavior, at best. It's too verbose, it has too much needless punctuation.

It's also too hard to manipulate it or do anything with it after you write it. There's XPath, which is itself a mess to work with, manipulating huge strings of crap via slightly smaller strings with its own funky syntax quirks. I've never found an XML-parsing library with an interface that I liked, and I've had to use them extensively in Perl, Ruby and Python.

So first thing, I'm going to convert all posts and comments into Markdown and use that for future posting and commenting. I like Markdown. It's hard to get wrong typing it by hand and it doesn't get in your way. It also doesn't tie you to one implementation; you can turn Markdown into HTML client-side via Javascript or easily parse it server-side. Or you can display it as-is and it's still readable. It's a very nice idea.

Second thing, I plan to use my programming language to write the HTML for the skeleton of my site for me. Opening, closing, and properly nesting tags is something a machine should do for me. Making sure my tags belong to a well-defined list of allowed valid HTML tags is something a machine should check for me.

More than likely I'm going to write this in Clojure, because s-expressions (and better yet, a combination of Clojure literal lists, maps and vectors) make writing HTML very easy and foolproof. I've also written an HTML-producing DSL in Ruby in the past though; it's not hard to do in any language.
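The idea carries over to any language with literal data structures. Here is a rough JavaScript sketch in which nested arrays play the role of s-expressions; it is not the Clojure or Ruby code mentioned above, and the names `html`, `escapeText` and `ALLOWED_TAGS` as well as the tiny tag list are assumptions for illustration. Void elements such as `br` are not handled.

```javascript
// Small, illustrative whitelist; a real one would cover the full HTML tag set.
const ALLOWED_TAGS = new Set(["div", "p", "a", "ul", "li", "h1", "span", "em"]);

function escapeText(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// A node is either a string (text) or an array: [tag, attrs?, ...children].
function html(node) {
  if (!Array.isArray(node)) return escapeText(node);

  const [tag, ...rest] = node;
  // The machine, not the author, checks that the tag is a known one.
  if (!ALLOWED_TAGS.has(tag)) throw new Error(`Unknown tag: ${tag}`);

  // An optional plain object in second position holds the attributes.
  const hasAttrs =
    rest.length > 0 &&
    typeof rest[0] === "object" &&
    rest[0] !== null &&
    !Array.isArray(rest[0]);
  const attrs = hasAttrs ? rest[0] : {};
  const children = hasAttrs ? rest.slice(1) : rest;

  const attrText = Object.entries(attrs)
    .map(([name, value]) => ` ${name}="${escapeText(value)}"`)
    .join("");

  // Opening and closing tags are generated together, so they cannot be mismatched.
  return `<${tag}${attrText}>${children.map((child) => html(child)).join("")}</${tag}>`;
}
```

Calling `html(["p", { class: "intro" }, "Hello ", ["em", "world"], " & co"])` returns `<p class="intro">Hello <em>world</em> &amp; co</p>`, while a typo such as `["pp", "oops"]` throws instead of silently producing a malformed document.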

Another thing I'm sure of is that I need a good anti-spam system but that I have no idea what that system should be. Akismet in Wordpress has caught 50,000(!) spam comments since I started my blog. Some spam still sneaks through on me now and then. I've never used a CAPTCHA and don't plan to; they just don't work. I'm probably going to come up with some funky custom anti-spam measures (which are invisible to users) and rely on the fact that no one is going to take the time to break it. My site isn't a huge or popular target, so here's hoping.

A third complication I'm dreading is how to do this without breaking every link anyone ever made to my site. Wordpress's permalink system is OK, but I'd like to change it. Problem is I can't change it; every link to my site from another site is a dependency. So I might have to mod_rewrite redirect the old URLs to new ones, or use two permalink schemes simultaneously. I don't know.
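Whichever layer ends up doing the redirect, the core of it is a mapping from old paths to new ones. The JavaScript sketch below shows the shape of such a mapping; both URL schemes in it are invented for illustration and are not this site's actual permalinks.

```javascript
// Both URL shapes are assumptions for illustration:
//   old: /2009/01/15/some-post-slug/    new: /blog/some-post-slug
const OLD_PERMALINK = /^\/(\d{4})\/(\d{2})\/(\d{2})\/([a-z0-9-]+)\/?$/;

// Returns the new path for an old-style permalink, or null when the path
// is not an old permalink and should be handled normally.
function redirectTarget(path) {
  const match = OLD_PERMALINK.exec(path);
  return match ? `/blog/${match[4]}` : null;
}
```

A server would answer any request where `redirectTarget` returns a path with a permanent (301) redirect to it, so existing inbound links keep working under the new scheme.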

Fun times ahead. How to design a blog is a problem lots of people have solved but no one has really solved perfectly, or else there wouldn't be so many frameworks and packages to do it. The good thing about writing your own from scratch is that it'll work exactly how you want. Wordpress is close but not close enough.

January 15, 2009 @ 9:55 PM PST
Category: Programming