I use WordPress for my blog. The posts themselves are stored in a database, and WordPress uses PHP to move data into and out of that database. Obviously the blog itself is displayed as HTML.
This means there are potentially three levels of escaping and un-escaping done to my text after I type it into this lovely TEXTAREA and before it's actually displayed. It's HTML-escaped in certain cases, like turning < into &lt; (I just typed a raw < character there, note, and it was HTML-escaped automatically); it's SQL-escaped so it doesn't break whatever INSERT command is putting it into the tables; it's PHP-escaped along the way, I'm sure. All my newlines are magically turned into <br> tags or <p> tags somewhere along the line too. Etc. etc.
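To make those layers concrete, here is a minimal Python sketch of three of the transformations: HTML-escaping, SQL-escaping, and turning newlines into tags. It is only an illustration and not WordPress's actual code (which is PHP); the function names `html_escape`, `sql_quote`, and `newlines_to_tags` are made up for this example, and the SQL escaping in particular is a toy.

```python
import html


def html_escape(text: str) -> str:
    # Replace characters that are special in HTML with entities,
    # e.g. "<" becomes "&lt;".
    return html.escape(text, quote=False)


def sql_quote(text: str) -> str:
    # Toy SQL string-literal escaping (hypothetical helper): put a backslash
    # in front of backslashes and single quotes, then wrap in quotes.
    # Real code should use parameterized queries instead.
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def newlines_to_tags(text: str) -> str:
    # Blank lines separate paragraphs; single newlines become <br> tags.
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return "\n".join(
        "<p>" + p.replace("\n", "<br>\n") + "</p>" for p in paragraphs
    )


post = "if a < b:\n    print('hi')\n\nDone."
stored = sql_quote(post)                          # what an INSERT might carry
displayed = newlines_to_tags(html_escape(post))   # what the browser might get
print(stored)
print(displayed)
```

Each step is harmless on its own; the trouble starts when one of them is applied twice, or undone once too often.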
And yet, if I type a backslash in my post, by the time it's fed through PHP, into the SQL table, fetched back out, and displayed in my blog, it will have vanished entirely. This is a problem when I post source code in my blog, for example, where backslashes are pretty common. Where along this long line of escaping and un-escaping of text is my poor backslash lost? I don't know.
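Purely as an illustration of how a backslash can go missing in a pipeline like this (it is a guess at the general mechanism, not a diagnosis of what WordPress does), here is a Python sketch. The helpers `add_slashes` and `strip_slashes` are hypothetical, loosely modeled on PHP's `addslashes` and `stripslashes`. If text is un-escaped one more time than it was escaped, every backslash in it is silently consumed.

```python
def add_slashes(text: str) -> str:
    # Put an escaping backslash in front of backslashes and quotes.
    out = []
    for ch in text:
        if ch in ("\\", "'", '"'):
            out.append("\\")
        out.append(ch)
    return "".join(out)


def strip_slashes(text: str) -> str:
    # Undo add_slashes: drop each escaping backslash and keep the
    # character that follows it.
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\":
            i += 1                  # drop the escaping backslash
            if i == len(text):
                break               # a lone trailing backslash just vanishes
        out.append(text[i])
        i += 1
    return "".join(out)


source = r'printf("a\n");'

# Balanced: escaped once, un-escaped once. The backslash survives.
print(strip_slashes(add_slashes(source)))   # printf("a\n");

# Unbalanced: un-escaped without having been escaped. The backslash is gone.
print(strip_slashes(source))                # printf("an");
```

With several layers each doing their own escaping, it only takes one mismatched pair for source code in a post to lose its backslashes.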
Only recently have I discovered that putting code into <code> tags will cause WordPress to do even more escaping of special characters than usual, so that my backslashes DO survive the round trip through the system. Little did I know that, unlike in <pre> tags, newlines embedded in <code> tags are happily translated into <p> tags just like everywhere else. This messes things up quite a bit if you set white-space: pre in your stylesheet for <code> tags, for example; you'll end up with lots of extraneous HTML tags everywhere. So it turns out I had to write a WordPress filter as a very last step in this huge mess, to undo the replacement of newlines with HTML that occurs inside