89 Posts Tagged 'Ruby'

Shuffle lines in Vim

In a pinch, I needed to randomize the order of a few thousand lines of plain text. In Linux you can just pipe the file through sort, even right inside Vim:

:%!sort -R

But I was stuck on Windows. And I don't know how to randomize a file in native Vim script. But doing it in Ruby is pretty easy, and luckily, Vim has awesome Ruby support. Ten minutes' work and a few peeks at :h ruby and we have a successful, working kludge:

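" Shuffle the lines in a range, e.g. :%call ShuffleLines()
" The "range" keyword makes Vim call this once for the whole range
" instead of once per line.
function! ShuffleLines() range
ruby << EOF
    buf = VIM::Buffer.current
    # to_i guards against Vim builds where VIM::evaluate returns strings
    firstnum = VIM::evaluate('a:firstline').to_i
    lastnum = VIM::evaluate('a:lastline').to_i
    # Collect the lines, shuffle them, then write them back in place
    lines = (firstnum..lastnum).map { |lnum| buf[lnum] }.shuffle
    firstnum.upto(lastnum) do |lnum|
      buf[lnum] = lines[lnum - firstnum]
    end
EOF
endfunction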
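
The same logic also works outside Vim. This is a minimal sketch in plain Ruby, not part of the kludge above: shuffle_range is a hypothetical helper, and its line numbers are 1-based and inclusive to mirror a:firstline and a:lastline.

```ruby
# Hypothetical helper: shuffle lines first..last (1-based, inclusive)
# and leave everything outside that range untouched.
def shuffle_range(lines, first, last, random: Random.new)
  result = lines.dup
  range = (first - 1)..(last - 1)
  result[range] = result[range].shuffle(random: random)
  result
end

lines = %w[alpha bravo charlie delta echo]
puts shuffle_range(lines, 2, 4)
```

Returning a copy instead of mutating the input keeps the original order around in case the shuffle has to be redone.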

2011-07-07 23:32 - Edited to remove a superfluous line.

2011-07-09 21:33 - Wrong parameter for sort, oops.

July 07, 2011 @ 7:07 AM PDT
Category: Programming
Tags: Ruby, Vim

Keyword Arguments: Ruby, Clojure, Common Lisp

And suddenly I return to blogging, rising from the ashes like some kind of zombie phoenix. Turns out writing a book is a good absorber of time, like some sort of heavy-duty temporal paper towel. Now that I've gotten the terrible similes out of my system, let's talk about keyword arguments, one of my favorite features in any language that supports them.

Ruby, Clojure, and Common Lisp are all languages I enjoy to some degree, and they all have keyword arguments. Let's explore how keyword args differ in those languages.
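
As a small taste of the Ruby side, here is a sketch of two styles. The trailing options hash was the usual idiom around the time of this post; true keyword arguments only arrived later, in Ruby 2.0. Both method names are made up for illustration.

```ruby
# Older idiom: a trailing options hash, with defaults merged in by hand.
def connect_with_hash(host, opts = {})
  opts = { port: 80, ssl: false }.merge(opts)
  "#{opts[:ssl] ? 'https' : 'http'}://#{host}:#{opts[:port]}"
end

# Ruby 2.0+: real keyword arguments with defaults.
# An unknown keyword raises ArgumentError instead of being silently ignored.
def connect_with_keywords(host, port: 80, ssl: false)
  "#{ssl ? 'https' : 'http'}://#{host}:#{port}"
end

puts connect_with_hash("example.com", port: 8080)
puts connect_with_keywords("example.com", ssl: true)
```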

June 24, 2011 @ 10:22 AM PDT
Category: Programming
Tags: Lisp, Ruby, Clojure

Vim :ruby and :rubydo scope

Note to self. In old Vim (tested in 7.2.320), I could do this:

:ruby x='foo'
:rubydo $_=x

Now every line in the file says foo. But in Vim 7.3 I get an error:

NameError: undefined local variable or method `x' for main:Object

The scoping rules for Ruby in Vim must have changed somewhere along the line. I was abusing this feature to do some handy things, so this is sad.

A workaround is to use global variables in Ruby instead. So this still works:

:ruby $x='foo'
:rubydo $_=$x

Phew.
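
One way to picture the difference, as an analogy only (this is not a claim about how Vim's Ruby interface is implemented): if each chunk of code is evaluated in its own fresh scope, a local variable dies with the chunk, while a global survives. The run_chunk helper is hypothetical.

```ruby
# Hypothetical stand-in for "run one :ruby command".
# Each call evaluates the code in a fresh method-level binding.
def run_chunk(code)
  eval(code, binding)
end

run_chunk("x = 'foo'")
begin
  run_chunk("x")            # the local from the previous chunk is gone
rescue NameError => e
  puts "NameError: #{e.message}"
end

run_chunk("$x = 'foo'")
puts run_chunk("$x")        # globals are shared across chunks
```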

August 31, 2010 @ 3:40 AM PDT
Category: Programming
Tags: Ruby, Vim

Clojure, from a Ruby perspective

Fogus' recent article "clojure.rb" speculates about why there seem to be so many Ruby users adopting Clojure. As a Ruby user who adopted Clojure, I figured I'd write about my experiences.

What do Ruby and Clojure have in common, that would attract a Rubyist to Clojure? A lot. Obviously, this is somewhat subjective and I don't expect anyone else to agree, but this is what did it for me.

June 09, 2010 @ 6:22 AM PDT
Category: Programming
Tags: Ruby, Clojure

Getting list of referers out of Apache logs

I use Google Analytics, but it has a noticeable lag in updating its information. When my site is being hammered, I'd like to see where all the traffic is coming from. It'd also be nice to see how many hits my RSS feed is getting, and how many images and static files are being direct-linked, which Google Analytics currently isn't tracking for me at all.

So this script will look in my Apache logs and print referers for some URL, thanks to ApacheLogRegex:

#!/usr/bin/ruby

require 'apachelogregex'

raise "USAGE: #{$0} log_filename desired_url" unless ARGV[0] and ARGV[1]

format = '%v:%p %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"'
parser = ApacheLogRegex.new(format)
pat = Regexp.new(ARGV[1])
refs = Hash.new(0)  # referer => hit count, defaulting to 0

File.foreach(ARGV[0]) do |line|
  x = parser.parse(line)
  next unless x  # skip any line the parser could not handle
  if pat.match(x["%r"])
    refs[x["%{Referer}i"]] += 1
  end
end
refs.sort_by{|k,v| -v}.each do |ref,count|
  puts "%s: %s" % [count,ref]
end
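
If installing the gem is a hassle, the counting can be sketched with a plain regex instead. This assumes every line follows exactly the format string used above (vhost:port, host, two dash-like fields, bracketed timestamp, quoted request, status, size, quoted referer, quoted user agent) and that quoted fields contain no escaped quotes. LOG_LINE and count_referers are made-up names.

```ruby
# Assumed layout: vhost:port host ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = /\A\S+ \S+ \S+ \S+ \[[^\]]*\] "([^"]*)" \d{3} \S+ "([^"]*)" "[^"]*"/

# Count referers for requests matching url_pattern, most frequent first.
def count_referers(lines, url_pattern)
  counts = Hash.new(0)
  lines.each do |line|
    m = LOG_LINE.match(line)
    next unless m                      # ignore lines in some other format
    request, referer = m[1], m[2]
    counts[referer] += 1 if url_pattern.match(request)
  end
  counts.sort_by { |_referer, count| -count }
end

sample = [
  'example.com:80 1.2.3.4 - - [21/Feb/2010:04:46:00 -0800] "GET /feed HTTP/1.1" 200 512 "http://a.example/" "Mozilla/5.0"',
  'example.com:80 1.2.3.5 - - [21/Feb/2010:04:46:01 -0800] "GET /feed HTTP/1.1" 200 512 "-" "curl/7.0"'
]
count_referers(sample, %r{/feed}).each do |referer, count|
  puts "%s: %s" % [count, referer]
end
```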

I used to use awstats for this, but it was too heavyweight and a hassle to set up and keep running. Google Analytics is a no-brainer to use, even though the accuracy isn't as good as parsing Apache logs. At least I get an idea of which of my blatherings people are most interested in.

February 21, 2010 @ 4:46 AM PST
Category: Programming

Clojure ORM-ish stuff

Suppose I have this:

user> (def foo [{:id 1 :foo 123} {:id 2 :foo 456}])
#'user/foo
user> (def bar [{:foo_id 1 :bar 111} {:foo_id 1 :bar 222}])
#'user/bar

What I want is to "join" foo and bar so that each item in foo ends up with a sub-list of bars based on matching key fields.

In real life, these lists-of-hash-maps are coming out of a database via clojure.contrib.sql, so this is something I actually want to do pretty often. This is also vaguely similar to what you get out of a Rails-like ORM, where you end up with an object that has lists of sub-objects anywhere you have a one-to-many relationship.

Here's how I end up doing this in Clojure:

(defn one-to-many
  ([xs name ys f]
      (for [x xs :let [ys (filter (partial f x) ys)]]
        (assoc x name ys))))

Now I can do this:

user> (pprint (one-to-many foo :bars bar #(= (:id %1) (:foo_id %2))))
({:bars ({:foo_id 1, :bar 111} {:foo_id 1, :bar 222}), :id 1, :foo 123}
 {:bars (), :id 2, :foo 456})

And if I define a helper function:

(defn key=
  ([xkey ykey]
     #(= (xkey %1) (ykey %2))))

Then I can write it more concisely:

user> (pprint (one-to-many foo :bars bar (key= :id :foo_id)))
;; same as above

And if I have another "table" of data like this:

user> (def baz [{:foo_id 1 :baz 555} {:foo_id 2 :baz 999}])
#'user/baz

Then I can join them all like this:

user> (pprint (-> foo
                  (one-to-many :bars bar (key= :id :foo_id))
                  (one-to-many :bazzes baz (key= :id :foo_id))))
({:bazzes ({:foo_id 1, :baz 555}),
  :bars ({:foo_id 1, :bar 111} {:foo_id 1, :bar 222}),
  :id 1,
  :foo 123}
 {:bazzes ({:foo_id 2, :baz 999}), :bars (), :id 2, :foo 456})
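
For comparison with the Rails world, roughly the same join can be sketched in plain Ruby over arrays of hashes. This is a loose translation, not a drop-in equivalent; one_to_many and key_eq are hypothetical names mirroring the Clojure functions.

```ruby
# Attach to each x, under `name`, the ys for which matches.call(x, y) is true.
def one_to_many(xs, name, ys, matches)
  xs.map { |x| x.merge(name => ys.select { |y| matches.call(x, y) }) }
end

# Build a predicate that compares x[xkey] with y[ykey].
def key_eq(xkey, ykey)
  ->(x, y) { x[xkey] == y[ykey] }
end

foo = [{ id: 1, foo: 123 }, { id: 2, foo: 456 }]
bar = [{ foo_id: 1, bar: 111 }, { foo_id: 1, bar: 222 }]
baz = [{ foo_id: 1, baz: 555 }, { foo_id: 2, baz: 999 }]

joined = one_to_many(foo, :bars, bar, key_eq(:id, :foo_id))
joined = one_to_many(joined, :bazzes, baz, key_eq(:id, :foo_id))
p joined
```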

This is pretty concise. It may be possible to do it in an even more concise way (if so, do share). If I were willing to adhere to some Rails-y naming convention for my table names and for the id fields in my tables, I could make this shorter by not having to specify the names of the id fields, but I don't want to go there. It's trivial to write similar functions for a one-to-one relationship, or to use a join-table to "join" two tables with a many-to-many relationship.
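
A many-to-many version through a join table might look like the following sketch, written in Ruby over arrays of hashes. The many_to_many name, its argument order, and the sample posts/tags data are all assumptions made for illustration.

```ruby
# For each x, look up its rows in the join table, then attach the matching ys.
def many_to_many(xs, name, ys, links, x_key, link_x, link_y, y_key)
  xs.map do |x|
    wanted = links.select { |l| l[link_x] == x[x_key] }.map { |l| l[link_y] }
    x.merge(name => ys.select { |y| wanted.include?(y[y_key]) })
  end
end

posts = [{ id: 1, title: "Shuffle" }, { id: 2, title: "Scope" }]
tags = [{ id: 10, tag: "Ruby" }, { id: 11, tag: "Vim" }, { id: 12, tag: "Lisp" }]
posts_tags = [{ post_id: 1, tag_id: 10 }, { post_id: 1, tag_id: 11 }, { post_id: 2, tag_id: 10 }]

p many_to_many(posts, :tags, tags, posts_tags, :id, :post_id, :tag_id, :id)
```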

I am happily surprised sometimes by how simple it is to roll my own version of things that previously seemed like dark magic. I used Rails for a long time and it seemed like a crapload of code must have gone into making the ORM work. But four lines of code gets me 75% of what I ever needed Rails' ORM for.

This may be more thanks to me opening my eyes a bit than to Clojure being awesome, but either way, I'll take it.

November 03, 2009 @ 10:21 AM PST
Category: Programming

Happy 2nd Birthday Clojure

Clojure is two years old and it's looking good. Clojure development has been a bit quiet lately but that's because lots of big changes are apparently being worked on behind the scenes, for example rewriting Clojure in Clojure (and enhancing the Java-Clojure interop along the way, to help make this more possible). Meanwhile clojure-contrib continues to grow and the community continues to be vibrant.

I've been putting Clojure to good use at work in data munging and reporting. I've got data in an MS SQL Server database on one box (not by choice, I assure you) and a MySQL database on another box, and then there's a bunch of data files in a wide variety of other formats floating around the network. I use Clojure to query and compare it all.

Thanks to JDBC and clojure.contrib.sql it's easy to slurp data from a DB into a bunch of Clojure hash-maps. Thanks to clojure.contrib.duck-streams and Clojure's good regex support, the same is true of data files in general.

In the past I'd have written some enormous SQL queries to generate reports, but it's so much easier to use Clojure's wide array of sequence-manipulation functions to manipulate hash-maps. Doing what I want is rarely more difficult than mapping some transformation over the data, filtering out the data I want, and then formatting it nicely (which is easy thanks to Common Lisp-style formatting).

And once I notice patterns in how I'm using those things, I write a few functions and macros to make it more concise. Concision is one area where Lisps cannot be beaten (short of APL). For example, give me all data from the mysql db "data4" which is collected at "Site1" and happened before 2008, and group it by the person who collected it:

user> (group-by :collector 
                (filter #(and (= (:site %) "Site1")
                              (date-before? (:date %) 
                                            (date-from-string "2008-01-01")))
                        (mysql-data :data4)))

date-from-string isn't a standard function but here's how easy it is to write it, thanks to the JVM:

(defn date-from-string [s]
  (.parse (java.text.SimpleDateFormat. "yyyy-MM-dd") s))
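
For readers coming from Ruby, the same filter-then-group query can be sketched as below. The sample rows are invented stand-ins for whatever the database returns, and collected_before is a hypothetical helper name.

```ruby
require 'date'

# %Y-%m-%d means four-digit year, month number, day of month.
def date_from_string(s)
  Date.strptime(s, "%Y-%m-%d")
end

# Keep rows from `site` dated before `cutoff`, grouped by who collected them.
def collected_before(rows, site, cutoff)
  rows
    .select { |r| r[:site] == site && date_from_string(r[:date]) < cutoff }
    .group_by { |r| r[:collector] }
end

# Invented sample data for illustration only.
rows = [
  { collector: "ann", site: "Site1", date: "2007-06-15" },
  { collector: "bob", site: "Site1", date: "2008-03-01" },
  { collector: "ann", site: "Site2", date: "2006-01-10" },
  { collector: "bob", site: "Site1", date: "2007-12-31" }
]

p collected_before(rows, "Site1", date_from_string("2008-01-01"))
```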

That's pretty much how my data-querying looks. To some that probably looks terrifying, but thanks to Emacs+Paredit it takes only a few handy keystrokes to type, auto-complete, manipulate, and automatically pretty-format such things, and thanks to a suitably large dose of Lisp brand Kool-Aid I find it very natural and comfortable to read at this point.

Then I can (csv ...) it or (plaintext-table ...) it or whatever. I replaced thousands of lines of Ruby and SQL queries with a few hundred lines of Clojure this way, and the Clojure version does more and does it better.

One of the things I like about Clojure is that it's such a small language, you can be reasonably sure you know the whole language (or at least have some passing familiarity with all parts of it) once you've read the docs and the API a few times. This is in contrast to languages like Common Lisp for example where the standard is thick enough to be considered a deadly weapon. Java lurks underneath but you can get away with ignoring it almost entirely, until those rare times you need it.

Alan Perlis said:

It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures

This is true, but Clojure takes it further: It's even better to have a lot of functions that act on one abstraction of a bunch of data structures. Clojure gives you a bunch of data structures that can all be accessed under the same seq abstraction, and a bunch of functions that work on that abstraction, and the end result is more than the sum of the parts, because everything is interchangeable in lovely ways. I bounce data between sets and hash-maps and vectors without even thinking about it half the time.
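
Ruby has a loosely comparable idea in Enumerable, which makes a handy illustration for anyone who hasn't met seqs yet. This is an analogy rather than a description of how Clojure's seq abstraction works; squares_of_evens is a made-up example function.

```ruby
require 'set'

# Written once against Enumerable, so it accepts any collection that includes it.
def squares_of_evens(collection)
  collection.select(&:even?).map { |n| n * n }
end

p squares_of_evens([1, 2, 3, 4])      # Array
p squares_of_evens(Set[1, 2, 3, 4])   # Set
p squares_of_evens(1..4)              # Range
```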

That's just one reason of many that I love Clojure nowadays. I get crap done with it, quickly and easily and with surprising amounts of fun. Thanks Clojure and Clojure devs for making my life better.

October 17, 2009 @ 4:24 PM PDT
Category: Programming
Tags: Ruby, Clojure

Now I have two problems

I'm converting one of my websites from Ruby on Rails to Clojure in my spare time. I stupidly put a bunch of RoR-style links inline into certain bits of plaintext content, so in my DB there are a bunch of text fields with <%= link_to ... %> in the middle.

It was easy to fix with a regex though:

(defn clean [txt]
  (re-gsub #"<%=\s*link_to\s+(\"[^\"]+\"|'[^']+')\s*(?:,\s*'([^']+)'\s*)?(?:,\s*image_path\(['\"]([^'\"]+)['\"]\)\s*)?(?:,\s*:controller\s*=>\s*(?::(\S+)|['\"]([^\"']+)['\"])\s*)?(?:,\s*:action\s*=>\s*(?::(\S+)|['\"]([^\"']+)['\"])\s*)?(?:,\s*:id\s*=>\s*(?:(\d+)|:(\S+)|['\"]([^\"']+)['\"])\s*)?\s*%>"
           (fn [[_ s & parts]] (let [href (str-join "/" (filter identity parts))]
                           (str "<a href=\"/" href "\">" (re-gsub #"^[\"']|[\"']$" "" s) "</a>")))
           txt))

And by easy, I mean not easy.

Note to self, try something other than a regex next time.
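
One possible "something other" is to split the job in two: a small regex that only finds the tag, then separate small lookups for each option. The Ruby sketch below covers just the quoted label plus the :controller, :action and :id options, not the path-string or image_path forms the regex above also handles. TAG and link_option are hypothetical names.

```ruby
# Step 1: find the tag and capture its raw argument list.
TAG = /<%=\s*link_to\s+(.*?)\s*%>/m

# Step 2: pull one option (:key => :sym, 'string' or 123) out of the arguments.
def link_option(args, key)
  m = args.match(/:#{key}\s*=>\s*(?::(\w+)|["']([^"']+)["']|(\d+))/)
  m && m.captures.compact.first
end

def clean(text)
  text.gsub(TAG) do
    args = Regexp.last_match(1)
    label = args[/\A["']([^"']+)["']/, 1]
    parts = %w[controller action id].map { |key| link_option(args, key) }.compact
    %(<a href="/#{parts.join('/')}">#{label}</a>)
  end
end

puts clean(%q{See <%= link_to "my post", :controller => 'blog', :action => :show, :id => 42 %> now.})
```

Each piece is short enough to test on its own, which is most of the point.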

Note to self, don't bury some framework's funky-syntax DSL in the middle of plaintext content. Next time use HTML or do the conversion from DSL to HTML early rather than late.

Silly how two years ago I thought I'd be using Ruby for that site forever.

September 25, 2009 @ 4:02 PM PDT
Category: Programming

Plasma + Ruby = Ouch

I wrote my first KDE4 plasmoid the other day. I can't release it because it's essentially a clone of something you aren't allowed to copy (maybe I can replace him with a penguin and release it that way though).

But I need to rewrite it first anyways, because I did it using the Ruby bindings for Qt4 and Plasma, and wow it's painful. It has a 50/50 shot of even initializing at any given point. When it does initialize, it has about a 1 in 8 chance of immediately crashing Plasma. And some things I just can't get to work at all, e.g. setting a default size or resizing the applet programmatically; X-Plasma-DefaultSize in the metadata is supposed to do it but it does nothing. And it's not just my system (using KDE 4.3), because I tried it on a Kubuntu machine using stable KDE 4.2 and had the same problems.

The other snag is that the documentation of the Plasma API is buried so deep on the KDE site that I don't even know how I found it. Here it is for those who care (and for my own future reference). I hit lots of dead links on the KDE site on the way there.

Next step is to rewrite the plasmoid in Python or C++ I guess.

September 08, 2009 @ 4:34 PM PDT
Category: Programming
Tags: KDE, Ruby, Plasma

Practicality: PHP vs. Lisp?

Eric at LispCast wrote an article about why PHP is so ridiculously dominant as a web language, when arguably more powerful languages like Common Lisp linger in obscurity.

I think the answer is pretty easy. In real life, practicality usually trumps everything else. Most programmers aren't paid to revolutionize the world of computer science. Most programmers are code monkeys, or to put it more nicely, they're craftsmen who build things that other people pay them to create. The code is a tool to help people do a job. The code is not an end in itself.

In real life, here's a typical situation. You have to make a website for your employer that collects survey data from various people out in the world, in a way that no current off-the-shelf program quite does correctly. If you could buy a program to do it that'd be ideal, but you can't find a good one, so you decide to write one from scratch. The data collection is time-sensitive and absolutely must start by X date. The interface is a web page, and people are going to pointy-clicky their way through, and type some numbers, that's it; the backend just doesn't matter. For your server, someone dug an old dusty desktop machine out of a closet and threw Linux on there for you and gave you an SSH account. Oh right, and this project isn't your only job. It's one of many things you're trying to juggle in a 40-hour work week.

One option is to write it in Common Lisp. You can start by going on a quest for a web server. Don't even think about mod_lisp, would be my advice, based on past experience. Hunchentoot is good, or you can pay a fortune for one of the commercial Lisps. If you want you could also look for a web framework; there are many to choose from, each more esoteric, poorly documented and nearly impossible to install than the last. Then you get to hunt for a Lisp implementation that actually runs those frameworks. Then you get to try to install it and all of your libraries on your Linux server, and on the Windows desktop machine you have to use as a workstation. Good luck.

Once you manage to get Emacs and SLIME going (I'm assuming you already know Emacs intimately, because if you don't, you already lose) you get to start writing your app. Collecting data and moving it around and putting it into a database and exporting it to various statistics packages is common, so you'd do well to start looking for some libraries to help you out with such things. In the Common Lisp world you're likely not to find what you need, or if you're lucky, you'll find what you need in the form of undocumented abandonware. So you can just fix or write those libraries yourself, because Lisp makes writing libraries from scratch easy! Not as easy as downloading one that's already been written and debugged and matured, but anyways. Then you can also roll your own method of deploying your app to your server and keeping it running 24/7, which isn't quite so easy. If you like, you can try explaining your hand-rolled system to the team of sysadmins in another department who keep your server machine running.

Don't bet on anyone in your office being able to help you with writing code, because no one knows Lisp. Might not want to mention to your boss that if you're run over by a bus tomorrow, it's going to be impossible to hire someone to replace you, because no one will be able to read what you wrote. When your boss asks why it's taking you so long, you can mention that the YAML parser you had to write from scratch to interact with a bunch of legacy stuff is super cool and a lovely piece of Lisp code, even if it did take you a week to write and debug given your other workload.

Be sure to wave to your deadline as it goes whooshing by. If you're a genius, maybe you managed to do all of the above and still had time to roll out a 5-layer-deep Domain Specific Language to solve all of your problems so well it brings tears to your eyes. But most of us aren't geniuses, especially on a tight deadline.

Another option is to use PHP. Apache is everywhere. MySQL is one simple apt-get away. PHP works with no effort. You can download a single-click-install WAMP stack nowadays. PHP libraries for everything are everywhere and free and mature because thousands of people already use them. The PHP official documentation is ridiculously thorough, with community participation at the bottom of every page. Google any question you can imagine and you come up with a million answers because the community is huge. Or walk down the hall and ask anyone who's ever done web programming.

The language is stupid, but stupid means easy to learn. You can learn PHP in a day or two if you're familiar with any other language. You can write PHP code in any editor or environment you want. Emacs? Vim? Notepad? nano? Who cares? Whatever floats your boat. Being a stupid language also means that everyone knows it. If you jump ship, your boss can throw together a "PHP coder wanted" ad and replace you in short order.

And what do you lose? You have to use a butt-ugly horrid language, but the price you pay in headaches and swallowed bile is more than offset by the practical gains. PHP is overly verbose and terribly inconsistent and lacks powerful methods of abstraction and proper closures and easy-to-use meta-programming goodness and Lisp-macro syntactic wonders; in that sense it's not a very powerful language. Your web framework in PHP probably isn't continuation-based, it probably doesn't compile your s-expression HTML tree into assembler code before rendering it.

But PHP is probably the most powerful language around for many jobs if you judge by the one and only measure that counts for many people: wall clock time from "Here, do this" to "Yay, I'm done, it's not the prettiest thing in the world but it works".

The above situation was one I experienced at work, and I did choose PHP right from the start, and I did get it done quickly, and it was apparently not too bad because everyone likes the website. No one witnessed the pain of writing all that PHP code, but that pain doesn't matter to anyone but the code monkey.

If I had to do it over again I might pick Ruby, but certainly never Lisp. I hate PHP more than almost anything (maybe with the exception of Java) but I still use it when it's called for. An old rusty wobbly-headed crooked-handled hammer is the best tool for the job if it's right next to you and you only need to pound in a couple of nails.

September 21, 2008 @ 6:17 PM PDT
Category: Programming
Tags: Lisp, PHP, Ruby, Rant