Getting a list of referers out of Apache logs

I use Google Analytics, but it has a noticeable lag in updating its information. When my site is being hammered, I'd like to see where all the traffic is coming from. It'd also be nice to see how many hits my RSS feed is getting, and how many images and static files are being direct-linked, which Google Analytics currently isn't tracking for me at all.

So this script will look in my Apache logs and print referers for some URL, thanks to ApacheLogRegex:


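require 'apachelogregex'

raise "USAGE: #{$0} log_filename desired_url" unless ARGV[0] and ARGV[1]

# Must match the LogFormat directive that produced the log file.
format = '%v:%p %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"'
parser = ApacheLogRegex.new(format)
pat = Regexp.new(ARGV[1])
refs = {}

File.readlines(ARGV[0]).each do |line|
  x = parser.parse(line)
  next unless x  # skip lines that don't match the format
  if pat.match(x["%r"])
    r = x["%{Referer}i"]
    refs[r] = (refs[r] || 0) + 1
  end
end

# Most frequent referers first.
refs.sort_by{|k,v| -v}.each do |ref,count|
  puts "%s: %s" % [count,ref]
end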
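
If installing the gem isn't an option, roughly the same tally can be sketched with a plain regular expression and nothing but the standard library. This is only a sketch and not the script above: `LOG_LINE` and `count_referers` are made-up names, and the pattern assumes the same format string (`%v:%p %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"`) with no escaped quotes inside the quoted fields.

```ruby
# Assumed pattern for the vhost-combined format used above. It will not
# cope with escaped quotes inside the request, referer, or user-agent.
LOG_LINE = /\A\S+ \S+ \S+ \S+ \[[^\]]+\] "(?<request>[^"]*)" \d{3} \S+ "(?<referer>[^"]*)" "[^"]*"/

# Hypothetical helper: count referers for requests whose request line
# matches url_pattern. Returns [referer, count] pairs, most frequent first.
def count_referers(lines, url_pattern)
  counts = Hash.new(0)
  lines.each do |line|
    m = LOG_LINE.match(line)
    next unless m                               # skip lines that don't fit the format
    next unless url_pattern.match(m[:request])  # only the URL we care about
    counts[m[:referer]] += 1
  end
  counts.sort_by { |_referer, count| -count }
end
```

Something like `count_referers(File.foreach("access.log"), /feed/)` would then give the referers for anything with "feed" in the request line, where `access.log` stands in for whatever your log file is called.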

I used to use awstats for this, but it was too heavyweight and a hassle to set up and keep running. Google Analytics is a no-brainer to use, even though the accuracy isn't as good as parsing Apache logs. At least I get an idea of which of my blatherings people are most interested in.

February 21, 2010 @ 4:46 AM PST
Category: Programming


Quoth Evaryont on February 23, 2010 @ 11:02 AM PST

I've just found your blog, and I'm sad I hadn't before. Great posts! More random blatherings about KDE, Ruby, and vim would be nice. :P

What do you do to keep yourself motivated / inspired to blog so much for so long? I'm trying to get a blog going myself, but I just...don't. I avoid blogging, which I'd personally rather not do.

Look at me blathering, yay!

I am so stealing your dotfiles, btw. I love that. Just finding & integrating & improving my dotfiles. Lovely hobby. School gets in the way, however.

Quoth Brian on February 23, 2010 @ 4:32 PM PST

I'm motivated to write for the same reason everyone seeks the company of other people. I don't know the reason. Probably genetic. Thanks for reading.