Ruby hashes with custom objects as keys

When you store things in a hash, you need a hash function to turn keys into numbers (or memory locations, or whatever), so you know which bucket each value goes in. This hash function is nicely defined for Fixnums (plain Integers since Ruby 2.4): two Fixnums with the same value always give the same hash value, which makes sense since Fixnums are immutable, so two objects with the same Fixnum value are pretty much the same object in every way. Strings are mutable, but String#hash still returns the same hash value for two strings with the same contents, even if they're different objects (i.e. have different object_ids), apparently by using the string's length and contents in some way.
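Here's a quick sketch of that behaviour, assuming a reasonably modern Ruby. The variable names are just for illustration, and the actual hash numbers will differ from one run of Ruby to the next, so only the comparisons are shown:

```ruby
# Two separate String objects that happen to hold the same characters.
s1 = String.new("ruby")
s2 = String.new("ruby")

puts s1.equal?(s2)         # false: they are different objects
puts s1.hash == s2.hash    # true: String#hash depends on the contents

# Integers with the same value also agree on their hash value.
puts 123.hash == 123.hash  # true
```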

Things become screwy if you want to use objects of your own class as hash keys, though. From what I can tell, if a class doesn't define its own hash method, Object#hash falls back to a value based on the object's identity (its object_id), so two distinct objects are treated as different keys even when their contents are identical.
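A minimal sketch of the problem, using a made-up class called Plain that defines neither hash nor eql?:

```ruby
# Plain is a hypothetical class that relies on the default
# Object#hash and Object#eql?.
class Plain
  attr_reader :value

  def initialize(value)
    @value = value
  end
end

a = Plain.new('123')
b = Plain.new('123')

h = { a => true }

puts a.eql?(b)   # false: the default eql? only accepts the very same object
puts h.key?(a)   # true
puts h.key?(b)   # false: b is a different object, so it is a different key
```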

Why would you want to use your own class's objects as hash keys? Well, I got in trouble because Array#uniq happens to use that same hash function to determine uniqueness, and I want two objects with the same values for some subset of their attributes to be considered duplicates. The default Object#hash doesn't do this.
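To make the uniq problem concrete, here's a small sketch with a hypothetical Item class. It also shows that if you only need this in one spot, passing a block to uniq is a way around it without touching the class at all:

```ruby
# Item is a hypothetical class with no custom hash or eql?.
class Item
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

items = [Item.new('a'), Item.new('a')]

# Default behaviour: the two items are distinct objects, so both survive.
puts items.uniq.length                    # 2

# With a block, uniq compares the block's return values instead.
puts items.uniq { |i| i.name }.length     # 1
```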

It's not as simple as defining your own hash method; the documentation for Object#hash says:

This function must have the property that a.eql?(b) implies a.hash == b.hash

So eql? is used by hashes too: when two keys have the same hash value, the Hash calls eql? to decide whether they really are the same key. The moral of this story is, if you want to use your objects as hash keys or ever plan to uniq an array containing them, you have to define both a hash and an eql? method. This code illustrates this:

def test(o1,o2)
	h = Hash.new
	h[o1] = true
	h[o2] = true
 
	puts "o1.object_id: #{o1.object_id}"
	puts "o2.object_id: #{o2.object_id}"
	puts "o1.hash: #{o1.hash}"
	puts "o2.hash: #{o2.hash}"
	puts "o1.eql? o2: #{o1.eql? o2}"
	puts "o1.value: #{o1.value}"
	puts "o2.value: #{o2.value}"
	puts "o1.value.object_id: #{o1.value.object_id}"
	puts "o2.value.object_id: #{o2.value.object_id}"
	puts "o1.value.hash: #{o1.value.hash}"
	puts "o2.value.hash: #{o2.value.hash}"
	puts "h.keys.length: #{h.keys.length}"
	puts "[o1,o2].uniq: #{[o1,o2].uniq}"
	puts "[o1,o2].uniq.length: #{[o1,o2].uniq.length}"
	puts
end
 
class Foo
	attr_reader :value
	def initialize(value)
		@value = value
	end
end
 
f1 = Foo.new('123')
f2 = Foo.new('123')
 
# 1. Default hash and eql?: f1 and f2 are two different keys.
test(f1,f2)
 
class Foo
	def hash
		@value.hash
	end
end
 
# 2. hash defined, eql? not: the hash values match, but f1 and f2 are still two keys.
test(f1,f2)
 
class Foo
	def eql?(other)
		@value.eql? other.value
	end
end
 
# 3. hash and eql? both defined: f1 and f2 collapse into one key.
test(f1,f2)
 
# Two Fixnums with the same value are literally the same object,
# so these print the same object_id.
n1 = 123
n2 = 123

puts n1.object_id
puts n2.object_id
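For what it's worth, here's a more compact sketch of the same idea in one class. Tag is a made-up name, and the choice to mix the class into the hash and to check the other object's class in eql? is my own assumption about what you'd usually want, not something the documentation requires:

```ruby
# Tag is a hypothetical value-like class: two Tags with eql? values
# count as the same hash key and as duplicates for Array#uniq.
class Tag
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Only another Tag with an eql? value counts as the same key.
  def eql?(other)
    other.instance_of?(self.class) && value.eql?(other.value)
  end

  # Built from the same data that eql? compares, so that
  # a.eql?(b) implies a.hash == b.hash.
  def hash
    [self.class, value].hash
  end
end

t1 = Tag.new('123')
t2 = Tag.new('123')

puts({ t1 => true, t2 => true }.size)   # 1
puts [t1, t2].uniq.length               # 1
```

The class check in eql? also avoids a NoMethodError if a Tag ever gets compared against a key that has no value method.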