Fun with iterators

Given this:

text = 'blah blah ##REPLACE1 ##REPLACE2 ##REPLACE3 blah blah'

repl = {
    '##REPLACE1' => 'foo',
    '##REPLACE2' => 'bar',
    '##REPLACE3' => 'foobar'
}

Which of these works as expected?

text.scan(/##\w+/) do |tag|
    text.gsub!(tag, repl[tag])
end

or

text.scan(/##\w+/).each do |tag|
    text.gsub!(tag, repl[tag])
end

The answer (as perhaps expected) is the second one. Both use a C function, scan_once, which walks through the string match by match, keeping track of its position as it goes. The second version collects all the matches into an array first and lets you iterate over that; the first yields each match while it is still scanning the string itself, so if you edit the string inside the block, the index that scan_once has been using will point to a different place than it should.
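To make the difference concrete, here is a minimal sketch of why the second form is safe. It only relies on scan returning an Array when called without a block; the variable name tags is mine, purely for illustration.

```ruby
text = 'blah blah ##REPLACE1 ##REPLACE2 ##REPLACE3 blah blah'

# Without a block, scan runs to completion before anything else happens
# and hands back the matches as a plain Array of new strings.
tags = text.scan(/##\w+/)

# The scan is already over, so editing text here cannot disturb it.
text.gsub!(tags.first, 'foo')

puts tags.inspect
puts text
```

The array is a snapshot taken before any editing starts, which is exactly what the block form of scan does not give you.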

It's probably not well-defined in general what happens when you start editing a string you're iterating through. Well, it's possibly well-defined, but probably not what you'd ever want. I can't recall reading anything about it for Ruby, but I can't imagine anything good coming from fiddling with the guts of things as you iterate over them. I learned this lesson yesterday, the fun way.
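One way to sidestep the whole problem is to not mutate anything during the iteration at all. The sketch below is an alternative to both versions above, not what I originally wrote: gsub with a block builds a new string in a single pass. Using fetch with the tag itself as the default is an assumption on my part, so that tags missing from repl are left as they are.

```ruby
text = 'blah blah ##REPLACE1 ##REPLACE2 ##REPLACE3 blah blah'

repl = {
    '##REPLACE1' => 'foo',
    '##REPLACE2' => 'bar',
    '##REPLACE3' => 'foobar'
}

# gsub (no bang) leaves text alone and returns a new string, calling the
# block once per match and substituting whatever the block returns.
# fetch(tag, tag) falls back to the tag itself when repl has no entry for it.
result = text.gsub(/##\w+/) { |tag| repl.fetch(tag, tag) }

puts result
```

Since the original string is never touched while it is being scanned, there is no index to go stale.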

May 11, 2007 @ 4:54 AM PDT
Category: Programming
Tags: Ruby