Vim regexes are awesome

Two years ago I wrote about how Vim's regexes were no fun compared to :perldo and :rubydo. Turns out I was wrong; it was just a matter of not being used to them.

Vim's regexes are very good. They have all of the good features of Perl/Ruby regexes, plus some extra features that don't make sense outside of a text editor, but are nonetheless very helpful in Vim.

Here are a few of the neat things you can do.

Very magic

The one bad thing about Vim regexes is that they are inconsistent about what needs to be backslash-escaped and what doesn't. But Vim lets you put \v in a pattern to make everything after it suddenly consistent: every character except letters, digits and underscores becomes "special" unless backslash-escaped.

Without \v:

:%s/^\%(foo\)\{1,3}\(.\+\)bar$/\1/

With \v:

:%s/\v^%(foo){1,3}(.+)bar$/\1/

Far easier to read. Along with \c and \C, which force case-insensitive and case-sensitive matching respectively, \v is a good option to make a habit of prepending to regexes when needed. It eventually becomes second nature. See also :h /\v and :h /\c
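As a quick sketch of how \c and \C behave together with \v, the snippet below runs two patterns through substitute() rather than :s, so it can be pasted into a script as-is. The sample text and the variable names are made up for illustration.

```vim
" Made-up sample text for illustration.
let sample = 'Todo: merge the TODO list'

" \c ignores case for the whole pattern, whatever 'ignorecase' is set to.
" In very magic mode, < and > are the start-of-word and end-of-word anchors.
let any_case = substitute(sample, '\v\c<todo>', 'DONE', 'g')

" \C forces case-sensitive matching, so only the upper-case word changes.
let exact_case = substitute(sample, '\v\C<TODO>', 'DONE', 'g')

echo any_case
echo exact_case
```

The same patterns work unchanged in a substitution, e.g. :%s/\v\c<todo>/DONE/g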

Spanning newlines

One thing that :perldo and :rubydo can't do is span newlines; you can't combine two lines and you can't break one line into two.

But Vim's regexes can span newlines if you use \_. instead of a plain . (dot). I find this to be a lot more aesthetically pleasing than Perl's horrible s and m modifiers tacked onto the end of a regex. For example, this strips the <body> tags from a text document while keeping everything between them:

:%s@<body>\v(\_.+)\V</body>@\1@

(Note: in real life, never use a regex to parse HTML or XML. Down that path lies madness. The above is OK because I'd expect only one <body> tag to appear in any document.)

(Note^2: being able to turn on and off magic in the middle of a regex is awfully helpful.)

(Note^4: You can use arbitrary delimiters like @ for the regex, which is useful if your pattern includes literal /'s.)

See also :h /\_.
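Here is a minimal sketch of \_. at work on a throwaway buffer; the buffer content is made up. It uses \{-1,}, the non-greedy counterpart of \+, so the match stops at the first closing tag instead of the last one. That is a variation on the command above, not a replacement for it.

```vim
" Open a scratch window and fill it with made-up content.
new
call setline(1, ['<body>', 'first line', 'second line', '</body>'])

" \n matches the end of a line, and \_. matches any character including
" the end of a line, so a single match can cover all four lines.
" The captured text keeps its line break when it is put back with \1.
%s@<body>\n\(\_.\{-1,}\)\n</body>@\1@

echo getline(1, '$')
```

After the substitution the buffer should hold only the two inner lines.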

\zs

Vim lets you demand that some text match, but leave that text out of the part that gets replaced. Putting \zs in a pattern sets the start of the match at that point: everything before it still has to match, but the substitution doesn't touch it. Normally if you want to match some text and then leave it alone in the substitution, you have to capture it and then put it back manually; \zs lets you avoid this.

Say you want to delete baz when it follows foobar at the start of a line, but leave foobar alone. Normally you'd have to do this:

:%s/\v^(foobar)(baz)/\1/

to put the foobar back. Of course you can also use a zero-width lookbehind assertion:

:%s/\v(^foobar)@<=baz//

But that's even more line-noise. This is the easiest way:

:%s/^foobar\zsbaz//

See :h /\zs. (And :h /\@<= if you're so inclined.)
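To check that the three forms above really do the same thing, here is a small sketch that feeds one made-up string through substitute() with each pattern. The variable names are illustrative only.

```vim
" Made-up sample text for illustration.
let sample = 'foobarbaz'

" Capture foobar and put it back by hand.
let with_capture = substitute(sample, '\v^(foobar)baz', '\1', '')

" Zero-width lookbehind: baz must be preceded by foobar at the line start.
let with_lookbehind = substitute(sample, '\v(^foobar)@<=baz', '', '')

" \zs: foobar must match, but the replaced part starts after it.
let with_zs = substitute(sample, '^foobar\zsbaz', '', '')

echo [with_capture, with_lookbehind, with_zs]
```

All three results should be the string foobar.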

Expressions

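By starting the replacement with \=, you can put an arbitrary expression on the right side of a regex substitution; the rest of the replacement is evaluated and its result is used as the new text. For example, say you have this text: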

~/foo ~/bar

If you do this:

:%s/\v(\S+)/\=expand(submatch(1))/g

You end up with:

/home/user/foo /home/user/bar

Because you can also call your own user-defined functions in the expression part, this can end up being pretty powerful. For example it can be used to insert incrementing numbers into arbitrary places in your text. See :h sub-replace-\=.
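The incrementing-numbers idea can be sketched roughly like this. NextNumber() and g:counter are hypothetical names invented for the example, and the buffer content is made up; treat it as one possible approach rather than the canonical one.

```vim
" Hypothetical helper: returns 1, 2, 3, ... on successive calls.
let g:counter = 0
function! NextNumber() abort
  let g:counter += 1
  return g:counter
endfunction

" Open a scratch window and fill it with made-up content.
new
call setline(1, ['item: apple', 'item: pear', 'item: plum'])

" The expression after \= is evaluated once per match, so each line
" gets the next number from the counter.
%s/^item/\='item ' . NextNumber()/

echo getline(1, '$')
```

Reset g:counter to 0 before running the substitution again, otherwise the numbering carries on from where it stopped.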

And so on

Read :h regexp if you haven't already. Tons of other features in there that can make your life easy if you manage to internalize them. It is difficult to get used to Vim's funky syntax if you're very familiar with Perl/Ruby-style regexes, but I think it's worth it. Only took me two years! (OK, more like a couple days of concerted effort after a year-and-a-half delay.)

6 Comments

Mats Rauhala says:

Also you should not use regular expressions if there are better solutions available. I'm quite fluent with regular expressions but many times it's faster to do macros instead of regular expressions.

Apr 19, 2009 12:20 AM PST

Brian says:

You can't escape regexes in Vim; they're used for searching too, and I probably use :g more than anything else. :g + :norm is pretty useful.

Macros are handy too but I find them to be a bit fragile for anything complicated. The good thing about :s and :g is the commandline has a history and a dedicated area to edit and play with your command until it works. You could manually edit a macro as text and slurp it into a register and run it that way, but it's clumsy. Recording a complicated macro takes a steady hand, and sometimes a bit of cleverness to make sure your cursor begins and ends up in a good place to run the macro many times in a row.

Apr 19, 2009 01:14 AM PST

Sam Stokes says:

I found this article really helpful - lots of things I didn't know here. I've somehow missed "very magic" despite reading through that part of the docs several times - instead just got used to \typing \backslashes \before \everything. And I'm sure I'll find a use for \= before the week is up.

A related trick I find useful is @: meaning "redo the previous command line". If you've just done a :s/foo/bar/ and want to reapply it on several other lines, just move to those lines and @:.

Apr 21, 2009 09:32 AM PST

Leon says:

Is there an easy way to test a regex using '/' and then use it in a :g or :s expression?

Nov 12, 2009 02:51 PM PST

Brian says:

Yep, after you do a search, the last search pattern is stored in the "/ register. Type :g/^R/ where ^R is Ctrl+R, and / is the name of the register you want.

Nov 12, 2009 04:26 PM PST

Ali says:

I think both regular expressions and macros have their place. When a solution will be much better served with a regular expression + captures then I think it's quite obvious.

While Macros do require a steady hand, once you get used to banging them out inline they make your life a lot easier. Also, very big wow-factor for anyone watching!

Mar 10, 2010 03:52 AM PST
