Monday, July 11, 2011

Search-and-replace on multiple files with sed

Developers are, first and foremost, lazy.  I hate rewriting code I know I've written before unless I'm making it better the second time, but most code works just fine and I'd rather optimize something more interesting anyway.

One thing I do quite often is take code I wrote for one client, search through and change the names or nomenclature and resell it to another client.  It's an easy way to demo a product with their name on it to make a sale, since it takes less than an hour to change the names and integrate some new images.  To do this I have to do a search-and-replace on all files in a directory tree.  Here are a few ways I do that:

From the Linux command-line, you can use find to get all the files to be changed, and sed to make the changes.  For example:

$ find ./ -type f -exec sed -i 's/string1/string2/g' {} \;

Let's break it down:
Find all the items in the current directory and subdirectories that are regular files.
On each file, execute the following command:
sed -i 's/string1/string2/g' {} \;
-i means operate on the file in place instead of echoing the result to stdout.
The expression enclosed in single quotes is what sed executes.  If you're unfamiliar with regex, the one used here basically says "Substitute string2 for string1, Globally: keep going to the end of every line instead of stopping at the first match you find."  Using more complicated regex you can do more complicated matching.
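{} is replaced with each filename from find.
\; marks the end of the command that -exec runs; the backslash keeps the shell from treating the semicolon as its own command separator.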
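
To see what the trailing g changes, here is a small sketch that runs sed on a single line of text from a pipe instead of on a file. The variable names first_only and every_match are just illustrative.

```shell
# Without g, sed stops after the first match on each line.
first_only=$(printf 'string1 string1\n' | sed 's/string1/string2/')

# With g, sed replaces every match on each line.
every_match=$(printf 'string1 string1\n' | sed 's/string1/string2/g')

echo "$first_only"   # string2 string1
echo "$every_match"  # string2 string2
```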

So simply replace string1 with the text you want to replace and string2 with what you want it to change to, and let 'er rip.  Be careful to do it in the right directory or you'll be sorry: I've occasionally run it by accident in the parent directory and changed more than I planned...
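
One way to guard against that kind of accident is a dry run: list the files that contain the search text before letting sed touch anything. The sketch below assumes GNU sed and works in a throwaway directory; the name demo and the sample files are made up for illustration.

```shell
# Build a throwaway tree (hypothetical files) so no real project is touched.
demo=$(mktemp -d)
mkdir -p "$demo/src"
printf 'hello string1\n' > "$demo/src/a.txt"
printf 'nothing to see\n' > "$demo/src/b.txt"

# Dry run: grep -l prints only the names of files that contain a match.
matches=$(find "$demo" -type f -exec grep -l 'string1' {} \;)
echo "$matches"

# Once the list looks right, run the real replacement.
find "$demo" -type f -exec sed -i 's/string1/string2/g' {} \;
result=$(cat "$demo/src/a.txt")
echo "$result"
```

If the list from the dry run contains anything surprising, you are probably in the wrong directory.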


But I don't want to change everything!

You may not always want to change every file, though.  Suppose you are using a source control tool like Subversion, and want to do a search-and-replace like this on all the files in your working copy.  The problem is that Subversion uses hidden directories called .svn to store its own files, including those it uses to keep track of when files have changed.  If you use the above command and change the original files as well as Subversion's historical record, Subversion won't know anything has changed!  You won't be able to check in your changes or revert.  You'll end up having to export to another directory, delete your working copy, check it out again, and then copy your exported copy back in, which is way too much work and likely to cause you more grief than it's worth.  Instead, find gives you the ability to exclude some files from the search so you don't have to change them all:

$ find ./ -path '*/.svn' -prune -o -type f -exec sed -i 's/string1/string2/g' {} \;

We've added a bit there in the middle:
-path '*/.svn' -prune -o
That says: match every path ending in /.svn and prune it, so find does not descend into it for more files.  The -o means OR: find evaluates what comes after it only when what comes before it did not match.  So if the prune occurs, find knows not to try to match files; if the prune does not occur, it proceeds normally as above.

Obviously this command line can be used for CVS directories or others just by changing the '*/.svn' pattern it is pruning to what you need it to be.
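
If you need to skip more than one kind of directory, the patterns can be grouped with escaped parentheses. This is a sketch assuming GNU find and sed, run in a scratch directory; the name demo and the files in it are invented for the example.

```shell
# Scratch tree with two metadata directories and one real source file.
demo=$(mktemp -d)
mkdir -p "$demo/.svn" "$demo/CVS" "$demo/src"
printf 'string1\n' > "$demo/.svn/entries"
printf 'string1\n' > "$demo/CVS/Root"
printf 'string1\n' > "$demo/src/main.c"

# \( ... -o ... \) groups the patterns so -prune applies to either one.
find "$demo" \( -path '*/.svn' -o -path '*/CVS' \) -prune -o \
    -type f -exec sed -i 's/string1/string2/g' {} \;

cat "$demo/src/main.c"    # changed
cat "$demo/.svn/entries"  # left alone
```

The parentheses have to be escaped (or quoted) because the shell would otherwise try to interpret them itself.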

So there you go, this has saved me more hours than almost any other trick I use.  "But my favorite IDE can do this from a menu option" you might say.  Well that's fine, but with this you can do it over ssh, from the command line, in any GNU/Linux environment that has find and sed (which is basically all of them).  In my opinion anything you can do from the shell is better, even if you wrap it in a GUI later.

It's a little more complicated if you have an old version of sed that doesn't support the -i option (that would be more than six years old, now), but in that case you're probably used to doing things the hard way, anyway.
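
For the curious, here is one possible version of the hard way: write sed's output to a temporary file and then copy it back over the original. It is only a sketch, not the one true method; the name demo and the .tmp suffix are my own choices, and it assumes no file ending in .tmp already exists next to the originals.

```shell
# Scratch directory with one sample file (hypothetical).
demo=$(mktemp -d)
printf 'string1 and string1\n' > "$demo/file.txt"

# No -i here: sed writes to a temp file, which is then copied back.
# Using cat instead of mv keeps the original file's permissions.
find "$demo" -type f -exec sh -c '
  for f in "$@"; do
    sed "s/string1/string2/g" "$f" > "$f.tmp" &&
      cat "$f.tmp" > "$f" &&
      rm "$f.tmp"
  done
' sh {} +

cat "$demo/file.txt"
```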
