13 Sep 2004 bolsh   » (Master)

Sed tip

A friend was recently looking for help with a problem, and that gave me the chance to freshen up my sed a bit.

His problem was the not uncommon one of wanting to replace a two-word string which might span a line break. He wanted to keep the newlines in the replacement.

Most people are unaware that sed can do multiline editing with the commands N, D, P, G and H. Here's the script which did the job in the end (assuming I want to replace "I hope" with "we expect"):

sed -e '
N
s/\<I\([ \n]\{1,\}\)hope\>/we\1expect/g
P
D' input.txt > output.txt

N appends the next input line to the pattern buffer. The s command replaces "I hope" in that buffer, saving whatever separates "I" and "hope" (spaces or a newline) and inserting it between "we" and "expect" in the output. P prints up to the first newline. Finally, D deletes up to the first newline and, if there's anything left in the pattern buffer, restarts the script at the N without reading a new line. The last line, which is by then the whole pattern buffer, gets printed when the N command fails (that is, when there is no more input). That last part is GNU sed behaviour; a strictly POSIX sed may quit at that point without printing it.
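To see the sliding two-line window in action, here is a small sketch that feeds the same script a three-line sample. The `replace_phrase` wrapper and the sample text are made up for illustration, and the example assumes GNU sed, since `\<`, `\>`, `\n` inside a bracket expression and the printing of the last line when N fails are not guaranteed elsewhere.

```shell
# Hypothetical helper (not from the original post): the same sed script,
# wrapped in a function so that it reads standard input.
replace_phrase() {
  sed -e '
N
s/\<I\([ \n]\{1,\}\)hope\>/we\1expect/g
P
D'
}

# Made-up sample: the phrase is split across lines 1 and 2,
# and also appears intact at the end of line 2.
printf 'I\nhope this works, and I hope\nit does.\n' | replace_phrase
```

With GNU sed this should print "we", then "expect this works, and we expect", then "it does.", so the line break between the two words survives the replacement.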

I'd forgotten how much sed rocks.
