sed
is a stream editor. It works with streams of characters on a per-line basis. It has a primitive programming language that includes goto-style loops and simple conditionals (in addition to pattern matching and address matching). There are essentially only two "variables": pattern space and hold space. Readability of scripts can be difficult. Mathematical operations are extraordinarily awkward at best.
There are various versions of sed
with different levels of support for command line options and language features.
awk
is oriented toward delimited fields on a per-line basis. It has much more robust programming constructs including if
/else
, while
, do
/while
and for
(C-style and array iteration). There is complete support for variables and single-dimension associative arrays plus (IMO) kludgey multi-dimension arrays. Mathematical operations resemble those in C. It has printf
and functions. The "K" in "AWK" stands for "Kernighan" as in "Kernighan and Ritchie" of the book "C Programming Language" fame (not to forget Aho and Weinberger). One could conceivably write a detector of academic plagiarism using awk
.
GNU awk
(gawk
) has numerous extensions, including true multidimensional arrays in the latest version. There are other variations of awk
including mawk
and nawk
.
Both programs use regular expressions for selecting and processing text.
I would tend to use sed
where there are patterns in the text. For example, you could replace all the negative numbers in some text that are in the form "minus-sign followed by a sequence of digits" (e.g. "-231.45") with the "accountant's brackets" form (e.g. "(231.45)") using this (which has room for improvement):
sed 's/-\([0-9.]\+\)/(\1)/g' inputfile
I would use awk
when the text looks more like rows and columns or, as awk
refers to them "records" and "fields". If I was going to do a similar operation as above, but only on the third field in a simple comma delimited file I might do something like:
awk -F, 'BEGIN {OFS = ","} {gsub("-([0-9.]+)", "(" substr($3, 2) ")", $3); print}' inputfile
Of course those are just very simple examples that don't illustrate the full range of capabilities that each has to offer.