What are the differences between Perl, Python, AWK and sed?

Khaled Al Hourani picture Khaled Al Hourani · Dec 14, 2008 · Viewed 79.6k times · Source

just want to know what are the main differences among them? and the power of each language (where it's better to use it).

Edit: it's not "vs." like topic, just information.

Answer

Jonathan Leffler picture Jonathan Leffler · Dec 14, 2008

In order of appearance, the languages are sed, awk, perl, python.

The sed program is a stream editor and is designed to apply the actions from a script to each line (or, more generally, to specified ranges of lines) of the input file or files. Its language is based on ed, the Unix editor, and although it has conditionals and so on, it is hard to work with for complex tasks. You can work minor miracles with it - but at a cost to the hair on your head. However, it is probably the fastest of the programs when attempting tasks within its remit. (It has the least powerful regular expressions of the programs discussed - adequate for many purposes, but certainly not PCRE - Perl-Compatible Regular Expressions)

The awk program (name from the initials of its authors - Aho, Weinberger, and Kernighan) is a tool initially for formatting reports. It can be used as a souped-up sed; in its more recent versions, it is computationally complete. It uses an interesting idea - the program is based on 'patterns matched' and 'actions taken when the pattern matches'. The patterns are fairly powerful (Extended Regular Expressions). The language for the actions is similar to C. One of the key features of awk is that it splits the input automatically into records and each record into fields.

Perl was written in part as an awk-killer and sed-killer. Two of the programs provided with it are a2p and s2p for converting awk scripts and sed scripts into Perl. Perl is one of the earliest of the next generation of scripting languages (Tcl/Tk can probably claim primacy). It has powerful integrated regular expression handling with a vastly more powerful language. It provides access to almost all system calls and has the extensibility of the CPAN modules. (Neither awk nor sed is extensible.) One of Perl's mottos is "TMTOWTDI - There's more than one way to do it" (pronounced "tim-toady"). Perl has 'objects', but it is more of an add-on than a fundamental part of the language.

Python was written last, and probably in part as a reaction to Perl. It has some interesting syntactic ideas (indenting to indicate levels - no braces or equivalents). It is more fundamentally object-oriented than Perl; it is just as extensible as Perl.

OK - when to use each?

  • Sed - when you need to do simple text transforms on files.
  • Awk - when you only need simple formatting and summarisation or transformation of data.
  • Perl - for almost any task, but especially when the task needs complex regular expressions.
  • Python - for the same tasks that you could use Perl for.

I'm not aware of anything that Perl can do that Python can't, nor vice versa. The choice between the two would depend on other factors. I learned Perl before there was a Python, so I tend to use it. Python has less accreted syntax and is generally somewhat simpler to learn. Perl 6, when it becomes available, will be a fascinating development.

(Note that the 'overviews' of Perl and Python, in particular, are woefully incomplete; whole books could be written on the topic.)