Is there a way to delete duplicate lines in a file in Unix?

I can do it with the `sort -u` and `uniq` commands, but I want to use `sed` or `awk`. Is that possible?
awk '!seen[$0]++' file.txt
`seen` is an associative array that Awk indexes with every line of the file (`$0`). If a line isn't in the array yet, `seen[$0]` evaluates to false. The `!` is the logical NOT operator and inverts that false to true; Awk prints each line for which the whole expression is true. The `++` increments `seen[$0]`, so `seen[$0] == 1` after the first time a line is found, `seen[$0] == 2` after the second, and so on.
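The one-liner is shorthand for an explicit pattern-action rule. A minimal expanded equivalent could be written as:

awk '{
    # print the line only the first time it appears
    if (!seen[$0]) {
        print
    }
    # record that this line has now been seen
    seen[$0]++
}' file.txt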
Awk treats every value except 0 and "" (the empty string) as true. Once a line has been stored in `seen`, `!seen[$0]` evaluates to false for each later occurrence, so duplicate lines are not written to the output.
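To see it in action, here is a small shell demo (the file name dup.txt is just for illustration):

printf 'apple\nbanana\napple\ncherry\nbanana\n' > dup.txt
awk '!seen[$0]++' dup.txt
# apple
# banana
# cherry

Since the question also asks about sed: removing non-consecutive duplicates with sed is possible but far less readable. A classic approach from the well-known "sed one-liners" collection keeps all previously printed lines in the hold space (note: it assumes printable-ASCII lines and can exhaust the hold space on large files):

sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P' file.txt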