A command-line HTML pretty-printer: Making messy HTML readable

knorv · Feb 3, 2010 · Viewed 59.2k times

I'm looking for recommendations for HTML pretty printers which fulfill the following requirements:

  • Takes HTML as input and outputs a nicely formatted, correctly indented, but "graphically equivalent" version of it.
  • Must support command-line operation.
  • Must be open-source and run under Linux.

Answer

jonjbar · Feb 3, 2010

Have a look at the HTML Tidy Project: http://www.html-tidy.org/

The granddaddy of HTML tools, with support for modern standards.

There used to be a fork called tidy-html5, which has since become the official version. Its source code is hosted on GitHub.

Tidy is a console application for Mac OS X, Linux, Windows, UNIX, and more. It corrects and cleans up HTML and XML documents by fixing markup errors and upgrading legacy code to modern standards.

For your needs, here is the command line to call Tidy:

tidy inputfile.html
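
Called with no options, Tidy writes the cleaned-up markup to standard output, but as far as I know it does not indent it by default. Since the question asks for correctly indented output, here is a minimal sketch that turns indentation on. The function name pretty_print_html is my own (it is not part of Tidy), and the option choices are assumptions about what "nicely formatted" means here; check `tidy -help` on your system before relying on them.

```shell
#!/bin/sh
# Sketch only: pretty_print_html is a hypothetical helper, not part of Tidy.
# It prints an indented version of the given HTML file to standard output.
pretty_print_html() {
    input=$1

    # Bail out early if Tidy is not on the PATH.
    if ! command -v tidy >/dev/null 2>&1; then
        echo "tidy is not installed" >&2
        return 127
    fi

    # -indent         indent element content
    # -quiet          suppress nonessential output
    # -wrap 0         disable line wrapping
    # --tidy-mark no  do not add a "generator" meta tag to the output
    tidy -indent -quiet -wrap 0 --tidy-mark no "$input"
    status=$?

    # Tidy exits with 1 when it only issued warnings and 2 on errors;
    # treat "warnings only" as success.
    [ "$status" -le 1 ]
}
```

After sourcing the file, a typical call would be `pretty_print_html inputfile.html > outputfile.html`. Warnings go to standard error, so the redirected file contains only the formatted markup. Note that Tidy also repairs markup errors, so the output is not guaranteed to be byte-for-byte equivalent to the input.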