This is the command I'm running on a standard web page that I fetched
from a web site with wget:
tr '<' '\n<' < index.html
However, it gives me newlines but does not add the left bracket back in. For example,
echo "<hello><world>" | tr '<' '\n<'
returns
(blank line which is fine)
hello>
world>
instead of
(blank line or not)
<hello>
<world>
What's wrong?
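That's because tr only does character-for-character substitution (or deletion): each character in the first set maps to the character at the same position in the second set. Here '<' maps to the newline, and the extra '<' in the second set has nothing to pair with, so it is ignored.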
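To make the character-for-character behaviour concrete, here is a minimal Python 3 sketch. The helper names tr_like and split_tags are hypothetical, and tr_like only approximates tr: it assumes plain character sets with no ranges, classes, or escapes, and it pads a short second set with its last character the way GNU tr does.

```python
def tr_like(text, set1, set2):
    """Rough model of `tr set1 set2` (assumption: no ranges or classes)."""
    # Pair characters by position: pad a short set2 with its last character,
    # then drop anything in set2 beyond the length of set1.
    set2 = (set2 + set2[-1] * len(set1))[:len(set1)]
    return text.translate(str.maketrans(set1, set2))


def split_tags(text):
    """String substitution: put a newline in front of every '<'."""
    return text.replace("<", "\n<")


if __name__ == "__main__":
    sample = "<hello><world>"
    print(repr(tr_like(sample, "<", "\n<")))   # '<' is swapped for a newline
    print(repr(split_tags(sample)))            # '<' is kept after the newline
```

The first call loses every '<' because one character can only become one other character; the second keeps it because it replaces a string with a longer string, which is what the tools below do.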
Try sed instead (the \n in the replacement is understood by GNU sed but is not portable to every sed).
echo '<hello><world>' | sed -e 's/</\n&/g'
Or awk.
echo '<hello><world>' | awk '{gsub(/</,"\n<",$0)}1'
Or perl.
echo '<hello><world>' | perl -pe 's/</\n</g'
Or ruby.
echo '<hello><world>' | ruby -pe '$_.gsub!(/</,"\n<")'
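Or python (Python 3 here; writing with sys.stdout.write avoids the extra newline that print would append to each line).
echo '<hello><world>' \
| python3 -c 'import sys; sys.stdout.write(sys.stdin.read().replace("<", "\n<"))'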