How to split a file into words on the Unix command line?

unix command-line awk tokenize

jaundavid · Mar 19, 2013 · Viewed 34.4k times · Source

I'm running some quick tests for a naive Boolean information retrieval system, and I would like to use awk, grep, egrep, sed, or something similar with pipes to split a text file into words and save them to another file, one word per line. For example, my file contains:

Hola mundo, hablo español y no sé si escribí bien la
pregunta, ojalá me puedan entender y ayudar
Adiós.

The output file should contain:

Hola
mundo
hablo
español
...

Thanks!

Answer

Using tr:

tr -s '[[:punct:][:space:]]' '\n' < file
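The command above prints the words to standard output. Since the question asks for the words to be saved in another file, here is a minimal sketch of the full round trip. The file names input.txt, words.txt, and vocabulary.txt are assumptions for illustration, and the final sort step is an optional extra that is not part of the original answer. The set is written without the outer brackets, which tr would treat as literal characters that [:punct:] already covers, so the behavior should be the same.

```shell
# Recreate the sample input from the question (input.txt is an assumed name).
printf '%s\n' \
  'Hola mundo, hablo español y no sé si escribí bien la' \
  'pregunta, ojalá me puedan entender y ayudar' \
  'Adiós.' > input.txt

# Map every punctuation or whitespace character to a newline and squeeze
# repeated newlines (-s), leaving one word per line in the output file.
tr -s '[:punct:][:space:]' '\n' < input.txt > words.txt

# Optional extra: a sorted list of unique words, e.g. as an index vocabulary.
sort -u words.txt > vocabulary.txt
```

Note that tr treats upper and lower case as distinct, so "Hola" and "hola" would be separate entries in the vocabulary unless the text is case-folded first.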
