Regarding the Unix command "wc", what is considered a word?

maritaslag · Feb 15, 2018 · Viewed 6.9k times

The wc command reports a line count, a word count, and a character count. I am writing a program that simulates wc: it takes a file and prints these three properties. The line and character counts are easy: each time the program sees \n it does ++lineCount, and for every char it reads before EOF it does ++charCount. But what counts as a word? What separates words? Is it whitespace?

Answer

Keith Thompson · Feb 15, 2018

This is specified by POSIX:

The wc utility shall consider a word to be a non-zero-length string of characters delimited by white space.

The man page for wc on my system (Ubuntu 17.04) is similar:

A word is a non-zero-length sequence of characters delimited by white space.
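
To make that definition concrete, here is a minimal sketch in Python of a counter that follows it. The function name wc_counts and the WHITESPACE constant are made up for this illustration, and the set of white-space bytes is an assumption: it is the set used by the "C" locale (space, tab, newline, vertical tab, form feed, carriage return), whereas a real wc takes its notion of white space from the current locale. The third value is a byte count, which matches a character count only for single-byte text.

```python
# Assumption: white space as in the "C" locale. A real wc consults the
# current locale, so this set is illustrative only.
WHITESPACE = b" \t\n\v\f\r"


def wc_counts(data):
    """Return (lines, words, bytes) for a bytes object, in the spirit of wc."""
    lines = 0
    words = 0
    in_word = False  # True while we are inside a run of non-white-space bytes
    for byte in data:  # iterating over bytes yields integers
        if byte == 0x0A:  # newline character
            lines += 1
        if byte in WHITESPACE:
            in_word = False  # white space ends the current word, if any
        elif not in_word:
            # First non-white-space byte after white space (or at the start
            # of the input): a new word begins here.
            in_word = True
            words += 1
    return lines, words, len(data)


if __name__ == "__main__":
    print(wc_counts(b"hello  world\n"))
```

Note that punctuation does not separate words under this definition: the input "a-b,c" is a single word, and consecutive white-space characters never produce empty words.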