In the responses to the question Reading In A String and comparing it C, more than one person discouraged the use of strcmp(), saying things like
I also strongly, strongly advise you to get used to using strncmp() now, ... to avoid many problems down the road.
or (in Why does my string comparison fail? )
Make certain you use strncmp and not strcmp. strcmp is profoundly unsafe.
What problems are they alluding to?
The reason scanf() with string specifiers and gets() are strongly discouraged is that they almost inevitably lead to buffer overflow vulnerabilities. However, it's not possible to overflow a buffer with strcmp(), right?
"A buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory."
( -- Wikipedia: buffer overflow).
Since the strcmp() function never writes to any buffer, the strcmp() function cannot cause a buffer overflow, right?
What is the reason people discourage the use of strcmp(), and recommend strncmp() instead?
While strncmp can prevent you from overrunning a buffer, its primary purpose isn't safety. Rather, it exists for the case where one wants to compare only the first N characters of a (possibly NUL-terminated) string.
From the man page:
The strcmp() function compares the two strings s1 and s2. It returns an integer less than, equal to, or greater than zero if s1 is found, respectively, to be less than, to match, or be greater than s2.
The strncmp() function is similar, except it compares only the first (at most) n bytes of s1 and s2.
Note that strncmp in this case cannot be replaced with a simple memcmp, because you still need to take advantage of its stop-on-NUL behavior, in case one of the strings is shorter than n.
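A minimal sketch of that prefix-comparison use case, assuming a hypothetical helper named starts_with (not part of the standard library). It relies on strncmp stopping at the first NUL, so a string shorter than the prefix is never read past its terminator:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s begins with prefix, 0 otherwise.
 * Both arguments are assumed to be valid NUL-terminated strings. */
int starts_with(const char *s, const char *prefix)
{
    size_t n = strlen(prefix);

    /* strncmp compares at most n bytes, but stops early if it reaches
     * the NUL in s. memcmp(s, prefix, n) would instead read n bytes
     * from s even when s is shorter than that. */
    return strncmp(s, prefix, n) == 0;
}
```

With this helper, starts_with("strncmp", "str") is true, while starts_with("st", "str") is false: the comparison ends at the terminator of "st", where '\0' differs from 'r'.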
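If strcmp causes a buffer overrun, then one of two things is true:
1. Your data isn't expected to be NUL-terminated, and you should be using memcmp instead.
2. Your data is expected to be NUL-terminated, but the terminator is missing, which means the bug happened earlier, when the buffer was populated.
Note that reading past the end of a buffer is still considered a buffer overrun. While it may seem harmless, it can be just as dangerous as writing past the end.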
Reading, writing, executing... it doesn't matter. Any memory reference to an unintended address is undefined behavior. In the most apparent scenario, you attempt to access a page that isn't mapped into your process's address space, causing a page fault and a subsequent SIGSEGV. In the worst case, you sometimes run into a \0 byte, but other times you run into some other buffer, causing inconsistent program behavior.
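One way to guard against such out-of-bounds reads, sketched here under assumptions: check that a terminator actually lies inside the buffer before handing it to strcmp, and use memcmp for data that is fixed-width and not NUL-terminated by design. The helper names buffer_equals and tag_equals are hypothetical, and the 4-byte tag is only an illustrative layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf (capacity cap bytes) holds a
 * NUL-terminated string equal to expected, 0 otherwise. */
int buffer_equals(const char *buf, size_t cap, const char *expected)
{
    /* memchr looks at no more than cap bytes, so it stays in bounds. */
    if (memchr(buf, '\0', cap) == NULL)
        return 0; /* no terminator inside the buffer: not a valid string */

    return strcmp(buf, expected) == 0;
}

/* Hypothetical helper for fixed-width, non-terminated data:
 * exactly 4 bytes are compared, and no NUL is expected or needed. */
int tag_equals(const char tag[4], const char other[4])
{
    return memcmp(tag, other, 4) == 0;
}
```

The check in buffer_equals only detects a missing terminator. It does not repair whatever code filled the buffer incorrectly in the first place.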