FILTER_FLAG_STRIP_LOW vs FILTER_FLAG_STRIP_HIGH?

Mohammad Naji · Oct 7, 2012 · Viewed 16k times

In reference to the filter_var function in PHP 5:

I have visited its documentation at: http://php.net/manual/en/filter.filters.sanitize.php, but I still have this question:

What exactly is the difference between FILTER_FLAG_STRIP_LOW and FILTER_FLAG_STRIP_HIGH?

To make it clearer, please provide an example.

Answer

phihag · Oct 7, 2012

The flags are explained on a different page of the documentation.

FILTER_FLAG_STRIP_LOW strips bytes in the input that have a numerical value <32, most notably null bytes and other control characters such as the ASCII bell. This is a good idea if you intend to pass the input to another application that uses null-terminated strings. In general, characters with a Unicode codepoint lower than 32 should not occur in user input, except for the tab (9) and the newline characters 10 and 13. Note that the flag strips those as well, so multi-line input loses its line breaks.
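A minimal sketch of the behaviour described above, assuming PHP with the filter extension enabled. It uses FILTER_UNSAFE_RAW instead of FILTER_SANITIZE_STRING so that only the flag changes the input (FILTER_SANITIZE_STRING is also deprecated as of PHP 8.1). The sample string is made up for illustration:

```php
<?php
// Hypothetical input: a CRLF, a tab, and a trailing NUL byte mixed into plain text.
$input = "line1\r\nline2\tend\0";

// FILTER_UNSAFE_RAW does nothing by itself, so only the flag alters the string.
$clean = filter_var($input, FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_LOW);

// Every byte below 32 is gone, including the tab and the line break.
var_dump($clean); // string(13) "line1line2end"
```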

FILTER_FLAG_STRIP_HIGH strips bytes in the input that have a numerical value >127. In almost every encoding, those bytes represent non-ASCII characters such as ä, ¿, etc. In UTF-8 in particular, every byte of a multi-byte character is above 127, so such characters are removed entirely. Passing this flag can be a band-aid for broken string encoding, which can otherwise become a security vulnerability. However, non-ASCII characters are to be expected in virtually all user input, so the flag will also discard legitimate text.
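As a rough illustration, assuming the input is UTF-8 encoded and again using FILTER_UNSAFE_RAW so that only the flag has an effect. The sample name is hypothetical, and the non-ASCII character is written as explicit byte escapes so the result does not depend on the encoding of the source file:

```php
<?php
// Hypothetical UTF-8 input; "ë" is stored as the two bytes 0xC3 0xAB.
$input = "Zo\xC3\xAB";

$ascii = filter_var($input, FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_HIGH);

// Both bytes of "ë" are above 127, so the character is dropped, not replaced.
var_dump($ascii); // string(2) "Zo"
```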

To summarize:

filter_var("\0aä\x80", FILTER_SANITIZE_STRING) == "\0aä\x80"
filter_var("\0aä\x80", FILTER_SANITIZE_STRING, FILTER_FLAG_STRIP_LOW) == "aä\x80"
filter_var("\0aä\x80", FILTER_SANITIZE_STRING, FILTER_FLAG_STRIP_HIGH) == "\0a"
filter_var("\0aä\x80", FILTER_SANITIZE_STRING,
           FILTER_FLAG_STRIP_LOW | FILTER_FLAG_STRIP_HIGH) == "a"
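
The comparisons above rely on FILTER_SANITIZE_STRING, which is deprecated as of PHP 8.1. A sketch of the same four cases that should also run on current PHP versions, assuming FILTER_UNSAFE_RAW as a stand-in (it leaves the input alone unless flags are given) and printing the result as hex so the invisible bytes can be seen. The $cases and $results names are only for illustration:

```php
<?php
// Same input as above: NUL, "a", "ä" as UTF-8 (0xC3 0xA4), and a stray 0x80 byte.
$input = "\0a\xC3\xA4\x80";

$cases = [
    'no flags'   => 0,
    'STRIP_LOW'  => FILTER_FLAG_STRIP_LOW,
    'STRIP_HIGH' => FILTER_FLAG_STRIP_HIGH,
    'both'       => FILTER_FLAG_STRIP_LOW | FILTER_FLAG_STRIP_HIGH,
];

$results = [];
foreach ($cases as $label => $flags) {
    // bin2hex makes the surviving bytes visible, two hex digits per byte.
    $results[$label] = bin2hex(filter_var($input, FILTER_UNSAFE_RAW, $flags));
    printf("%-10s %s\n", $label, $results[$label]);
}
// no flags   0061c3a480
// STRIP_LOW  61c3a480
// STRIP_HIGH 0061
// both       61
```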