filter_input() $_SERVER["REQUEST_URI"] with FILTER_SANITIZE_URL

Kibele · Jun 26, 2014

I'm filtering $_SERVER["REQUEST_URI"] like this:

$_request_uri = filter_input(INPUT_SERVER, 'REQUEST_URI', FILTER_SANITIZE_URL);

As explained on php.net:

FILTER_SANITIZE_URL

Remove all characters except letters, digits and $-_.+!*'(),{}|\^~[]`<>#%";/?:@&=.

However, the browser sends the REQUEST_URI value URL-encoded, so every character filter_input() sees is already on the allowed list and nothing gets stripped. Say the address is

http://www.example.com/abc/index.php?q=abc��123

then the "sanitized" request URI is

/abc/index.php?q=abc%EF%BF%BD%EF%BF%BD123

But it should be

/abc/index.php?q=abc123

It is possible to urldecode() $_SERVER["REQUEST_URI"] first and then pass the result to filter_var() to get a sanitized value:

$_request_uri = filter_var(urldecode($_SERVER['REQUEST_URI']), FILTER_SANITIZE_URL);
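To make the difference concrete, here is a minimal sketch that runs both variants on the example URI from above as a plain string, so it works from the command line without a real request. The variable names are only illustrative. Note that urldecode() also turns "+" into a space (which the filter then removes), and that decoding the whole URI turns encoded delimiters such as %26 into literal ones, so this is a demonstration of the behaviour rather than a recommendation.

```php
<?php
// The raw value as the server receives it: still percent-encoded.
$raw = '/abc/index.php?q=abc%EF%BF%BD%EF%BF%BD123';

// Every character here (letters, digits, / . ? = %) is on the allow list,
// so FILTER_SANITIZE_URL has nothing to remove.
$filteredRaw = filter_var($raw, FILTER_SANITIZE_URL);

// After decoding, the bytes 0xEF 0xBF 0xBD are exposed and the filter strips them.
$filteredDecoded = filter_var(urldecode($raw), FILTER_SANITIZE_URL);

var_dump($filteredRaw === $raw);   // bool(true): the encoded string passes through unchanged
echo $filteredDecoded, PHP_EOL;    // /abc/index.php?q=abc123
```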

I can't say exactly why, but the last one seems "inelegant" to me, and I'm looking for an elegant way to sanitize $_SERVER["REQUEST_URI"].

Maybe it is accessing a superglobal array directly ($_SERVER['REQUEST_URI']) that bothers me and makes it feel "inelegant".

Is there an elegant way?

Answer

Nalin Singapuri · Nov 20, 2014

I think you could use either mod_rewrite or Apache's SetEnv directive to decode the URL on the server side. This would change the REQUEST_URI in Apache and consequently the value of $_SERVER["REQUEST_URI"] in PHP.

I don't like this solution, and you likely don't want to do this. The issues I see:

  • it does not allow for multiple GET parameters, which may have different validation rules.
  • it allows for arbitrary parameters.
  • it requires permissions which a user may not have and changes default server behavior.
  • mod_rewrite is seldom a good solution.

A good solution which avoids the global is to call filter_input or filter_input_array on INPUT_GET (instead of INPUT_SERVER).

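// filter_input_array() returns null when there are no GET parameters,
// so fall back to an empty array before building the query string.
$urlParameters = http_build_query(
    filter_input_array(
        INPUT_GET,
        FILTER_SANITIZE_URL
    ) ?: array()
);

// SCRIPT_URL is typically set by Apache's mod_rewrite and may be missing on other setups.
$_request_uri = filter_input(INPUT_SERVER, 'SCRIPT_URL', FILTER_SANITIZE_URL)
    . ($urlParameters ? "?{$urlParameters}" : "");
print_r($_request_uri);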

A better solution would be to whitelist specific parameters, apply a specific validation rule to each, and use these parameters directly (which avoids building and parsing $_request_uri altogether):

$_request_parameters = filter_input_array(
    INPUT_GET,
    array(
        'q' => FILTER_SANITIZE_URL,
    )
);

print_r($_request_parameters['q']);
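As a rough sketch of what "specific rules" per parameter could look like, the example below uses filter_var_array() on a hand-built array, so it can be run without a real request; with a live request the same definition array could be passed to filter_input_array(INPUT_GET, ...). The 'page' and 'debug' parameters and the sample values are assumptions made up for illustration.

```php
<?php
// Stand-in for already-decoded GET data; 'page' and 'debug' are hypothetical.
$input = array(
    'q'     => "abc\xEF\xBF\xBD\xEF\xBF\xBD123",
    'page'  => '2',
    'debug' => '1', // not whitelisted below, so it is dropped from the result
);

// One rule per whitelisted parameter.
$definition = array(
    'q'    => FILTER_SANITIZE_URL,
    'page' => array(
        'filter'  => FILTER_VALIDATE_INT,
        'options' => array('min_range' => 1),
    ),
);

$params = filter_var_array($input, $definition);

var_dump($params['q']);              // string(6) "abc123"
var_dump($params['page']);           // int(2)
var_dump(isset($params['debug']));   // bool(false)
```

A parameter that fails validation comes back as false, and a whitelisted parameter that is absent from the input comes back as null, so the two cases can be told apart before the values are used.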