Filtering upwards path traversal in Java (or Scala)

matanster · Oct 12, 2015

Are there any standard library methods that can filter out paths which include special traversal sequences, such as ../ and all other convoluted forms of upwards directory traversal, to safeguard a file path API input from traversing upwards of a given "root" path?

I have a class that holds a root folder as a member, and a member function that accepts paths to delete recursively. My goal is to make this API safe by filtering out any input path that would resolve to a location above the root folder. The class would then be used liberally to delete files under the root path, but it would never touch anything above it.

This is similar to the broader path traversal attack.

Methods that are too restrictive (i.e. that may reject some legitimate paths) are fine for my specific use case if this simplifies things. Also, my current needs are for file system paths, not web ones (although a web module built for the same purpose might, in theory, work here).

Answer

Wyzard · Oct 12, 2015

You can use Path.normalize() to strip out ".." elements (and their preceding elements) from a path; for example, it turns "a/b/../c" into "a/c". Note that it won't strip out a ".." at the beginning of a path, since there's no preceding directory component for it to remove as well. So if you're going to prepend another path, do that first, then normalize the result.
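To make the normalize() behaviour concrete, here is a small sketch. The class and method names (NormalizeDemo, normalize, resolveThenNormalize) are made up for illustration, and the outputs in the comments assume a Unix-like file system where "/" is the separator.

```java
import java.nio.file.Paths;

// Hypothetical demo class; the names are illustrative only.
final class NormalizeDemo {
  // Parses a path string and returns its normalized form as a string.
  static String normalize(final String raw) {
    return Paths.get(raw).normalize().toString();
  }

  // Joins the user path onto the base first, then normalizes the result,
  // so that ".." elements in the user path can consume parts of the base.
  static String resolveThenNormalize(final String base, final String user) {
    return Paths.get(base).resolve(user).normalize().toString();
  }

  public static void main(final String[] args) {
    System.out.println(normalize("a/b/../c"));  // a/c
    System.out.println(normalize("../a/b"));    // ../a/b (leading ".." is kept)
    System.out.println(resolveThenNormalize("/foo/bar/baz", "../attack"));  // /foo/bar/attack
  }
}
```

The last call shows why the order matters: normalizing "../attack" on its own would leave the ".." in place, while normalizing after the join reveals where the path really points.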

You can also use Path.startsWith(Path) to check whether one path is a descendant of another. And Path.isAbsolute() tells you, unsurprisingly, whether a path is absolute or relative.
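A quick sketch of those two checks follows. The names PrefixCheckDemo and isUnder are hypothetical, and the example paths assume a Unix-like file system. Note that Path.startsWith compares whole name elements rather than raw string prefixes, and that it does not normalize anything for you.

```java
import java.nio.file.Paths;

// Hypothetical demo class; the names are illustrative only.
final class PrefixCheckDemo {
  // True if candidate begins with the same root and name elements as base.
  static boolean isUnder(final String base, final String candidate) {
    return Paths.get(candidate).startsWith(Paths.get(base));
  }

  static boolean isAbsolute(final String raw) {
    return Paths.get(raw).isAbsolute();
  }

  public static void main(final String[] args) {
    System.out.println(isUnder("/foo/bar", "/foo/bar/baz"));   // true
    System.out.println(isUnder("/foo/bar", "/foo/barbaz"));    // false: "barbaz" is a different element
    System.out.println(isUnder("/foo/bar", "/foo/bar/../x"));  // true: ".." is not interpreted here
    System.out.println(isAbsolute("/etc/passwd"));             // true
    System.out.println(isAbsolute("etc/passwd"));              // false
  }
}
```

The third case is the reason to call normalize() before startsWith: an unnormalized path can pass the prefix check while actually pointing outside the base.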

Here's how I'd process the untrusted paths coming into the API:

import java.nio.file.Path;

/**
 * Resolves an untrusted user-specified path against the API's base directory.
 * Paths that try to escape the base directory are rejected.
 *
 * @param baseDirPath  the absolute path of the base directory that all
 *                     user-specified paths should be within
 * @param userPath  the untrusted path provided by the API user, expected to be
 *                  relative to {@code baseDirPath}
 * @return the normalized absolute path, lexically within the base directory
 * @throws IllegalArgumentException if the base path is not absolute, the user
 *                                  path is not relative, or the user path
 *                                  escapes the base directory
 */
public Path resolvePath(final Path baseDirPath, final Path userPath) {
  if (!baseDirPath.isAbsolute()) {
    throw new IllegalArgumentException("Base path must be absolute");
  }

  if (userPath.isAbsolute()) {
    throw new IllegalArgumentException("User path must be relative");
  }

  // Normalize the base too, so that a base containing "." or ".." elements
  // still compares correctly in the startsWith check below.
  final Path normalizedBase = baseDirPath.normalize();

  // Join the two paths together, then normalize so that any ".." elements
  // in the userPath can remove parts of the base path.
  // (e.g. "/foo/bar/baz" + "../attack" -> "/foo/bar/attack")
  final Path resolvedPath = normalizedBase.resolve(userPath).normalize();

  // Make sure the resulting path is still within the required directory.
  // (In the example above, "/foo/bar/attack" is not.)
  if (!resolvedPath.startsWith(normalizedBase)) {
    throw new IllegalArgumentException("User path escapes the base path");
  }

  return resolvedPath;
}
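The question describes a class that holds a root folder and deletes paths under it recursively. Below is a minimal sketch of how the same resolve, normalize, then startsWith checks might fit into such a class. RootedDeleter and its method names are hypothetical (they come from neither the question nor the answer), and the sketch makes no attempt to resolve symbolic links.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class; a sketch under the assumptions stated above.
final class RootedDeleter {
  private final Path root;

  RootedDeleter(final Path root) {
    if (!root.isAbsolute()) {
      throw new IllegalArgumentException("Root must be absolute");
    }
    this.root = root.normalize();
  }

  // Applies the same lexical checks as resolvePath in the answer.
  Path resolve(final String userPath) {
    final Path candidate = Paths.get(userPath);
    if (candidate.isAbsolute()) {
      throw new IllegalArgumentException("User path must be relative");
    }
    final Path resolved = root.resolve(candidate).normalize();
    if (!resolved.startsWith(root)) {
      throw new IllegalArgumentException("User path escapes the root");
    }
    return resolved;
  }

  // Deletes the target and everything beneath it; does nothing if it is absent.
  void deleteRecursively(final String userPath) throws IOException {
    final Path target = resolve(userPath);
    if (!Files.exists(target, LinkOption.NOFOLLOW_LINKS)) {
      return;
    }
    final List<Path> deepestFirst;
    try (Stream<Path> walk = Files.walk(target)) {
      // Reverse order puts every entry before its parent directory.
      deepestFirst = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (final Path p : deepestFirst) {
      Files.delete(p);
    }
  }
}
```

With a root of "/srv/data", deleteRecursively("cache/old") would remove that subtree, while deleteRecursively("../other") would throw before touching the disk. One caveat: the check is purely lexical, so if a directory inside the root is itself a symbolic link to somewhere else, a path that goes through it still passes. Comparing against Path.toRealPath() is one way to tighten that if it matters for your case.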