Guidelines

Injection - Path Traversal

Path traversal is another common type of injection vulnerability. It tends to happen when the construction of a URI (be it a URL, file path, or otherwise) doesn’t properly ensure that the fully resolved path stays inside the root of the intended path.

It's worth calling out that path traversal can also be seen as a path *injection* vulnerability, since attacker-controlled input ends up inside a path.

The impact of a path traversal vulnerability depends heavily on the context in which the traversal occurs and on the overall hardening that’s been done. But before we get into that, let’s go through a quick practical example of this vulnerability to see what we’re talking about.

A quick breakdown

Consider an endpoint in your application that serves up documents, like templates for contracts or job offers. These could all be files, like PDFs, that are static in your application. 

In this situation, you might have a piece of code like this to fetch the files on request:

let baseFolder = "/var/www/api/documents/"; 
let path = baseFolder + request.params.filename;

return file.read(path);

To demonstrate how the vulnerability plays out, we also have to know where the root of our application is. For this example, assume the application's root is at ‘/var/www/api/’.

We know the application takes a ‘filename’ parameter. Let's look at a few example inputs and the paths they result in:

Filename | Unresolved path | Resolved path
Privacy.pdf | /var/www/api/documents/Privacy.pdf | /var/www/api/documents/Privacy.pdf
../config/prod.config | /var/www/api/documents/../config/prod.config | /var/www/api/config/prod.config
../../../../etc/shadow | /var/www/api/documents/../../../../etc/shadow | /etc/shadow

Notice how we’re able to traverse the file system using ‘../’. We're able to move out of the ‘documents’ folder where the PDFs usually live and into the ‘/etc/’ folder containing the ‘shadow’ file, which on Linux, contains password hashes. As you can imagine, that’s really not ideal. 
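
The resolution in the table above can be reproduced with a short Python sketch. It assumes POSIX-style paths and uses `posixpath.normpath` only to collapse the ‘../’ segments as text, without touching the file system. The `resolve` helper is a name made up for this illustration.

```python
import posixpath

# Same base folder as the example endpoint above
BASE_FOLDER = "/var/www/api/documents/"

def resolve(filename: str) -> str:
    """Concatenate like the vulnerable code does, then collapse '..' segments."""
    unresolved = BASE_FOLDER + filename
    return posixpath.normpath(unresolved)

for name in ["Privacy.pdf", "../config/prod.config", "../../../../etc/shadow"]:
    print(f"{name} -> {resolve(name)}")
```

Each ‘../’ cancels one folder to its left, which is why four of them are enough to climb from ‘documents’ all the way to the file system root.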

Looking at Traversal in URLs

Another variant of path traversal can occur when constructing URLs that are intended to interact with an API. Assume we have an API with the following methods:

URL pattern | Description
/api/v1/order/get/{id} | Gets details about the order with the specified ID
/api/v1/order/delete/{id} | Deletes the order with the specified ID

The API is interacted with by another application that might call it, say, when trying to get information about an order:

let apiBase = "https://my.api/api/v1"; 
let orderApi = apiBase + "/order/get/";

let apiUrl = orderApi + request.params.orderId;

let response = http.get(apiUrl);

What happens now depends on the order ID provided by the user. Below is the effective URL invoked for each input.

The canonicalization is usually not done on the client side (though it can be), but web servers will canonicalize the request into the format seen below.

Order ID | Actual URL invoked
1 | /api/v1/order/get/1
1/../../delete/1 | /api/v1/order/delete/1

With the second input, rather than fetching the order with ID ‘1’, we’ve actually invoked the delete method instead, which of course results in deleting the order.
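
The URL case can be sketched the same way. This Python snippet assumes the receiving server collapses ‘../’ segments roughly the way `posixpath.normpath` does. Real web servers and frameworks may differ in the details, so treat it as an approximation rather than a description of any particular server. The `effective_path` helper is hypothetical.

```python
import posixpath

# Path portion of the order API used in the example above
API_PREFIX = "/api/v1/order/get/"

def effective_path(order_id: str) -> str:
    """Append the raw order ID, then normalize the way a server might."""
    return posixpath.normpath(API_PREFIX + order_id)

print(effective_path("1"))                 # /api/v1/order/get/1
print(effective_path("1/../../delete/1"))  # /api/v1/order/delete/1
```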

Mitigations

When discussing path traversal, there are both direct mitigations and indirect, defense-in-depth techniques that can, and should, be applied as often as possible. First, let’s look at how to handle paths.

Direct mitigation

When it comes to handling a path, we have to understand the process of path resolution, or path canonicalization, and its importance. 

When you have a path like ‘/var/www/api/documents/../../../../etc/shadow’, it’s in non-canonical form. If you request this path from your file system, the file system will canonicalize it to ‘/etc/shadow’. It's critical that you don’t try to open non-canonical paths. Rather, you should canonicalize the path first, verify that it points only to the intended file or folder, and then read it.

let baseFolder = "/var/www/api/documents/";
let requestedPath = baseFolder + request.params.filename;

// Canonicalize first, then check the result against the base folder
let resolvedPath = path.resolve(requestedPath);

if (!resolvedPath.startsWith(baseFolder))
    return "Tried to read outside of base folder";
else
    return file.read(resolvedPath);
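
As a rough Python sketch of the same canonicalize-then-verify idea, the helper below returns the canonical path or raises an error. It assumes POSIX-style paths, and `safe_path` is a hypothetical name. Note that `posixpath.normpath` works on the text of the path only. If symbolic links are a concern, `os.path.realpath` is one option for resolving them before the check.

```python
import posixpath

BASE_FOLDER = "/var/www/api/documents/"

def safe_path(filename: str) -> str:
    """Return the canonical path, or raise if it leaves BASE_FOLDER."""
    resolved = posixpath.normpath(BASE_FOLDER + filename)
    # BASE_FOLDER ends with '/', so a sibling folder such as
    # /var/www/api/documents-old cannot pass the prefix check.
    if not resolved.startswith(BASE_FOLDER):
        raise ValueError("Tried to read outside of base folder")
    return resolved

print(safe_path("Privacy.pdf"))         # /var/www/api/documents/Privacy.pdf
print(safe_path("sub/../Privacy.pdf"))  # /var/www/api/documents/Privacy.pdf
```

Keeping the trailing slash on the base folder matters here: without it, a plain prefix comparison would also accept any sibling folder whose name merely starts with ‘documents’.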

Anti-pattern - Trying to sanitize filenames

It may be tempting to do something like this:


let baseFolder = "/var/www/api/documents/"; 
let path = baseFolder + request.params.filename.replace("../", "");
...

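However, this approach should not be used. A single replacement pass is easy to bypass: an input such as ‘....//’ turns back into ‘../’ once the inner ‘../’ has been removed. The key in handling paths is to always look at the canonical path.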

As long as the canonical path isn’t breaking any rules, how the path is ultimately constructed doesn’t really make any difference. Trying to sanitize a path like this is very error-prone and rarely secure, if ever.
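
To see why this kind of sanitizing is fragile, here is a small Python sketch of the anti-pattern. `naive_sanitize` is a made-up helper that mirrors the `replace("../", "")` idea, and the second input is only one of several possible bypasses.

```python
def naive_sanitize(filename: str) -> str:
    """Anti-pattern: remove every literal '../' in a single pass."""
    return filename.replace("../", "")

# The obvious attack is neutralized...
print(naive_sanitize("../../etc/shadow"))        # etc/shadow

# ...but removing the inner '../' of '....//' leaves a fresh '../' behind.
print(naive_sanitize("....//....//etc/shadow"))  # ../../etc/shadow
```

The sanitized output of the second call is itself a traversal sequence, which is exactly what a check on the canonical path would have caught.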

Limit access

In our previous examples, we read the ‘/etc/shadow’ file, which holds password hashes on Linux. But there's really no reason an application should be able to read that file, or any other file outside of its root.

If you employ containers, you’re likely already mitigating a lot of risks. Still, taking steps to harden the container (not running as root, for example) is vital. Dropping all privileges from your web process and limiting its read permissions on the file system to only those files it strictly needs are both strongly recommended.

Examples

Now we’ll share a few examples in various languages to show these techniques in action.

C# - Insecure

Code that neither resolves the full path nor uses only the file name part of a path is vulnerable to path traversal.

var baseFolder = "/var/www/app/documents/";
var fileName = "../../../../../etc/passwd";

// INSECURE: Reads /etc/passwd
var fileContents = File.ReadAllText(Path.Combine(baseFolder, fileName));

C# - Secure - Canonical

In this example, we protect against path traversal by resolving the full (absolute) path and ensuring that the resolved file path is within our base folder.

var baseFolder = "/var/www/app/documents/";
var fileName = "../../../../../etc/passwd";

var canonicalPath = Path.GetFullPath(Path.Combine(baseFolder, fileName));

// SECURE: Rejects any attempt at reading outside of specified base.
if(!canonicalPath.StartsWith(baseFolder, StringComparison.Ordinal))
    return "Trying to read file outside of base folder";

var fileContents = File.ReadAllText(canonicalPath);

C# - Secure - Filename

In this example, we protect against Path Traversal by taking only the file name part of the path, ensuring that it's impossible to traverse out of the folder specified. 

var baseFolder = "/var/www/app/documents/";

// Only use this if you don't allow navigating into other subfolders
var fileName = Path.GetFileName("../../../../../etc/passwd");

// SECURE: Reads /var/www/app/documents/passwd
var fileContents = File.ReadAllText(Path.Combine(baseFolder, fileName));

Java - Insecure

Code that neither resolves the full path nor uses only the file name part of a path is vulnerable to path traversal.

String baseFolder = "/var/www/app/documents/";
String fileName = "../../../../../etc/passwd";

// INSECURE: Reads /etc/passwd
Path filePath = Paths.get(baseFolder + fileName);
List<String> lines = Files.readAllLines(filePath);

Java - Secure - Canonical

In this example, we protect against path traversal by resolving the full (absolute) path and ensuring that the resolved file path is within our base folder.

String baseFolder = "/var/www/app/documents/";
String fileName = "../../../../../etc/passwd";

// SECURE: Rejects any attempt at reading outside of the specified base folder
Path normalizedPath = Paths.get(baseFolder + fileName).normalize();
if(!normalizedPath.toString().startsWith(baseFolder))
{
    return "Trying to read path outside of root";
}
else
{
    List<String> lines = Files.readAllLines(normalizedPath);
}

Java - Secure - Filename

In this example, we protect against Path Traversal by taking only the file name part of the path, ensuring that it's impossible to traverse out of the folder specified. 

String baseFolder = "/var/www/app/documents/";

// Only use this if you don't allow navigating into other subfolders
String fileName = Paths.get("../../../../../etc/passwd").getFileName().toString();

// SECURE: Reads /var/www/app/documents/passwd
Path filePath = Paths.get(baseFolder + fileName);
List<String> lines = Files.readAllLines(filePath);

JavaScript - Insecure

Code that neither resolves the full path nor uses only the file name part of a path is vulnerable to path traversal.

const fs = require('fs');

const baseFolder = "/var/www/app/documents/";
const fileName = "../../../../../etc/passwd";

// INSECURE: Reads /etc/passwd
const data = fs.readFileSync(baseFolder + fileName, 'utf8');

JavaScript - Secure - Canonical

In this example, we protect against path traversal by resolving the full (absolute) path and ensuring that the resolved file path is within our base folder.

const fs = require("fs");
const path = require("path");

const baseFolder = "/var/www/app/documents/";
const fileName = "../../../../../etc/passwd";

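const normalizedPath = path.normalize(path.join(baseFolder, fileName));

// SECURE: Rejects any attempt to read files outside of the specified base folder
if (!normalizedPath.startsWith(baseFolder)) {
    throw new Error("Trying to read file outside of base folder");
}

const data = fs.readFileSync(normalizedPath, 'utf8');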

JavaScript - Secure - Filename

In this example, we protect against Path Traversal by taking only the file name part of the path, ensuring that it's impossible to traverse out of the folder specified. 

const fs = require("fs");
const path = require("path");

const baseFolder = "/var/www/app/documents/";
const fileName = path.basename("../../../../../etc/passwd");

// SECURE: Reads /var/www/app/documents/passwd
const data = fs.readFileSync(path.join(baseFolder, fileName), 'utf8');

Python - Insecure

Code that neither resolves the full path nor uses only the file name part of a path is vulnerable to path traversal.

baseFolder = "/var/www/app/documents/"
fileName = "../../../../../etc/passwd"

# INSECURE: Reads /etc/passwd
fileContents = open(baseFolder + fileName).read()

Python - Secure - Canonical

In this example, we protect against path traversal by resolving the full (absolute) path and ensuring that the resolved file path is within our base folder.

import os.path

baseFolder = "/var/www/app/documents/"
fileName = "../../../../../etc/passwd"

normalizedPath = os.path.normpath(baseFolder + fileName)

# SECURE: Rejects any attempt to read files outside of the specified base folder
if not normalizedPath.startswith(baseFolder):
    raise ValueError("Trying to read out of base folder")

# Only reached when the path is inside the base folder
fileContents = open(normalizedPath).read()

Python - Secure - Filename

In this example, we protect against Path Traversal by taking only the file name part of the path, ensuring that it's impossible to traverse out of the folder specified. 

import os.path

baseFolder = "/var/www/app/documents/"
fileName = os.path.basename("../../../../../etc/passwd")

# SECURE: Reads /var/www/app/documents/passwd
fileContents = open(os.path.join(baseFolder, fileName)).read()
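
As a small extension of the file-name approach, the sketch below wraps it in a hypothetical `filename_only_path` helper and tries a few inputs. It assumes POSIX-style paths (hence `posixpath`). The extra check for empty, ‘.’ and ‘..’ results is an addition of this sketch, since `basename` returns ‘..’ unchanged when that is the entire input.

```python
import posixpath

BASE_FOLDER = "/var/www/app/documents/"

def filename_only_path(user_input: str) -> str:
    """Keep only the last path component and join it onto BASE_FOLDER."""
    name = posixpath.basename(user_input)
    # basename("..") is "..", and basename("docs/") is "", so reject both
    if name in ("", ".", ".."):
        raise ValueError("Not a usable file name")
    return posixpath.join(BASE_FOLDER, name)

print(filename_only_path("../../../../../etc/passwd"))  # /var/www/app/documents/passwd
print(filename_only_path("reports/2023/q1.pdf"))        # /var/www/app/documents/q1.pdf
```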