raTM

Reputation: 399

Can't seem to get ESAPI Validator getValidInput() Working for URL Parameters

I am trying to use ESAPI Encoder to identify and canonicalize URL-encoded query parameters. It sort of works, but not in the way the API seems to indicate. Here is my class, and below is the output it generates:

CODE

package test.test;

import org.owasp.esapi.ESAPI;
import org.owasp.esapi.Validator;
import org.owasp.esapi.errors.EncodingException;
import org.owasp.esapi.errors.IntrusionException;
import org.owasp.esapi.errors.ValidationException;

public class ESAPITester {

    public static void main(String[] args) throws ValidationException,
    IntrusionException, EncodingException {

        String searchString = "-/+=_ !$*?@";
        String singleEncoded = ESAPI.encoder().encodeForURL(searchString);
        String doubleEncoded = ESAPI.encoder().encodeForURL(singleEncoded);
        Validator validator = ESAPI.validator();
        System.out.println("Searched        : " + searchString);
        System.out.println("Single encoded  : " + singleEncoded);
        System.out.println("Double encoded  : " + doubleEncoded);
        System.out.println("Decode from URL : " + ESAPI.encoder().decodeFromURL(singleEncoded));
        System.out.println("Canonicalized   : " + ESAPI.encoder().canonicalize(singleEncoded));
        System.out.println("Valid input     : " + validator.getValidInput("http", 
                searchString, "HTTPParameterValue", 100, true, true));
        System.out.println("Valid from Encoded : " + validator.getValidInput("http", 
                singleEncoded, "HTTPParameterValue", 100, true, true));

    }
}

OUTPUT

Searched        : -/+=_ !$*?@
Single encoded  : -%2F%2B%3D_+%21%24*%3F%40
Double encoded  : -%252F%252B%253D_%2B%2521%2524*%253F%2540
Decode from URL : -/ =_ !$*?@
Canonicalized   : -/+=_+!$*?@
Valid input     : -/+=_ !$*?@
log4j:WARN No appenders could be found for logger (IntrusionDetector).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" org.owasp.esapi.errors.ValidationException: http: Invalid input. Please conform to regex ^[\p{L}\p{N}.\-/+=_ !$*?@]{0,1000}$ with a maximum length of 100
    at org.owasp.esapi.reference.validation.StringValidationRule.checkWhitelist(StringValidationRule.java:144)
    at org.owasp.esapi.reference.validation.StringValidationRule.checkWhitelist(StringValidationRule.java:160)
    at org.owasp.esapi.reference.validation.StringValidationRule.getValid(StringValidationRule.java:284)
    at org.owasp.esapi.reference.DefaultValidator.getValidInput(DefaultValidator.java:214)
    at test.test.ESAPITester.main(ESAPITester.java:25)

My question is: why does getValidInput() not canonicalize the URL-encoded input parameter? The canonicalize() method handles it, but getValidInput() with the final argument ('canonicalize') set to true does not.

Upvotes: 0

Views: 11632

Answers (1)

avgvstvs

Reputation: 6325

So the question becomes:

Why does the second validator.getValidInput() call throw an exception, when all it is expected to do is canonicalize the input and validate that it matches the expected value? In other words, the direct call to canonicalize() works, but the call to getValidInput() fails.

Something is very wrong here. In the version of HTTPParameterValue that you get from the OWASP source repo, the regex is ^[a-zA-Z0-9.\\-\\/+=@_ ]*$. Someone has modified your HTTPParameterValue to look more like SafeString: ^[\\s\\p{L}\\p{N}.]{0,1024}$

See line 440.

This is wrong. The default ESAPI values shouldn't be changed. If you need custom rules, write a brand new validation.properties entry using the established pattern.
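
To see what each pattern mentioned so far admits, here is a minimal sketch using plain java.util.regex. It assumes the regexes are exactly as quoted in this thread (with the properties-file escaping carried over into Java string literals); the class and method names are made up for illustration and are not part of ESAPI.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of ESAPI: compares the whitelists quoted in this thread.
final class WhitelistComparison {

    // Regex the answer attributes to the OWASP source repo.
    static final Pattern REPO_DEFAULT =
            Pattern.compile("^[a-zA-Z0-9.\\-\\/+=@_ ]*$");

    // SafeString-like regex quoted in the answer.
    static final Pattern SAFE_STRING_LIKE =
            Pattern.compile("^[\\s\\p{L}\\p{N}.]{0,1024}$");

    // Regex reported in the question's exception message.
    static final Pattern OBSERVED =
            Pattern.compile("^[\\p{L}\\p{N}.\\-/+=_ !$*?@]{0,1000}$");

    // True when the whole input consists of whitelisted characters.
    static boolean accepts(Pattern whitelist, String input) {
        return whitelist.matcher(input).matches();
    }

    public static void main(String[] args) {
        String search = "-/+=_ !$*?@"; // search string from the question
        System.out.println("repo default : " + accepts(REPO_DEFAULT, search));
        System.out.println("SafeString   : " + accepts(SAFE_STRING_LIKE, search));
        System.out.println("observed     : " + accepts(OBSERVED, search));
    }
}
```

Only the pattern from the exception message accepts the raw search string; the other two reject it because they do not list characters such as ! and *.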

Your test will still fail, however, because the string decodes to -/+=_ !$*?@ and ? is a reserved character within HTTP queries.

From an earlier spec (RFC 2396):

3.4. Query Component

The query component is a string of information to be interpreted by the resource.

  query         = *uric

Within a query component, the characters ";", "/", "?", ":", "@",
"&", "=", "+", ",", and "$" are reserved.

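As a rough check of which characters in the question's search string fall into that reserved set, here is a small sketch. The reserved list is copied from the passage quoted above; the class and method names are hypothetical.

```java
// Hypothetical helper: picks out the characters of a string that the quoted
// passage lists as reserved within a query component.
final class QueryReservedChars {

    // Reserved set exactly as listed in the quoted passage.
    static final String RESERVED = ";/?:@&=+,$";

    // Returns the reserved characters found in the input, in order of appearance.
    static String reservedIn(String input) {
        StringBuilder found = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (RESERVED.indexOf(c) >= 0) {
                found.append(c);
            }
        }
        return found.toString();
    }

    public static void main(String[] args) {
        // Search string from the question.
        System.out.println(reservedIn("-/+=_ !$*?@")); // prints /+=$?@
    }
}
```

Six of the eleven characters in the search string are in the reserved set, including the ? mentioned above.
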
As to why the input fails against the regex you're actually running, ^[\\p{L}\\p{N}.\\-/+=_ !$*?@]{0,1000}$, read the code. The affected method is at line 266.

Here's what you want to look at:

public String getValid( String context, String input ) throws ValidationException
    {
        String data = null;

        // checks on input itself

        // check for empty/null
        if(checkEmpty(context, input) == null)
            return null;

        if (validateInputAndCanonical)
        {
            //first validate pre-canonicalized data

            // check length
            checkLength(context, input);

            // check whitelist patterns
            checkWhitelist(context, input);

            // check blacklist patterns
            checkBlacklist(context, input);

            // canonicalize
            data = encoder.canonicalize( input );

        } else {

            //skip canonicalization
            data = input;           
        }

        // check for empty/null
        if(checkEmpty(context, data, input) == null)
            return null;

        // check length
        checkLength(context, data, input);

        // check whitelist patterns
        checkWhitelist(context, data, input);

        // check blacklist patterns
        checkBlacklist(context, data, input);

        // validation passed
        return data;
    }

The whitelist regex is checked against the raw input before any canonicalization is attempted, so the % characters in your URL-encoded string are rejected first.
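
The effect of that ordering can be reproduced without ESAPI. The sketch below is an approximation under stated assumptions: java.net.URLEncoder and URLDecoder stand in for encodeForURL() and canonicalize(), the whitelist is the one from the exception message, and all class and method names are invented for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.regex.Pattern;

// Hypothetical sketch, not ESAPI code: shows why checking the whitelist
// before decoding rejects a URL-encoded value.
final class ValidationOrderSketch {

    // Whitelist copied from the exception message in the question.
    static final Pattern WHITELIST =
            Pattern.compile("^[\\p{L}\\p{N}.\\-/+=_ !$*?@]{0,1000}$");

    // Stand-in for encodeForURL(): form-style URL encoding.
    static String encode(String input) {
        try {
            return URLEncoder.encode(input, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Stand-in for canonicalize(): a single pass of URL decoding.
    static String decode(String input) {
        try {
            return URLDecoder.decode(input, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Same order as the quoted getValid(): whitelist the raw input first.
    static boolean passesRawCheckFirst(String input) {
        if (!WHITELIST.matcher(input).matches()) {
            return false; // rejected before any decoding happens
        }
        return WHITELIST.matcher(decode(input)).matches();
    }

    // Alternative order: decode first, then whitelist the result.
    static boolean passesDecodeFirst(String input) {
        return WHITELIST.matcher(decode(input)).matches();
    }

    public static void main(String[] args) {
        String encoded = encode("-/+=_ !$*?@");
        System.out.println(encoded);                      // -%2F%2B%3D_+%21%24*%3F%40
        System.out.println(passesRawCheckFirst(encoded)); // false
        System.out.println(passesDecodeFirst(encoded));   // true
    }
}
```

The encoded value fails the raw check because % is not in the whitelist, even though the decoded value would pass.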

Upvotes: 2
