Zasha

Reputation: 67

URL-encoded Query String according to Amazon AWS specs

I'm currently implementing an application to generate Amazon AWS Signature Version 4 signatures (for details on the signing process, refer to this page: http://docs.aws.amazon.com/general/latest/gr/sigv4-create-canonical-request.html). Conveniently, Amazon also supplies a test suite for those signatures. There is one test case, however, that I can't really figure out. Note that my question only refers to the first step of the signing process (generating the canonical request), and specifically to the creation of the canonical query string.

The test case input HTTP request looks like this:

POST /?@#$%^&+=/,?><`";:\|][{} =@#$%^&+=/,?><`";:\|][{}  http/1.1
Date:Mon, 09 Sep 2011 23:36:00 GMT
Host:host.foo.com

And this is the expected result for the canonical request:

POST
/
%20=%2F%2C%3F%3E%3C%60%22%3B%3A%5C%7C%5D%5B%7B%7D&%40%23%24%25%5E=
date:Mon, 09 Sep 2011 23:36:00 GMT
host:host.foo.com

date;host
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

The third line in the canonical request denotes the url-encoded query string. But I don't really get how they get there from the input, even following the rules stated in the Sig V4 documentation:

To construct the canonical query string, complete the following steps:

URI-encode each parameter name and value according to the following rules:

Do not URL-encode any of the unreserved characters that RFC 3986 defines:

A-Z, a-z, 0-9, hyphen ( - ), underscore ( _ ), period ( . ), and tilde ( ~ ). Percent-encode all other characters with %XY, where X and Y are hexadecimal characters (0-9 and uppercase A-F).

For example, the space character must be encoded as %20 (not using '+', as some encoding schemes do) and extended UTF-8 characters must be in the form %XY%ZA%BC.

Sort the encoded parameter names by character code (that is, in strict ASCII order). For example, a parameter name that begins with the uppercase letter F (ASCII code 70) precedes a parameter name that begins with a lowercase letter b (ASCII code 98).

Build the canonical query string by starting with the first parameter name in the sorted list.

For each parameter, append the URI-encoded parameter name, followed by the character '=' (ASCII code 61), followed by the URI-encoded parameter value. Use an empty string for parameters that have no value.

Append the character '&' (ASCII code 38) after each parameter value except for the last value in the list.
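
Read literally, the quoted rules amount to roughly the following Python sketch. The names uri_encode and canonical_query_string are made up for illustration and are not part of any AWS SDK, and the sketch assumes the parameters have already been split into decoded (name, value) pairs, which is the step the documentation does not spell out:

```python
from urllib.parse import quote


def uri_encode(text: str) -> str:
    # Keep only the RFC 3986 unreserved characters as-is. quote() emits
    # uppercase hex digits and encodes non-ASCII text as UTF-8 bytes.
    return quote(text, safe="-_.~")


def canonical_query_string(params) -> str:
    """params: iterable of already-decoded (name, value) pairs."""
    # A missing value (None) is treated as an empty string.
    encoded = [(uri_encode(name), uri_encode(value or "")) for name, value in params]
    # Sort by encoded name in plain code-point order (ties broken by value).
    encoded.sort()
    return "&".join(f"{name}={value}" for name, value in encoded)


# The two pairs that the expected canonical request appears to contain.
print(canonical_query_string([("@#$%^", ""), (" ", '/,?><`";:\\|][{}')]))
```

Fed the pairs ("@#$%^", "") and (" ", /,?><`";:\|][{}), this prints the expected third line of the canonical request, so the open question is only how the raw request line is split into those two pairs.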

Can someone explain? Thanks a bunch in advance!

Upvotes: 3

Views: 3220

Answers (1)

lsowen

Reputation: 3828

I think I might have figured it out. Decomposing the query string @#$%^&+=/,?><`";:\|][{} =@#$%^&+=/,?><`";:\|][{} into key/value pairs (in order of appearance):

  1. Key @#$%^, Value None
  2. Key +, Value /,?><`";:\|][{}
  3. Key @#$%^, Value None
  4. Key +, Value /,?><`";:\|][{}

The + key decodes to a space, which is why it shows up as %20 in the expected output.

Based on https://stackoverflow.com/a/1746566/3108853, there isn't a standard on what to do with duplicate keys, so it looks like Amazon simply chose to either ignore or overwrite (impossible to tell with this test case, because the values of the duplicate keys are the same).

Lastly, to account for the = before the second occurrence of @#$%^, I believe it is parsed as a key/value pair with an empty key, so it is dropped altogether.
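
To check that this reading at least reproduces the expected output, here is a minimal Python sketch. It is only a guess at the behaviour, not Amazon's actual parser: parse_pairs and canonicalize are hypothetical names, and the sketch assumes that both & and whitespace end a pair, that + stands for a space, that pairs with an empty key are dropped, and that a later duplicate key overwrites an earlier one.

```python
import re
from urllib.parse import quote


def parse_pairs(raw_query: str) -> dict:
    pairs = {}
    # Assumption: '&' and whitespace both terminate a key/value pair.
    for piece in re.split(r"[&\s]+", raw_query):
        key, _, value = piece.partition("=")
        if not key:
            continue  # empty key, e.g. "=@#$%^": dropped
        # Assumption: '+' is a form-encoded space. No percent-decoding is
        # done here because the test input contains no valid %XY escapes.
        # Assigning into the dict means later duplicates overwrite earlier ones.
        pairs[key.replace("+", " ")] = value.replace("+", " ")
    return pairs


def canonicalize(pairs: dict) -> str:
    # Encode everything except RFC 3986 unreserved characters, then sort
    # by encoded name in code-point order and join with '&'.
    encoded = sorted(
        (quote(k, safe="-_.~"), quote(v, safe="-_.~")) for k, v in pairs.items()
    )
    return "&".join(f"{k}={v}" for k, v in encoded)


raw = '@#$%^&+=/,?><`";:\\|][{} =@#$%^&+=/,?><`";:\\|][{}'
print(canonicalize(parse_pairs(raw)))
```

Under those assumptions the script prints %20=%2F%2C%3F%3E%3C%60%22%3B%3A%5C%7C%5D%5B%7B%7D&%40%23%24%25%5E=, which matches the third line of the expected canonical request. That shows the interpretation is consistent with the test case, not that it is the only one that fits.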

Upvotes: 2
