Nutrion

Reputation: 63

How can I ignore a parameter in a URL?

My app decodes URLs and I've noticed that when it hits an "&", the process stops and ignores the rest of the URL. It appears that URLComponents is breaking up the URL into pieces - my required parameter and the one that it believes to be a parameter.

 hxxp://www.g.com/url?u=http://a.com/test&d=another
  - scheme : "hxxp"
  - host : "www.g.com"
  - path : "/url"
  ▿ queryItems : 2 elements
    ▿ 0 : u=http://a.com/test
      - name : "u"
      ▿ value : Optional<String>
        - some : "http://a.com/test"
    ▿ 1 : d=another
      - name : "d"
      ▿ value : Optional<String>
        - some : "another"

What I need from above is the full URL after "u=".

Here is the initial call to the function that decodes the string:

if (fullDecode.contains("?u=")) {
    initialDecode = getQueryStringParameter(urlToAttempt: fullDecode, param: "u")!

Here is the function that returns the decoded string:

func getQueryStringParameter(urlToAttempt: String, param: String) -> String? {
    guard let urlDecoded = URLComponents(string: urlToAttempt) else { return nil }
    print(urlDecoded)
    return urlDecoded.queryItems?.first(where: { $0.name == param })?.value
}

For this code, if the url was "http://www.g.com/url?u=http://a.com/test&d=another", it would return "http://a.com/test" - everything else is stripped away. I'd like it to return "http://a.com/test&d=another"

Is there a way that I'm able to do this with URLComponents, or do I need to do custom code to support this?

Update: If I change the URL that's being passed to "hxxp://www.g.com/url?u=http://a.com/test%26d=another", URLComponents returns the full URL with the &d=another intact. I'm now going to try percent-encoding the special character before sending it to the function and see if that fixes the issue.

Second Update

Here is a heavily modified type of link I need to decode:

hxxps://ue.pt.com/v2/url?u=https-3A__oe.le.com_-3Fauthkey-3D-2521IB-2DCRV-2DqQ88-26cid-3DBACDF2ED353D-26id-3DBACDF2EFB61D353D-2521549-26parId-3Droot-26o-3DOneUp&d=DwMGaQ&c=WNpQK9bFmT89misLWAzsd66s44iGV-VujF_o4whjrfc&r=Ej_UhLznQMBqt3H3IYBQkjyx4xqdnS9mLiYA&m=HOBrLfxamFr4PYdACIR-A49th_oIe3MW69N7X-E&s=bXWSJ8gaSbKSlNuIf30S7Qsa6RcMKA-EOvP577XUyq0&e=

These links are generated, so I have no control over how they are built. I need to be able to decode such a link into a legible URL. I have dictionaries for the special obfuscation, e.g. "-3A__" is "://". Where the above fails is where you see &d=DwM... It's not encoded, and that's where URLComponents is failing:

https://oe.le.com/?authkey=!IB-CRV-qQ88&cid=BACDF2ED353D&id=BACDF2EFB6353D!549&parId=root&o=OneUp

Does this help?

Upvotes: 1

Views: 5124

Answers (4)

Rob

Reputation: 437862

In your revised question, you give us the following URL:

let string = "hxxps://ue.pt.com/v2/url?u=https-3A__oe.le.com_-3Fauthkey-3D-2521IB-2DCRV-2DqQ88-26cid-3DBACDF2ED353D-26id-3DBACDF2EFB61D353D-2521549-26parId-3Droot-26o-3DOneUp&d=DwMGaQ&c=WNpQK9bFmT89misLWAzsd66s44iGV-VujF_o4whjrfc&r=Ej_UhLznQMBqt3H3IYBQkjyx4xqdnS9mLiYA&m=HOBrLfxamFr4PYdACIR-A49th_oIe3MW69N7X-E&s=bXWSJ8gaSbKSlNuIf30S7Qsa6RcMKA-EOvP577XUyq0&e="

Clearly -3A__ is ://, _-3F is /?, -26 is &, -3D is =, etc. So if you do:

let replacements = [
    "-3A__": "%3A//",
    "_-3F": "/%3F",
    "-26": "%26",
    "-3D": "%3D",
    "-2D": "%2D",
    "-25": "%25"
]

let result = replacements.reduce(string) { (string, tuple) -> String in
    return string.replacingOccurrences(of: tuple.key, with: tuple.value)
}

let components = URLComponents(string: result)!
for item in components.queryItems! {
    print(item.name, item.value!)
}

You end up with a call to hxxps://ue.pt.com/v2/url with the following parameters:

u https://oe.le.com/?authkey=%21IB-CRV-qQ88&cid=BACDF2ED353D&id=BACDF2EFB61D353D%21549&parId=root&o=OneUp
d DwMGaQ
c WNpQK9bFmT89misLWAzsd66s44iGV-VujF_o4whjrfc
r Ej_UhLznQMBqt3H3IYBQkjyx4xqdnS9mLiYA
m HOBrLfxamFr4PYdACIR-A49th_oIe3MW69N7X-E
s bXWSJ8gaSbKSlNuIf30S7Qsa6RcMKA-EOvP577XUyq0
e 

The key here is that the parameters authkey, cid, id, parId, and o are for the oe.le.com URL, but all the other parameters, d, c, r, m, s, and e are not part of the URL encoded as part of u, but rather are separate parameters for the ue.pt.com URL. You do not want to include them as part of the URL for oe.le.com.
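Note that the u value printed above still contains %21, because the obfuscated input had -2521 (a doubly encoded "!"). As a minimal sketch, assuming you want the fully legible form, one more pass of removingPercentEncoding handles that. The shortened string and the names inner and legible are only for illustration:

```swift
import Foundation

// Shortened copy of the "u" value printed above (illustrative only).
let inner = "https://oe.le.com/?authkey=%21IB-CRV-qQ88&cid=BACDF2ED353D"

// Decode the remaining percent escapes; fall back to the input if decoding fails.
let legible = inner.removingPercentEncoding ?? inner
print(legible)
```

This matches the authkey=!IB-CRV-qQ88 form shown in the question.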

My original answer is below.


Let’s try approaching this problem from the other direction.

Consider these two scenarios where I’m building a URL. In the first, I have a single parameter to g.com which is u, which is a URL that has parameters:

var components = URLComponents(string: "http://g.com")!
components.queryItems = [
    URLQueryItem(name: "u", value: "http://a.com/test?d=foo&e=bar")
]
let url = components.url

You’ll see that url is

http://g.com/?u=http://a.com/test?d%3Dfoo%26e%3Dbar

Note that the = and the & are percent escaped as %3D and %26, respectively because they are parameters to the URL buried inside the value associated with the u parameter, not actually parameters of the g.com URL itself.

The alternative scenario is that g.com’s URL has three parameters, u, d, and e:

var components2 = URLComponents(string: "http://g.com")!
components2.queryItems = [
    URLQueryItem(name: "u", value: "http://a.com/test"),
    URLQueryItem(name: "d", value: "foo"),
    URLQueryItem(name: "e", value: "bar")
]
let url2 = components2.url

That yields:

http://g.com/?u=http://a.com/test&d=foo&e=bar

Note that the = and the & are not percent escaped because they are parameters of the g.com URL, not parameters of the a.com URL contained within the u value.

You appear to be giving us a URL like the one generated by the second scenario, but insisting that it really is like the first scenario. If that’s true, the original URL has not been percent encoded correctly and is invalid. More likely, the second scenario applies and the parameters belong to the g.com URL and are not intended to be part of the u value.
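To make the first scenario concrete, here is a rough round-trip sketch: build the outer URL with URLQueryItem, parse the resulting string again, and read u back. The variable names (innerURL, builder, recovered) are illustrative and not part of the question's code:

```swift
import Foundation

// Inner URL that carries its own query parameters.
let innerURL = "http://a.com/test?d=foo&e=bar"

// Build the outer URL; URLComponents escapes the "&" inside the value.
var builder = URLComponents(string: "http://g.com")!
builder.queryItems = [URLQueryItem(name: "u", value: innerURL)]
let encoded = builder.string!

// Parse it back: there is exactly one query item, and its value is the inner URL.
let parsed = URLComponents(string: encoded)!
let recovered = parsed.queryItems?.first(where: { $0.name == "u" })?.value
print(recovered ?? "nil")
```

When the inner URL is escaped properly on the way in, URLComponents hands it back whole on the way out, with no custom parsing needed.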


For what it’s worth, note that you’ve given us a URL of:

hxxp://www.g.com/url?u=http://a.com/test&d=another

If d was really a parameter of the a.com URL, that URL would be http://a.com/test?d=another, not http://a.com/test&d=another (Note the ?, not &.)

So this is further evidence that the d parameter is really a parameter of the g.com url, and that the u parameter really is just http://a.com/test.

Upvotes: 1

Nutrion

Reputation: 63

I wanted to post what I did to correct this, but all of the answers that were submitted helped lead me down this road. The code that sends to the function looks like this:

if (fullDecode.contains("?u=")) {
    fullDecode = fullDecode.replacingOccurrences(of: "&", with: "%26")
    initialDecode = getQueryStringParameter(urlToAttempt: fullDecode, param: "u")!

If the UITextView text fullDecode contains "?u=", then I know there is text in the view. The text should be a fully obfuscated URL, and if it contains an "&", I manually convert it to %26. This is sent to the URLComponents function BEFORE I do my other conversions using my custom dictionary.

func getQueryStringParameter(urlToAttempt: String, param: String) -> String? {
    guard let urlDecoded = URLComponents(string: urlToAttempt) else { return nil }
    return urlDecoded.queryItems?.first(where: { $0.name == param })?.value
}

I was really overthinking this one.
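Applied to the sample URL from the question, the approach above behaves roughly like this. This is a minimal sketch; the names raw, escaped, and value are illustrative:

```swift
import Foundation

let raw = "http://www.g.com/url?u=http://a.com/test&d=another"

// Escape every "&" so URLComponents sees a single query item named "u".
let escaped = raw.replacingOccurrences(of: "&", with: "%26")

let value = URLComponents(string: escaped)?
    .queryItems?
    .first(where: { $0.name == "u" })?
    .value
print(value ?? "nil")
```

Because every "&" is escaped, this only works when u is the first query parameter and everything after it should be treated as part of its value.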

Upvotes: 0

user28434'mstep

Reputation: 6600

URLComponents is parsing your url properly in this case.

The & in the URL query can only ever belong to the outer URL, because the inner URL doesn't even start a query section (it has no ? in it).

So hxxp://www.g.com/url?u=http://a.com/test&d=another is parsed as:

  - scheme : "hxxp"
  - host : "www.g.com"
  - path : "/url"
  ▿ queryItems : 2 elements
    ▿ 0 : u=http://a.com/test
      - name : "u"
      ▿ value : Optional<String>
        - some : "http://a.com/test"
    ▿ 1 : d=another
      - name : "d"
      ▿ value : Optional<String>
        - some : "another"

But hxxp://www.g.com/url?u=http://a.com/test?d=another (&d replaced with ?d) is parsed as:

- scheme: "hxxp"
- host: "www.g.com"
- path: "/url"
▿ queryItems: 1 element
  ▿ u=http://a.com/test?d=another
    - name: "u"
    ▿ value: Optional("http://a.com/test?d=another")
      - some: "http://a.com/test?d=another"

And here you get the whole inner URL in the u query parameter.
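The two dumps above can be reproduced with a small sketch like the following. The helper uValue(of:) is a hypothetical name, and the scheme is written as http rather than hxxp:

```swift
import Foundation

// Hypothetical helper: returns the value of the "u" query item, if any.
func uValue(of urlString: String) -> String? {
    return URLComponents(string: urlString)?
        .queryItems?
        .first(where: { $0.name == "u" })?
        .value
}

// "&d" starts a second parameter of the outer URL.
let withAmpersand = uValue(of: "http://www.g.com/url?u=http://a.com/test&d=another")

// "?d" stays inside the value of "u".
let withQuestionMark = uValue(of: "http://www.g.com/url?u=http://a.com/test?d=another")

print(withAmpersand ?? "nil")
print(withQuestionMark ?? "nil")
```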

Upvotes: 1

cesarmarch

Reputation: 637

You can split your URL string using "?u=" as the separator and take the second array element.

Otherwise, you can loop on all your queryItems and concatenate them.
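Both suggestions might look something like the sketch below. The function names everythingAfterU(in:) and rejoinedQuery(from:startingAt:) are made up for illustration, and neither function validates that the result is a well-formed URL:

```swift
import Foundation

// Option 1: split on "?u=" and take the second element.
func everythingAfterU(in urlString: String) -> String? {
    let parts = urlString.components(separatedBy: "?u=")
    guard parts.count > 1 else { return nil }
    return parts[1]
}

// Option 2: find the named query item, then append every later item as "&name=value".
// Items without a value are written with an empty value.
func rejoinedQuery(from urlString: String, startingAt param: String) -> String? {
    guard let items = URLComponents(string: urlString)?.queryItems,
          let start = items.firstIndex(where: { $0.name == param }),
          let first = items[start].value else { return nil }
    let rest = items[(start + 1)...].map { "&\($0.name)=\($0.value ?? "")" }
    return first + rest.joined()
}

let sample = "http://www.g.com/url?u=http://a.com/test&d=another"
let viaSplit = everythingAfterU(in: sample)
let viaItems = rejoinedQuery(from: sample, startingAt: "u")
print(viaSplit ?? "nil")
print(viaItems ?? "nil")
```

Option 1 works on the raw string, so nothing is percent-decoded; option 2 goes through URLComponents, so each value is decoded before being joined back together.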

Upvotes: 0
