Reputation: 63
My app decodes URLs and I've noticed that when it hits an "&", the process stops and ignores the rest of the URL. It appears that URLComponents is breaking up the URL into pieces - my required parameter and the one that it believes to be a parameter.
hxxp://www.g.com/url?u=http://a.com/test&d=another
- scheme : "hxxp"
- host : "www.g.com"
- path : "/url"
▿ queryItems : 2 elements
▿ 0 : u=http://a.com/test
- name : "u"
▿ value : Optional<String>
- some : "http://a.com/test"
▿ 1 : d=another
- name : "d"
▿ value : Optional<String>
- some : "another"
What I need from above is the full URL after "u=".
Here is the initial call to the function that decodes the string:
if (fullDecode.contains("?u=")) {
    initialDecode = getQueryStringParameter(urlToAttempt: fullDecode, param: "u")!
}
Here is the function that returns the decoded string:
func getQueryStringParameter(urlToAttempt: String, param: String) -> String? {
    guard let urlDecoded = URLComponents(string: urlToAttempt) else { return nil }
    print(urlDecoded)
    return urlDecoded.queryItems?.first(where: { $0.name == param })?.value
}
For this code, if the url was "http://www.g.com/url?u=http://a.com/test&d=another", it would return "http://a.com/test" - everything else is stripped away. I'd like it to return "http://a.com/test&d=another"
Is there a way that I'm able to do this with URLComponents, or do I need to do custom code to support this?
Update: If I change the URL that's being passed to "hxxp://www.g.com/url?u=http://a.com/test%26d=another", URLComponents returns the full URL with the &d=another intact. I'm now going to try percent encoding the special character before sending it to the function and see if that fixes the issue.
Second Update
Here is a heavily modified type of link I need to decode:
hxxps://ue.pt.com/v2/url?u=https-3A__oe.le.com_-3Fauthkey-3D-2521IB-2DCRV-2DqQ88-26cid-3DBACDF2ED353D-26id-3DBACDF2EFB61D353D-2521549-26parId-3Droot-26o-3DOneUp&d=DwMGaQ&c=WNpQK9bFmT89misLWAzsd66s44iGV-VujF_o4whjrfc&r=Ej_UhLznQMBqt3H3IYBQkjyx4xqdnS9mLiYA&m=HOBrLfxamFr4PYdACIR-A49th_oIe3MW69N7X-E&s=bXWSJ8gaSbKSlNuIf30S7Qsa6RcMKA-EOvP577XUyq0&e=
These links are generated, which is why I have no control over how the link is built. I need to be able to decode this link to a legible URL. I have dictionaries for the special obfuscation, e.g. "-3A__" is "://". Where the above fails is where you see &d=DwM... It's not encoded, and that's where URLComponents is failing. This is the result I'm after:
https://oe.le.com/?authkey=!IB-CRV-qQ88&cid=BACDF2ED353D&id=BACDF2EFB6353D!549&parId=root&o=OneUp
Does this help?
Upvotes: 1
Views: 5124
Reputation: 437862
In your revised question, you give us the following URL:
let string = "hxxps://ue.pt.com/v2/url?u=https-3A__oe.le.com_-3Fauthkey-3D-2521IB-2DCRV-2DqQ88-26cid-3DBACDF2ED353D-26id-3DBACDF2EFB61D353D-2521549-26parId-3Droot-26o-3DOneUp&d=DwMGaQ&c=WNpQK9bFmT89misLWAzsd66s44iGV-VujF_o4whjrfc&r=Ej_UhLznQMBqt3H3IYBQkjyx4xqdnS9mLiYA&m=HOBrLfxamFr4PYdACIR-A49th_oIe3MW69N7X-E&s=bXWSJ8gaSbKSlNuIf30S7Qsa6RcMKA-EOvP577XUyq0&e="
Clearly -3A__ is ://, _-3F is /?, -26 is &, -3D is =, etc. So if you do:
let replacements = [
    "-3A__": "%3A//",
    "_-3F": "/%3F",
    "-26": "%26",
    "-3D": "%3D",
    "-2D": "%2D",
    "-25": "%25"
]
let result = replacements.reduce(string) { (string, tuple) -> String in
    return string.replacingOccurrences(of: tuple.key, with: tuple.value)
}
let components = URLComponents(string: result)!
for item in components.queryItems! {
    print(item.name, item.value!)
}
You end up with a call to hxxps://ue.pt.com/v2/url with the following parameters:
u https://oe.le.com/?authkey=%21IB-CRV-qQ88&cid=BACDF2ED353D&id=BACDF2EFB61D353D%21549&parId=root&o=OneUp
d DwMGaQ
c WNpQK9bFmT89misLWAzsd66s44iGV-VujF_o4whjrfc
r Ej_UhLznQMBqt3H3IYBQkjyx4xqdnS9mLiYA
m HOBrLfxamFr4PYdACIR-A49th_oIe3MW69N7X-E
s bXWSJ8gaSbKSlNuIf30S7Qsa6RcMKA-EOvP577XUyq0
e
The key here is that the parameters authkey, cid, id, parId, and o are for the oe.le.com URL, but all the other parameters, d, c, r, m, s, and e, are not part of the URL encoded as part of u, but rather are separate parameters for the ue.pt.com URL. You do not want to include them as part of the URL for oe.le.com.
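The steps above can be wrapped into one helper. This is only a sketch: the name decodeInnerURL(from:) is hypothetical, the token list is limited to the six tokens shown above, and an ordered array stands in for the dictionary so the replacements run in a fixed order. It also assumes a second percent-decoding pass is acceptable, since "-2521" becomes "%2521", which URLComponents decodes only once, to "%21".

```swift
import Foundation

// Obfuscation tokens from the answer above, applied in a fixed order.
let tokenReplacements: [(token: String, escaped: String)] = [
    ("-3A__", "%3A//"),
    ("_-3F", "/%3F"),
    ("-26", "%26"),
    ("-3D", "%3D"),
    ("-2D", "%2D"),
    ("-25", "%25")
]

/// Hypothetical helper: returns the URL carried in the outer "u" parameter.
func decodeInnerURL(from obfuscated: String) -> String? {
    // Turn the custom tokens into ordinary percent escapes.
    let escaped = tokenReplacements.reduce(obfuscated) { partial, pair in
        partial.replacingOccurrences(of: pair.token, with: pair.escaped)
    }
    // Let URLComponents split the outer query and decode the "u" value once.
    guard let components = URLComponents(string: escaped),
          let value = components.queryItems?.first(where: { $0.name == "u" })?.value
    else { return nil }
    // Second pass (an assumption): resolves leftovers such as "%21".
    return value.removingPercentEncoding ?? value
}
```

The outer parameters (d, c, r, and so on) are deliberately left out of the result. If the inner URL can legitimately contain percent escapes of its own, the second decoding pass may be too aggressive and should be dropped.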
My original answer is below.
Let’s try approaching this problem from the other direction.
Consider these two scenarios where I’m building a URL. In the first, I have a single parameter to g.com, which is u, whose value is a URL that itself has parameters:
var components = URLComponents(string: "http://g.com")!
components.queryItems = [
    URLQueryItem(name: "u", value: "http://a.com/test?d=foo&e=bar")
]
let url = components.url
You’ll see that url is
http://g.com/?u=http://a.com/test?d%3Dfoo%26e%3Dbar
Note that the = and the & are percent escaped as %3D and %26, respectively, because they are parameters to the URL buried inside the value associated with the u parameter, not actually parameters of the g.com URL itself.
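To see that this escaping is what keeps the inner URL in one piece, here is a small round-trip sketch. The variable names are illustrative; the URLs are the ones from the example above.

```swift
import Foundation

// Build the outer URL with the whole inner URL as the value of "u".
var outer = URLComponents(string: "http://g.com")!
outer.queryItems = [
    URLQueryItem(name: "u", value: "http://a.com/test?d=foo&e=bar")
]

// Parse the generated string back, as a receiving app would.
let parsed = URLComponents(string: outer.string!)!
let inner = parsed.queryItems?.first(where: { $0.name == "u" })?.value
```

Because the = and & were escaped when the URL was built, the parsed result has a single query item, and inner holds the complete inner URL including its own parameters.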
The alternative scenario is that g.com’s URL has three parameters, u, d, and e:
var components2 = URLComponents(string: "http://g.com")!
components2.queryItems = [
    URLQueryItem(name: "u", value: "http://a.com/test"),
    URLQueryItem(name: "d", value: "foo"),
    URLQueryItem(name: "e", value: "bar")
]
let url2 = components2.url
That yields:
http://g.com/?u=http://a.com/test&d=foo&e=bar
Note that the = and the & are not percent escaped because they are parameters of the g.com URL, not parameters of the a.com URL contained within the u value.
You appear to be giving us a URL like the one generated by the second scenario, but insisting that it really is like the first scenario. If that’s true, the original URL has not been percent encoded correctly and is invalid. More likely, the second scenario applies and the parameters are of the g.com URL, not intended to be part of the u value.
For what it’s worth, note that you’ve given us a URL of:
hxxp://www.g.com/url?u=http://a.com/test&d=another
If d was really a parameter of the a.com URL, that URL would be http://a.com/test?d=another, not http://a.com/test&d=another. (Note the ?, not &.)
So this is further evidence that the d parameter is really a parameter of the g.com URL, and that the u parameter really is just http://a.com/test.
Upvotes: 1
Reputation: 63
I wanted to post what I did to correct this, but all of the answers that were submitted helped lead me down this road. The code that sends to the function looks like this:
if (fullDecode.contains("?u=")) {
    fullDecode = fullDecode.replacingOccurrences(of: "&", with: "%26")
    initialDecode = getQueryStringParameter(urlToAttempt: fullDecode, param: "u")!
}
If fullDecode (the text from the UITextView) contains "?u=", then I know there is text in the view. The text should be a fully obfuscated URL, and if it contains an "&", I manually convert it to %26. This is sent to the URLComponents function BEFORE I do my other conversions using my custom dictionary.
func getQueryStringParameter(urlToAttempt: String, param: String) -> String? {
    guard let urlDecoded = URLComponents(string: urlToAttempt) else { return nil }
    return urlDecoded.queryItems?.first(where: { $0.name == param })?.value
}
I was really overthinking this one.
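For reference, a minimal sketch of what that replacement does to the parse, using the simplified URL from the question. The constant names are illustrative only.

```swift
import Foundation

func getQueryStringParameter(urlToAttempt: String, param: String) -> String? {
    guard let urlDecoded = URLComponents(string: urlToAttempt) else { return nil }
    return urlDecoded.queryItems?.first(where: { $0.name == param })?.value
}

let original = "hxxp://www.g.com/url?u=http://a.com/test&d=another"
// Escaping every "&" folds all later parameters into the value of "u".
let escaped = original.replacingOccurrences(of: "&", with: "%26")

let before = getQueryStringParameter(urlToAttempt: original, param: "u")
let after = getQueryStringParameter(urlToAttempt: escaped, param: "u")
let dAfter = getQueryStringParameter(urlToAttempt: escaped, param: "d")
```

Here before is just the inner URL, while after also carries "&d=another", and d no longer exists as a separate parameter. This only behaves as intended when u is the first parameter of the outer URL, and it treats the outer parameters as part of the inner URL, which the other answers argue is not how the link was built.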
Upvotes: 0
Reputation: 6600
URLComponents is parsing your URL properly in this case. The & in the URL query can only ever belong to the outer URL, because the inner URL doesn't even start a query section (it doesn't have a ? there).
So hxxp://www.g.com/url?u=http://a.com/test&d=another is parsed as:
- scheme : "hxxp"
- host : "www.g.com"
- path : "/url"
▿ queryItems : 2 elements
▿ 0 : u=http://a.com/test
- name : "u"
▿ value : Optional<String>
- some : "http://a.com/test"
▿ 1 : d=another
- name : "d"
▿ value : Optional<String>
- some : "another"
But hxxp://www.g.com/url?u=http://a.com/test?d=another (&d replaced with ?d) is parsed like:
- scheme: "hxxp"
- host: "www.g.com"
- path: "/url"
▿ queryItems: 1 element
▿ u=http://a.com/test?d=another
- name: "u"
▿ value: Optional("http://a.com/test?d=another")
- some: "http://a.com/test?d=another"
And here you get the whole inner URL in the u query param.
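The two parses can be compared directly in code. This is a small sketch; the helper name queryItems(of:) is made up for illustration.

```swift
import Foundation

/// Hypothetical helper: parses a URL string and returns its query items.
func queryItems(of urlString: String) -> [URLQueryItem] {
    return URLComponents(string: urlString)?.queryItems ?? []
}

// "&d" belongs to the outer URL, so there are two outer parameters.
let withAmpersand = queryItems(of: "hxxp://www.g.com/url?u=http://a.com/test&d=another")

// "?d" stays inside the value of "u", so there is one outer parameter.
let withQuestionMark = queryItems(of: "hxxp://www.g.com/url?u=http://a.com/test?d=another")
```

In the first case the value of u is http://a.com/test and d is a separate item; in the second, u holds http://a.com/test?d=another.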
Upvotes: 1
Reputation: 637
You can split your URL string with "?u=" as a separator and take the second array element.
Otherwise, you can loop over all your queryItems and concatenate them.
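A minimal sketch of the splitting approach, assuming String's components(separatedBy:) is the kind of function meant here; the helper name rawValue(afterMarker:in:) is hypothetical.

```swift
import Foundation

/// Hypothetical helper: returns everything after the first occurrence of `marker`.
func rawValue(afterMarker marker: String, in urlString: String) -> String? {
    let parts = urlString.components(separatedBy: marker)
    // No marker means there is nothing to extract.
    guard parts.count > 1 else { return nil }
    // Rejoin the tail in case the marker appears again inside the value.
    return parts.dropFirst().joined(separator: marker)
}

let tail = rawValue(afterMarker: "?u=", in: "hxxp://www.g.com/url?u=http://a.com/test&d=another")
```

This returns the raw text after "?u=", so tail still includes "&d=another" and no percent decoding is applied. Whether those trailing parameters really belong to the inner URL is a separate question.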
Upvotes: 0