Reputation: 729
Corporate environment: between me and the internets, there's a load-balanced proxy. The MSIE connection settings point to a proxy auto-config (PAC)
file which says, roughly:
function FindProxyForURL(url, host)
{
    if ((shExpMatch(host,"intranet1.corp")) || (shExpMatch(host,"intranet2.corp")))
        return "DIRECT";
    else
        return "PROXY proxy1.corp:3128; PROXY proxy2.corp:3128";
}
My question: programmatically, how can I determine which proxy I'm using? I'm on Windows.
Upvotes: 0
Views: 2249
Reputation:
Your sample PAC file is configured for failover, not for load balancing.
In other words, your IE will always try to connect through "proxy1.corp" first and then, if it fails, it will try to connect through "proxy2.corp".
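To make the ordering concrete, here is a minimal sketch that splits a PAC result string into the list of candidates in the order a browser would try them. The helper name parsePacResult is made up for illustration; it is not part of any PAC or browser API.

```javascript
// Hypothetical helper (not a PAC or browser API): split a FindProxyForURL()
// result into its candidates, in the order they would be tried.
function parsePacResult(result) {
  var entries = result.split(";");
  var candidates = [];
  for (var i = 0; i < entries.length; i++) {
    // Trim with a regex so this also runs in old JScript engines.
    var entry = entries[i].replace(/^\s+|\s+$/g, "");
    if (entry === "") continue;
    var parts = entry.split(/\s+/);
    candidates.push({ type: parts[0].toUpperCase(), target: parts[1] || null });
  }
  return candidates;
}

var order = parsePacResult("PROXY proxy1.corp:3128; PROXY proxy2.corp:3128");
// order[0] is the primary proxy; order[1] is only used as a fallback.
```

With the string from your PAC file, proxy1.corp:3128 always comes first, which is why this is failover rather than load balancing.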
Be aware that IE uses the "Automatic Proxy Result Cache", as you can read here:
In theory, the FindProxyForURL() function is invoked every time that an object is about to be fetched by the web browser. In practice, however, Microsoft's Internet Explorer has what Microsoft terms an "Automatic Proxy Result Cache". Whenever a proxy HTTP server (located using the results of a call to the FindProxyForURL() function or otherwise) is successfully contacted to fetch an object, the APR cache is updated to contain that pair. If, when about to call the FindProxyForURL() function, Internet Explorer finds the host already listed in the APR cache, it uses the proxy HTTP server listed in the APR cache entry instead of calling the FindProxyForURL() function again for the same host. (The intent of the APR cache is to attempt to reduce the number of times that the JavaScript function has to be run, and thus reduce the overhead of fetching objects.) Because Internet Explorer's APR cache is indexed by hostname, this means that it is impossible for a PAC script to reliably yield multiple different results according to any part of a URL in addition to the hostname. It is impossible, for example, to provide different proxy configurations according to the path portions of URLs on a single host. Because Internet Explorer's APR cache caches the proxy HTTP server rather than the full results of the FindProxyForURL() function, this means that fallback from one proxy HTTP server to another does not occur in the event of a problem, even if the FindProxyForURL() function returned a list of several proxy HTTP servers.
Microsoft's KnowledgeBase article #271361 summarizes these problems and describes how to turn Internet Explorer's APR cache off. Microsoft's Internet Explorer also caches information about "bad" proxy HTTP servers for 30 minutes. This has no direct bearing upon PAC scripts, except that it often causes confusion when people are setting up a proxy HTTP server and creating a PAC script at the same time, and a problem with the proxy HTTP server, causing it to be cached as "bad" for 30 minutes, is misdiagnosed as a problem with the PAC script.
If you want to use the PAC file for load balancing, have a look at this page for some examples.
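As a rough illustration of the idea (my own sketch, not taken from that page), a PAC function can spread hosts over the two proxies by computing a simple checksum of the host name and keeping the other proxy as a fallback. The checksum scheme is an arbitrary choice for the example, and I have replaced shExpMatch with plain string comparison so the snippet also runs outside a browser. Since the result depends only on the host name, it stays consistent with IE's per-host cache.

```javascript
function FindProxyForURL(url, host) {
  // Intranet hosts bypass the proxy, as in the original PAC file.
  if (host === "intranet1.corp" || host === "intranet2.corp") {
    return "DIRECT";
  }
  // Sum of character codes of the host name (illustrative, not a standard).
  var sum = 0;
  for (var i = 0; i < host.length; i++) {
    sum += host.charCodeAt(i);
  }
  // Even sum: proxy1 first; odd sum: proxy2 first.
  // The other proxy stays in the list as a fallback.
  if (sum % 2 === 0) {
    return "PROXY proxy1.corp:3128; PROXY proxy2.corp:3128";
  }
  return "PROXY proxy2.corp:3128; PROXY proxy1.corp:3128";
}
```

A given host always maps to the same primary proxy, so the split is per destination host, not per request.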
However, even if you do load-balance between the two proxies "proxy1" and "proxy2", the proxy in use could change on every request; also, if you have more than one IE instance running at the same time, some instances could be using "proxy1" while others are using "proxy2".
So, to answer your question, one solution could be to check the "Via" field in the HTTP response headers.
For example, consider the following HTML page:
<!DOCTYPE html>
<html>
<head>
<title>Javascript Proxy Detection</title>
<script language="javascript">
// Fetch the given URL and dump all response headers into the textarea.
function doRequest(url) {
    if (typeof XMLHttpRequest != 'undefined') {
        try {
            var xmlhttp = new XMLHttpRequest();
            xmlhttp.open("GET", url, true);
            xmlhttp.onreadystatechange = function() {
                // readyState 4 means the response has been fully received.
                if (xmlhttp.readyState == 4) {
                    document.myForm.txt.value = xmlhttp.getAllResponseHeaders();
                }
            };
            xmlhttp.send(null);
        } catch (e) {
            alert(e.message);
        }
    } else {
        alert('no XMLHttpRequest');
    }
}
</script>
</head>
<body>
<form name="myForm">
<textarea cols="50" rows="10" name="txt"></textarea><br />
<input type="button" value="Test Proxy" onclick="doRequest(location.href)">
</form>
</body>
</html>
If you request this page through a direct connection (without a proxy) and then click the "Test Proxy" button, you will get this kind of output:
Content-Encoding: gzip
Content-Length: 470
Server: Apache/2.2.17 (Ubuntu)
Vary: Accept-Encoding
Content-Type: text/html
Accept-Ranges: bytes
By requesting the same page through a proxy (in my case Squid), you will also get the "via" field:
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 470
Content-Type: text/html
X-Cache: MISS from ****
X-Cache-Lookup: HIT from ****:3128
Via: 1.1 ****:3128 (squid/2.7.STABLE9)
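To pull the proxy out of headers like these in code, a small helper can scan the text returned by getAllResponseHeaders(). This is only a sketch: getViaHeader is a made-up name, and the host proxy1.corp in the sample string is an assumption standing in for the masked value above. Also keep in mind that a proxy can be configured to drop or rewrite this header, so a missing value is not proof of a direct connection.

```javascript
// Hypothetical helper: return the value of the Via header from the raw
// header text of XMLHttpRequest.getAllResponseHeaders(), or null if absent.
function getViaHeader(rawHeaders) {
  var lines = rawHeaders.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var colon = lines[i].indexOf(":");
    if (colon === -1) continue; // skip blank or malformed lines
    var name = lines[i].substring(0, colon).replace(/^\s+|\s+$/g, "");
    // Header names are case-insensitive, so compare in lower case.
    if (name.toLowerCase() === "via") {
      return lines[i].substring(colon + 1).replace(/^\s+|\s+$/g, "");
    }
  }
  return null;
}

// Sample input; the proxy host name here is assumed for illustration.
var sample = "Content-Type: text/html\r\n" +
             "Via: 1.1 proxy1.corp:3128 (squid/2.7.STABLE9)\r\n";
var via = getViaHeader(sample);
```

In the page above, you could call getViaHeader(xmlhttp.getAllResponseHeaders()) inside the readyState == 4 branch instead of dumping every header.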
So, by checking for the existence of the "Via" field in the response headers, you should be able to determine whether you are going through a proxy and, from its value, which one.
Upvotes: 3