Java - HttpURLConnection returns cached response every time

I'm trying to gather statistical data from Roblox's currency exchange for analysis, so I need up-to-date data rather than a cached result. However, no matter what I do, the result is still cached. The most intuitive option, setUseCaches(false), had no effect, and manually setting the header Cache-Control: no-cache does not seem to work either. I inspected the caching header using Fiddler2 and saw that its value was Cache-Control: max-age=0, but that didn't change the program's behavior either. Here are the relevant pieces of code:

URL:

    private final static String URL = "http://www.roblox.com/my/money.aspx#/#TradeCurrency_tab";

GET Request:

    URLConnection socket = new URL( URL ).openConnection( );
    socket.setUseCaches( false );
    socket.setDefaultUseCaches( false );
    HttpURLConnection conn = ( HttpURLConnection )socket;
    conn.setUseCaches( false );
    conn.setDefaultUseCaches( false );
    conn.setRequestProperty( "Pragma",  "no-cache" );
    conn.setRequestProperty( "Expires",  "0" );
    conn.setRequestProperty( "Cookie", ".ROBLOSECURITY=" + ROBLOSECURITY );
    conn.setRequestProperty( "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" );
    conn.setRequestProperty( "Accept-Language", "en-US,en;q=0.8" );
    conn.setRequestProperty( "User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36" );
    conn.setDoInput( true );
    conn.setRequestMethod( "GET" );
    conn.connect();

    Scanner data = new Scanner( conn.getInputStream() );
    data.useDelimiter( "\\A" );
    String result = data.next();

    data.close( );
    conn.disconnect();

It may be worth noting that I get a fresh result every time I restart the program, but the result never changes while the program is running.

Update:

Wireshark analysis (I tweaked my code a bit since last time):

GET /my/money.aspx HTTP/1.1
Pragma: no-cache
Expires: 0
Cookie: .ROBLOSECURITY=_|WARNING:-DO-NOT-SHARE-THIS.--Sharing-this-will-allow-someone-to-log-in-as-you-and-to-steal-your-ROBUX-and-items.|*sensitive*
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36
Cache-Control: no-cache
Host: www.roblox.com
Connection: keep-alive

HTTP/1.1 200 OK
Cache-Control: private, s-maxage=0
Content-Type: text/html; charset=utf-8
Set-Cookie: rbx-ip=; domain=roblox.com; path=/; HttpOnly
Set-Cookie: RBXSource=rbx_acquisition_time=1/4/2016 12:45:21 AM&rbx_acquisition_referrer=&rbx_medium=Direct&rbx_source=&rbx_campaign=&rbx_adgroup=&rbx_keyword=&rbx_matchtype=&rbx_send_info=0; domain=roblox.com; expires=Wed, 03-Feb-2016 06:45:21 GMT; path=/
Access-Control-Allow-Credentials: true
Set-Cookie: rbx-ip=; domain=roblox.com; path=/; HttpOnly
Set-Cookie: RBXSource=rbx_acquisition_time=1/4/2016 12:45:21 AM&rbx_acquisition_referrer=&rbx_medium=Direct&rbx_source=&rbx_campaign=&rbx_adgroup=&rbx_keyword=&rbx_matchtype=&rbx_send_info=1; domain=roblox.com; expires=Wed, 03-Feb-2016 06:45:21 GMT; path=/
Set-Cookie: RBXEventTrackerV2=CreateDate=1/4/2016 12:45:21 AM&rbxid=59210735&browserid=3940274345; domain=roblox.com; expires=Fri, 22-May-2043 05:45:21 GMT; path=/
Set-Cookie: GuestData=UserID=-856460986; domain=.roblox.com; expires=Fri, 22-May-2043 05:45:21 GMT; path=/
P3P: CP="CAO DSP COR CURa ADMa DEVa OUR IND PHY ONL UNI COM NAV INT DEM PRE"
Date: Mon, 04 Jan 2016 06:45:20 GMT
Content-Length: 153751

Upvotes: 18

Views: 13244

Answers (6)

Younes Regaieg

Reputation: 4156

I would suggest doing the following to your URL before opening the URLConnection:

    URLConnection socket = new URL( URL.replaceFirst("#", "?cacheFrom=" + System.currentTimeMillis()+"#") ).openConnection( );

Upvotes: 0

assylias

Reputation: 328598

Have you tried the following headers?

    Cache-Control: no-cache
    Pragma: no-cache
    If-Modified-Since: Sat, 1 Jan 2000 00:00:00 GMT
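
In Java these headers can be set with setRequestProperty before the connection is opened. The following is only a sketch: the class name NoCacheHeaders and the helper open are made up for illustration, and nothing here guarantees that the server honors the headers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class NoCacheHeaders {
    // Creates a connection object (no network I/O happens yet) and attaches
    // the three request headers listed above.
    static HttpURLConnection open(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestProperty("Cache-Control", "no-cache");
            conn.setRequestProperty("Pragma", "no-cache");
            conn.setRequestProperty("If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT");
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the request headers that would be sent; does not connect.
        HttpURLConnection conn = open("http://www.roblox.com/my/money.aspx");
        System.out.println(conn.getRequestProperties());
    }
}
```

Request headers have to be set before connect() or getInputStream() is called; afterwards setRequestProperty throws an IllegalStateException.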

Upvotes: 1

Tawan

Reputation: 457

I am missing some context (how the given piece of code is invoked multiple times) to pin down the problem accurately, but it could be due to reusing the socket object instead of instantiating a new one for each request.

Once the connection is open, the useCaches setting won't matter. Have a look at the implementation of sun.net.www.protocol.http.HttpURLConnection#connect, which delegates to plainConnect:

    protected void plainConnect() throws IOException {
        if (connected) {
            return;
        }
        // try to see if request can be served from local cache
        if (cacheHandler != null && getUseCaches()) {
            // ..
        }
        // ..
    }

If the connection was already opened, it returns immediately and the existing InputStream instance is reused.
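
If that is what is happening, the fix would be to create a new connection object for every request. A minimal sketch, assuming the request is issued from a helper that is called once per poll; FreshConnectionFetcher, open and fetch are hypothetical names, and the three-iteration loop in main is only there to show repeated calls:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Scanner;

final class FreshConnectionFetcher {
    // Returns a brand-new, not-yet-connected connection object on every call.
    static HttpURLConnection open(String url) {
        try {
            return (HttpURLConnection) URI.create(url).toURL().openConnection();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Performs one GET on a fresh connection and returns the whole body as text.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = open(url);
        conn.setUseCaches(false); // has to happen before the connection is opened
        try (InputStream in = conn.getInputStream();
             Scanner data = new Scanner(in, "UTF-8")) {
            data.useDelimiter("\\A"); // read the stream as a single token
            return data.hasNext() ? data.next() : "";
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws Exception {
        String url = args.length > 0 ? args[0] : "http://www.roblox.com/my/money.aspx";
        for (int i = 0; i < 3; i++) {
            // Each iteration goes through open(), so no connection object is reused.
            System.out.println(fetch(url).length());
            Thread.sleep(1000);
        }
    }
}
```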

Upvotes: 1

PNS

Reputation: 770

Seeing as you have tried most of the cache settings, it could be that the caching happens in their service rather than in your client. I can see from your Wireshark capture that the request is sent with "Connection: keep-alive". Since you say you get a non-cached result every time you restart your program, perhaps you could try setting that to "Connection: close".

This may not be ideal in a production setting but perhaps it could give you some insight as to what is happening.
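
A rough sketch of what that experiment could look like; the class name CloseConnection and the helper open are invented for this example. As far as I know, "Connection: close" is the one value of the Connection header that HttpURLConnection lets callers set without extra configuration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class CloseConnection {
    // Creates a connection object that asks the server to close the TCP
    // connection after the response instead of keeping it alive.
    static HttpURLConnection open(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestProperty("Connection", "close");
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the request headers that would be sent; does not connect.
        System.out.println(open("http://www.roblox.com/my/money.aspx").getRequestProperties());
    }
}
```

If the results start changing with this header in place, that would point at something tied to the persistent connection rather than at an HTTP cache.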

Upvotes: 2

Jim Garrison

Reputation: 86774

I notice you are not telling the local HttpURLConnection to bypass its own caches.

HttpURLConnection inherits the method setUseCaches(boolean) from URLConnection. From the Javadoc for setUseCaches(boolean):

Sets the value of the useCaches field of this URLConnection to the specified value.

Some protocols do caching of documents. Occasionally, it is important to be able to "tunnel through" and ignore the caches (e.g., the "reload" button in a browser). If the UseCaches flag on a connection is true, the connection is allowed to use whatever caches it can. If false, caches are to be ignored. The default value comes from DefaultUseCaches, which defaults to true.
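
To check whether a local cache is involved at all, one option is to look at java.net.ResponseCache, which is the hook the quoted flag controls. This is a sketch with made-up names (LocalCacheCheck, hasLocalResponseCache, openBypassingCache); my understanding is that a stock desktop JVM installs no ResponseCache by default, but treat that as an assumption about your runtime.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ResponseCache;
import java.net.URI;

final class LocalCacheCheck {
    // True if some code in this JVM has installed a system-wide response cache.
    static boolean hasLocalResponseCache() {
        return ResponseCache.getDefault() != null;
    }

    // Creates a connection object with the useCaches flag turned off.
    static HttpURLConnection openBypassingCache(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setUseCaches(false); // must be called before connecting
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ResponseCache installed: " + hasLocalResponseCache());
        HttpURLConnection conn = openBypassingCache("http://www.roblox.com/my/money.aspx");
        System.out.println("useCaches: " + conn.getUseCaches());
    }
}
```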

Upvotes: 3

Nathan Dean

Reputation: 89

If the caching occurs server-side, append a cachebuster to the URL.

    HttpURLConnection conn = ( HttpURLConnection )new URL( URL + "?_=" + System.currentTimeMillis() ).openConnection( );
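
One caveat with the URL from the question: it ends in a #fragment, and as far as I know java.net.URL treats everything after the first # as the fragment, which is not sent to the server. A cachebuster appended after it would therefore never reach the server. A minimal sketch of a helper that inserts the parameter before the fragment; the names CacheBuster and withCacheBuster are illustrative, and the parameter name _ is taken from the line above:

```java
final class CacheBuster {
    // Inserts "_=<stamp>" into the query string, keeping any #fragment at the end.
    static String withCacheBuster(String url, long stamp) {
        int hash = url.indexOf('#');
        String base = hash >= 0 ? url.substring(0, hash) : url;
        String fragment = hash >= 0 ? url.substring(hash) : "";
        // Use '&' if the URL already has a query string, '?' otherwise.
        String separator = base.contains("?") ? "&" : "?";
        return base + separator + "_=" + stamp + fragment;
    }

    public static void main(String[] args) {
        String url = "http://www.roblox.com/my/money.aspx#/#TradeCurrency_tab";
        System.out.println(withCacheBuster(url, System.currentTimeMillis()));
    }
}
```

The result can then be passed to new URL( ... ).openConnection( ) exactly as in the line above.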

Upvotes: 8
