Java - HttpUrlConnection returns cached response every time

Christopher O'Toole picture Christopher O'Toole · Dec 30, 2015 · Viewed 9k times · Source

I'm trying to gather statistical data from Roblox's currency exchange for analysis. Therefore, I need up-to-date data instead of a cached result. However, it seems that no matter what I do, the result is still cached. It seems that the most intuitive option, setUseCaches(), had no effect, and setting the header manually as Cache-Control: no-cache does not seem to work either. I inspected the Cache header using Fiddler2 and saw that its value was Cache-Control: max-age=0, but it didn't seem to change the program's behavior either. Here are the relevant pieces of code:

URL:

private final static String URL = "http://www.roblox.com/my/money.aspx#/#TradeCurrency_tab";

GET Request:

    URLConnection socket = new URL( URL ).openConnection( );
    socket.setUseCaches( false );
    socket.setDefaultUseCaches( false );
    HttpURLConnection conn = ( HttpURLConnection )socket;
    conn.setUseCaches( false );
    conn.setDefaultUseCaches( false );
    conn.setRequestProperty( "Pragma",  "no-cache" );
    conn.setRequestProperty( "Expires",  "0" );
    conn.setRequestProperty( "Cookie", ".ROBLOSECURITY=" + ROBLOSECURITY );
    conn.setRequestProperty( "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" );
    conn.setRequestProperty( "Accept-Language", "en-US,en;q=0.8" );
    conn.setRequestProperty( "User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36" );
    conn.setDoInput( true );
    conn.setRequestMethod( "GET" );
    conn.connect();

    Scanner data = new Scanner( conn.getInputStream() );
    data.useDelimiter( "\\A" );
    String result = data.next();

    data.close( );
    conn.disconnect();

It may or may not be important to note that it returns a unique result every time I restart the program but not during program runtime.

Update:

Wireshark analysis (I tweaked my code a bit since last time ):

GET /my/money.aspx HTTP/1.1
Pragma: no-cache
Expires: 0
Cookie: .ROBLOSECURITY=_|WARNING:-DO-NOT-SHARE-THIS.--Sharing-this-will-allow-someone-to-log-in-as-you-and-to-steal-your-ROBUX-and-items.|*sensitive*
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36
Cache-Control: no-cache
Host: www.roblox.com
Connection: keep-alive

HTTP/1.1 200 OK
Cache-Control: private, s-maxage=0
Content-Type: text/html; charset=utf-8
Set-Cookie: rbx-ip=; domain=roblox.com; path=/; HttpOnly
Set-Cookie: RBXSource=rbx_acquisition_time=1/4/2016 12:45:21 AM&rbx_acquisition_referrer=&rbx_medium=Direct&rbx_source=&rbx_campaign=&rbx_adgroup=&rbx_keyword=&rbx_matchtype=&rbx_send_info=0; domain=roblox.com; expires=Wed, 03-Feb-2016 06:45:21 GMT; path=/
Access-Control-Allow-Credentials: true
Set-Cookie: rbx-ip=; domain=roblox.com; path=/; HttpOnly
Set-Cookie: RBXSource=rbx_acquisition_time=1/4/2016 12:45:21 AM&rbx_acquisition_referrer=&rbx_medium=Direct&rbx_source=&rbx_campaign=&rbx_adgroup=&rbx_keyword=&rbx_matchtype=&rbx_send_info=1; domain=roblox.com; expires=Wed, 03-Feb-2016 06:45:21 GMT; path=/
Set-Cookie: RBXEventTrackerV2=CreateDate=1/4/2016 12:45:21 AM&rbxid=59210735&browserid=3940274345; domain=roblox.com; expires=Fri, 22-May-2043 05:45:21 GMT; path=/
Set-Cookie: GuestData=UserID=-856460986; domain=.roblox.com; expires=Fri, 22-May-2043 05:45:21 GMT; path=/
P3P: CP="CAO DSP COR CURa ADMa DEVa OUR IND PHY ONL UNI COM NAV INT DEM PRE"
Date: Mon, 04 Jan 2016 06:45:20 GMT
Content-Length: 153751

Answer

Nathan Dean picture Nathan Dean · Dec 31, 2015

If the caching occurs server-side, append a cachebuster to the URL.

HttpURLConnection conn = ( HttpURLConnection )new URL( URL + "?_=" + System.currentTimeMillis() ).openConnection( );