Jean-Paul

Reputation: 21150

Download a cookie to make new GET request

I am trying to make an HTTP GET request to a website from Java.

The problem is that this website will only process my request if I attach Cookie information to the header of the request.

If I disable cookies in my browser, the site only shows its general screen, which means the website treats this as my first time 'visiting' the site.

The problem is that if I now use the search bar at the top right, the site does not process the request; it just shows the same (general) screen.

E.g.: if I have cookies disabled and I search for "AAPL", it will not show any results.

With cookies enabled, the request is handled just fine and the "AAPL" results are shown.

You can try this yourself as well:

With cookies enabled, visit http://www.pennystocktweets.com/user_posts/feeds?cat=search&lptyp=prep&usrstk=AAPL

With cookies disabled, visit the link again: http://www.pennystocktweets.com/user_posts/feeds?cat=search&lptyp=prep&usrstk=AAPL

Now compare the responses: only the first one is correct.

This means that the website only works after the client has downloaded a cookie, and then has made another (new) GET request to the server with this Cookie information attached.

(Does this imply that the website needs a session-cookie to function correctly?)
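To make that mechanism concrete, here is a minimal offline sketch of the round trip: a Set-Cookie value from a first response is stored, and the store is then asked which Cookie values it would attach to the next request. It uses the JDK's own java.net.CookieManager rather than Apache HttpClient, and the class name, the cookie sid=abc123 and the example.com URL are made up for illustration; I don't know what the real site actually sends.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.List;
import java.util.Map;

public class CookieRoundTrip {

    // Stores the cookies from one response's Set-Cookie values, then asks the
    // store which Cookie header values it would attach to the next request.
    static List<String> cookiesForNextRequest(String url, List<String> setCookieValues) {
        // ACCEPT_ALL keeps the sketch predictable; the default policy is stricter.
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        URI uri = URI.create(url);
        try {
            manager.put(uri, Map.of("Set-Cookie", setCookieValues));
            return manager.get(uri, Map.of()).getOrDefault("Cookie", List.of());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical URL and cookie, not taken from the real site.
        String url = "http://www.example.com/user_posts/feeds?cat=search";
        System.out.println(cookiesForNextRequest(url, List.of("sid=abc123; Path=/")));
    }
}
```

If no Set-Cookie value is stored first, the returned list is empty, which corresponds to the "cookies disabled" case described above.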

Now what I'm trying to do is imitate the request with Apache HttpClient like so:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Date;
import java.util.List;
import java.util.StringTokenizer;

import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.HttpClient;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;

public class downloadTweets {

  private String cookies;
  private HttpClient client = new DefaultHttpClient();
  private final String USER_AGENT = "Mozilla/5.0";

  public static void main(String[] args) throws Exception {

    String  ticker  = "AAPL";  
    String  lptyp   = "prep";  
    int     opid    = 0;
    int     lpid    = 0;

    downloadTweets test = new downloadTweets();

    String url = test.constructURL(ticker, lptyp, opid, lpid);

    // make sure cookies are turned on
    CookieHandler.setDefault(new CookieManager());

    downloadTweets http = new downloadTweets();

    String page = http.GetPageContent(url, ticker);

    System.out.println(page);
  }

  public String constructURL(String ticker, String lptyp, int opid, int lpid)
  {
      String link = "http://www.pennystocktweets.com/user_posts/feeds?cat=search" +

              "&lptyp="     +   lptyp   +
              "&usrstk="    +   ticker;

      if (opid != 0)
      {
          link = link +
              "&opid="      +   opid    +
              "&lpid="      +   lpid;
      }

      return link;
  }

  private String GetPageContent(String url, String ticker) throws Exception {

    HttpGet request = new HttpGet(url);

    String RefererLink = "http://www.pennystocktweets.com/search/post/" + ticker.toUpperCase();

    request.setHeader("Host", "www.pennystocktweets.com");
    request.setHeader("Connection", "Keep-alive");
    request.setHeader("Accept", "*/*");
    request.setHeader("X-Requested-With", "XMLHttpRequest");
    request.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36");
    request.setHeader("Referer", RefererLink);
    request.setHeader("Accept-Language", "nl-NL,nl;q=0.8,en-US;q=0.6,en;q=0.4,fr;q=0.2");

    HttpResponse response = client.execute(request);
    int responseCode = response.getStatusLine().getStatusCode();

    System.out.println("\nSending 'GET' request to URL : " + url);
    System.out.println("Response Code : " + responseCode);

    BufferedReader rd = new BufferedReader(
                new InputStreamReader(response.getEntity().getContent()));

    StringBuffer result = new StringBuffer();
    String line = "";
    while ((line = rd.readLine()) != null) {
        result.append(line);
    }

    // set cookies
    setCookies(response.getFirstHeader("Set-Cookie") == null ? "" : 
                     response.getFirstHeader("Set-Cookie").toString());

    return result.toString();

  }

  public String getCookies() {
    return cookies;
  }

  public void setCookies(String cookies) {
    this.cookies = cookies;
  }
}

The same thing holds here: if I attach (my) cookie information, the request works just fine, and if I don't, it fails.

But I don't know how to get the cookie information and then use it in a new GET request.

So my question is:

How can I make 2 requests to a website such that:

On the first GET request, I get cookie information from the website and store this in my Java program

On the second GET request, I use the stored cookie information (as a Header) to make a new request.
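As a rough sketch of those two steps, the following uses the JDK 11+ java.net.http.HttpClient with a CookieManager (not Apache HttpClient) against a small stand-in server started inside the same program. The server, its /feeds path, the cookie sid=abc123 and the "general page" / "results" bodies are all assumptions made up to mimic the behaviour described above; the real site may work differently.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class TwoRequestSketch {

    // Stand-in server: without a "sid" cookie it sets one and answers
    // "general page"; with the cookie it answers "results".
    static HttpServer startDemoServer() throws java.io.IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/feeds", exchange -> {
            String cookie = exchange.getRequestHeaders().getFirst("Cookie");
            boolean known = cookie != null && cookie.contains("sid=");
            if (!known) {
                exchange.getResponseHeaders().add("Set-Cookie", "sid=abc123; Path=/");
            }
            byte[] body = (known ? "results" : "general page").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Sends the same GET twice through ONE client and returns both bodies.
    static String[] fetchTwice() {
        try {
            HttpServer server = startDemoServer();
            try {
                URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort()
                        + "/feeds?cat=search&usrstk=AAPL");
                // The cookie handler stores Set-Cookie values and replays them.
                HttpClient client = HttpClient.newBuilder()
                        .version(HttpClient.Version.HTTP_1_1)
                        .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
                        .build();
                HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
                String first = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
                String second = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
                return new String[] { first, second };
            } finally {
                server.stop(0);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String[] bodies = fetchTwice();
        System.out.println("first:  " + bodies[0]);
        System.out.println("second: " + bodies[1]);
    }
}
```

The point of the sketch is that both requests go through the same client instance, so the cookie never has to be copied into a header by hand.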

Note: I don't know if the cookie is a normal cookie or a session cookie but I suspect it's a session-cookie!
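One way to tell the two apart, sketched here with the JDK's java.net.HttpCookie: a cookie whose Set-Cookie value carries no Max-Age (or Expires) attribute has no stated lifetime and is treated as a session cookie. The class name and the cookie values below are hypothetical, since I have not inspected what the site really sends.

```java
import java.net.HttpCookie;
import java.util.List;

public class SessionCookieCheck {

    // Returns true when the first cookie in the Set-Cookie value has no
    // explicit lifetime; HttpCookie reports that as a max-age of -1.
    static boolean isSessionCookie(String setCookieValue) {
        List<HttpCookie> parsed = HttpCookie.parse(setCookieValue);
        return parsed.get(0).getMaxAge() == -1;
    }

    public static void main(String[] args) {
        // Hypothetical header values for illustration only.
        System.out.println(isSessionCookie("sid=abc123; Path=/"));
        System.out.println(isSessionCookie("sid=abc123; Max-Age=3600; Path=/"));
    }
}
```

Either kind is sent back the same way on the next request, so the distinction mostly matters for how long the stored value stays usable.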

All help is greatly appreciated!

Upvotes: 0

Views: 1814

Answers (1)

erny

Reputation: 2489

As the Apache Commons HttpClient documentation states in its cookie handling section: "HttpClient supports automatic management of cookies, including allowing the server to set cookies and automatically return them to the server when required. It is also possible to manually set cookies to be sent to the server."

Whenever the HTTP client receives cookies, it stores them in its HttpState and automatically adds them to subsequent requests made with the same client. This is the default behavior.
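For the manual route, a comparable idea can be sketched with the JDK's java.net.CookieManager and its CookieStore. This is the JDK analogue, not the Commons HttpClient API, and the class name, cookie name, value and URL are invented for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpCookie;
import java.net.URI;
import java.util.List;
import java.util.Map;

public class ManualCookieSketch {

    // Adds a cookie to the store by hand and returns the Cookie header
    // values the manager would then attach to a request for the same URL.
    static List<String> cookieHeaderAfterManualAdd(String url, String name, String value) {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        URI uri = URI.create(url);
        HttpCookie cookie = new HttpCookie(name, value);
        cookie.setPath("/");
        cookie.setVersion(0); // plain name=value form, as browsers send it
        manager.getCookieStore().add(uri, cookie);
        try {
            return manager.get(uri, Map.of()).getOrDefault("Cookie", List.of());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical cookie and URL.
        System.out.println(cookieHeaderAfterManualAdd("http://www.example.com/search", "sid", "abc123"));
    }
}
```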

The following example code prints the cookies held by the client after each of two GET requests. The code itself does not show which cookies are sent to the server, but a network sniffer such as ngrep shows the data actually transmitted:

import java.io.IOException;

import org.apache.commons.httpclient.Cookie;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.HttpState;
import org.apache.commons.httpclient.cookie.CookiePolicy;
import org.apache.commons.httpclient.methods.GetMethod;

public class HttpTest {

    public static void main(String[] args) throws HttpException, IOException {
        String url = "http://www.whatarecookies.com/cookietest.asp";
        HttpClient client = new HttpClient();
        client.getParams().setCookiePolicy(CookiePolicy.BROWSER_COMPATIBILITY);

        // First request: cookies set by the server are stored in the client's HttpState.
        HttpMethod method = new GetMethod(url);
        int res = client.executeMethod(method);
        System.out.println("Result: " + res);
        printCookies(client.getState());
        method.releaseConnection();

        // Second request on the same client: the stored cookies are sent back automatically.
        method = new GetMethod(url);
        res = client.executeMethod(method);
        System.out.println("Result: " + res);
        printCookies(client.getState());
        method.releaseConnection();
    }

    // Prints the name and value of every cookie currently held in the given state.
    public static void printCookies(HttpState state) {
        System.out.println("Cookies:");
        Cookie[] cookies = state.getCookies();
        for (Cookie cookie : cookies) {
            System.out.println("  " + cookie.getName() + ": " + cookie.getValue());
        }
    }
}

This is the output:

Result: 200
Cookies:
  active_template::468: %2Fresponsive%2Fthree_column_inner_ad
  3b74de5a1c2f311bee7bca5c368aaa4e: b326b5062b2f0e69046810717534cb09
Result: 200
Cookies:
  active_template::468: %2Fresponsive%2Fthree_column_inner_ad%2C+3b74de5a1c2f311bee7bca5c368aaa4e%3Db326b5062b2f0e69046810717534cb09
  3b74de5a1c2f311bee7bca5c368aaa4e: b326b5062b2f0e69046810717534cb09
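The value of active_template::468 after the second request is URL-encoded. A small sketch with the JDK's URLDecoder (the class name is only for illustration) decodes it; the result suggests the server appended the second cookie's name=value pair to the first cookie's value, which would only happen if it had received both cookies.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class DecodeCookieValue {

    // Decodes a URL-encoded cookie value (%XX escapes and '+' for space).
    static String decode(String raw) {
        return URLDecoder.decode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Value copied from the second "Cookies:" block of the output.
        String raw = "%2Fresponsive%2Fthree_column_inner_ad%2C+"
                + "3b74de5a1c2f311bee7bca5c368aaa4e%3Db326b5062b2f0e69046810717534cb09";
        System.out.println(decode(raw));
    }
}
```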

Here is an excerpt of ngrep:

MacBook$ sudo ngrep -W byline -d en0 "" host www.whatarecookies.com
interface: en0 (192.168.11.0/255.255.255.0)
filter: (ip) and ( dst host www.whatarecookies.com )
#####
T 192.168.11.70:56267 -> 54.228.218.117:80 [AP]
GET /cookietest.asp HTTP/1.1.
User-Agent: Jakarta Commons-HttpClient/3.1.
Host: www.whatarecookies.com.
.

####
T 54.228.218.117:80 -> 192.168.11.70:56267 [A]
HTTP/1.1 200 OK.
Server: nginx/1.4.0.
Date: Wed, 27 Nov 2013 10:22:14 GMT.
Content-Type: text/html; charset=iso-8859-1.
Content-Length: 36397.
Connection: keep-alive.
Vary: Accept-Encoding.
Vary: Cookie,Host,Accept-Encoding.
Set-Cookie: active_template::468=%2Fresponsive%2Fthree_column_inner_ad; expires=Fri, 29-Nov-2013 10:22:01 GMT; path=/; domain=whatarecookies.com; httponly.
Set-Cookie: 3b74de5a1c2f311bee7bca5c368aaa4e=b326b5062b2f0e69046810717534cb09; expires=Thu, 27-Nov-2014 10:22:01 GMT.
X-Middleton-Response: 200.
Cache-Control: max-age=0, no-cache.
X-Mod-Pagespeed: 1.7.30.1-3609.
.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/1998/REC-html40-19980424/loose.dtd">
...

##
T 192.168.11.70:56267 -> 54.228.218.117:80 [AP]
GET /cookietest.asp HTTP/1.1.
User-Agent: Jakarta Commons-HttpClient/3.1.
Host: www.whatarecookies.com.
Cookie: active_template::468=%2Fresponsive%2Fthree_column_inner_ad.
Cookie: 3b74de5a1c2f311bee7bca5c368aaa4e=b326b5062b2f0e69046810717534cb09.
.

##
T 54.228.218.117:80 -> 192.168.11.70:56267 [A]
HTTP/1.1 200 OK.
Server: nginx/1.4.0.
Date: Wed, 27 Nov 2013 10:22:18 GMT.
Content-Type: text/html; charset=iso-8859-1.
Content-Length: 54474.
Connection: keep-alive.
Vary: Accept-Encoding.
Vary: Cookie,Host,Accept-Encoding.
Set-Cookie: active_template::468=%2Fresponsive%2Fthree_column_inner_ad%2C+3b74de5a1c2f311bee7bca5c368aaa4e%3Db326b5062b2f0e69046810717534cb09; expires=Fri, 29-Nov-2013 10:22:05 GMT; path=/; domain=whatarecookies.com; httponly.
Set-Cookie: 3b74de5a1c2f311bee7bca5c368aaa4e=b326b5062b2f0e69046810717534cb09; expires=Thu, 27-Nov-2014 10:22:05 GMT.
X-Middleton-Response: 200.
Cache-Control: max-age=0, no-cache.
X-Mod-Pagespeed: 1.7.30.1-3609.
.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/1998/REC-html40-19980424/loose.dtd">
...

Upvotes: 1
