How to set cookies on an HTTP GET request using Java

fysob · Jul 14, 2010 · Viewed 49.9k times

I want to do a manual GET with cookies in order to download and parse a web page. I need to extract the security token so that I can post to the forum. I have completed the login, read the response and extracted the cookies (3 (name, value) pairs). I then built the String containing the cookies like this:

CookieString="name1=value1; name2=value2; name3=value3"

I then do the following

// Link and CookieString are defined earlier
HttpURLConnection connection;
connection = (HttpURLConnection) new URL(Link).openConnection();
connection.setRequestMethod("GET");
connection.setRequestProperty("Connection", "Keep-Alive");
connection.setRequestProperty("Cookie", CookieString);
connection.connect();

I then read the page, but it shows that I am not logged in to the forum. What am I doing wrong?

Edit: I know that I must extract the security token if I want to make a post. My train of thought was that, in order to extract it, I need to GET this particular page. But for the security token to appear as a hidden field I must be logged in, so I need the cookies. When I GET the page with the cookies set as described above, I get the page as a guest: it shows that I am not logged in, and the value of the security token is "guest", which is of no use to me. I will check the link you gave me and hopefully find a solution.

Answer

BalusC · Jul 14, 2010

To be sure, you should be gathering the cookies from the response's Set-Cookie headers. To send them back in the subsequent requests, you should set them one by one using URLConnection#addRequestProperty().

Basically:

// ...

// Grab Set-Cookie headers:
List<String> cookies = connection.getHeaderFields().get("Set-Cookie");

// ...

// Send them back in subsequent requests (on the new connection, before it connects):
for (String cookie : cookies) {
    connection.addRequestProperty("Cookie", cookie.split(";", 2)[0]);
}

// ...

The split(";", 2) is there to get rid of cookie attributes which are irrelevant to the server side, such as expires, path, etc.
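As a minimal sketch of that step, the helper below strips the attributes from each Set-Cookie value and joins the remaining name=value pairs into a single string, the same shape as the CookieString in the question. The class and method names (CookieHeaderSketch, cookiePairs, toCookieHeader) and the sample header values are made up for illustration and are not part of any library. It also guards against getHeaderFields().get("Set-Cookie") returning null, which happens when the response carries no such header.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class CookieHeaderSketch {

    /** Keeps only the leading name=value pair of each Set-Cookie value. */
    static List<String> cookiePairs(List<String> setCookieHeaders) {
        List<String> pairs = new ArrayList<>();
        if (setCookieHeaders == null) {
            return pairs; // the response had no Set-Cookie headers
        }
        for (String header : setCookieHeaders) {
            // Everything after the first ';' is an attribute (Path, Expires, ...).
            pairs.add(header.split(";", 2)[0].trim());
        }
        return pairs;
    }

    /** Joins the pairs into one value usable for a "Cookie" request header. */
    static String toCookieHeader(List<String> setCookieHeaders) {
        return String.join("; ", cookiePairs(setCookieHeaders));
    }

    public static void main(String[] args) {
        // Sample values, assumed for the example.
        List<String> headers = List.of(
                "name1=value1; Path=/; HttpOnly",
                "name2=value2; Expires=Wed, 21 Oct 2015 07:28:00 GMT");
        System.out.println(toCookieHeader(headers)); // prints: name1=value1; name2=value2
    }
}
```

The joined string can then be passed to setRequestProperty("Cookie", ...) on the next connection before it connects.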

For a more convenient HTTP client, I'd suggest having a look at Apache HttpComponents Client. It can handle all the cookie stuff more transparently.
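If adding a library is not an option, the standard library offers something similar. This is a different technique from HttpComponents, shown only as a sketch: installing a java.net.CookieManager as the default CookieHandler makes HttpURLConnection store Set-Cookie values and send them back on later requests by itself. The class and method names (CookieJarSketch, installCookieJar, open) are hypothetical, and ACCEPT_ALL is an assumed policy chosen for simplicity.

```java
import java.io.IOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, for illustration only.
class CookieJarSketch {

    /** Installs a JVM-wide in-memory cookie store; call once before opening connections. */
    static CookieManager installCookieJar() {
        // A null store means "use the default in-memory store".
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }

    /** Opens a connection; cookies are attached by the installed handler when it connects. */
    static HttpURLConnection open(String link) throws IOException {
        return (HttpURLConnection) new URL(link).openConnection();
    }
}
```

With this in place, the login request and the later GET only need to go through the same JVM; no Cookie header has to be set by hand.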

See also:


Update: as per the comments, this is not a cookie problem. A wrong request token means that the server has CSRF/bot prevention built in (to prevent people like you). You need to extract the token from the hidden input field of the requested page containing the form and resend it as a request parameter. Jsoup may be useful for extracting all (hidden) input fields. Don't forget to also pass the name-value pair of the button which you'd like to "press" programmatically. Also see the above-mentioned link for more hints.
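To make the flow concrete, here is a rough sketch that uses only the standard library. It swaps Jsoup for a simple regular expression, which is brittle (it only understands double-quoted attributes and does not decode HTML entities), so treat it as an illustration rather than a replacement for a real HTML parser. The class name, the method names and the field names securitytoken and sbutton are assumptions; use whatever names the actual form contains. URLEncoder.encode(String, Charset) requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
class HiddenFieldsSketch {

    private static final Pattern INPUT_TAG =
            Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern ATTRIBUTE =
            Pattern.compile("\\b([a-zA-Z-]+)\\s*=\\s*\"([^\"]*)\"");

    /** Collects name/value pairs of <input type="hidden"> tags, in document order. */
    static Map<String, String> hiddenFields(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher tag = INPUT_TAG.matcher(html);
        while (tag.find()) {
            Map<String, String> attrs = new LinkedHashMap<>();
            Matcher attr = ATTRIBUTE.matcher(tag.group());
            while (attr.find()) {
                attrs.put(attr.group(1).toLowerCase(), attr.group(2));
            }
            if ("hidden".equalsIgnoreCase(attrs.get("type")) && attrs.containsKey("name")) {
                fields.put(attrs.get("name"), attrs.getOrDefault("value", ""));
            }
        }
        return fields;
    }

    /** Encodes the fields as an application/x-www-form-urlencoded request body. */
    static String formBody(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }
}
```

Typical use would be to call hiddenFields on the downloaded page, put the button's name-value pair (and the visible form fields) into the returned map, and write formBody(...) to the output stream of the POST connection.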

In the future, you should really be clearer about the exact error you get rather than guessing wildly. Copy and paste the exact error message and so on.