I'm trying to authenticate myself to a website that uses form-based authentication (e.g., facebook.com) using the Apache HttpClient Java library.
Following the example program at http://www.elitejavacoder.com/2013/10/http-client-form-based-authentication.html, I was able to do it, but there are a few things about the program that I don't understand. Here is the code:
    package com.elitejavacoder.http.client;

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.http.HttpEntity;
    import org.apache.http.HttpHost;
    import org.apache.http.HttpResponse;
    import org.apache.http.NameValuePair;
    import org.apache.http.client.entity.UrlEncodedFormEntity;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.client.params.ClientPNames;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.message.BasicNameValuePair;
    import org.apache.http.util.EntityUtils;

    public class HttpClientFormAuthentication {
        public static void main(String[] args) {
            String host = "yourhostname.com";
            int port = 8080;
            String protocol = "http";
            DefaultHttpClient client = new DefaultHttpClient();
            try {
                // Every request with a relative URI goes to this host
                HttpHost httpHost = new HttpHost(host, port, protocol);
                client.getParams().setParameter(ClientPNames.DEFAULT_HOST, httpHost);

                // 1. Request the secured resource without being authenticated
                HttpGet securedResource = new HttpGet("/secured/index.jsp");
                HttpResponse httpResponse = client.execute(securedResource);
                HttpEntity responseEntity = httpResponse.getEntity();
                String strResponse = EntityUtils.toString(responseEntity);
                int statusCode = httpResponse.getStatusLine().getStatusCode();
                EntityUtils.consume(responseEntity);
                System.out.println("Http status code for Unauthenticated Request: " + statusCode); // Status code should be 200
                System.out.println("Response for Unauthenticated Request: \n" + strResponse); // Should be login page
                System.out.println("================================================================\n");

                // 2. Submit the login form
                HttpPost authpost = new HttpPost("/j_security_check");
                List<NameValuePair> nameValuePairs = new ArrayList<NameValuePair>();
                nameValuePairs.add(new BasicNameValuePair("j_username", "yourusername"));
                nameValuePairs.add(new BasicNameValuePair("j_password", "yourpassword"));
                authpost.setEntity(new UrlEncodedFormEntity(nameValuePairs));
                httpResponse = client.execute(authpost);
                responseEntity = httpResponse.getEntity();
                strResponse = EntityUtils.toString(responseEntity);
                statusCode = httpResponse.getStatusLine().getStatusCode();
                EntityUtils.consume(responseEntity);
                System.out.println("Http status code for Authentication Request: " + statusCode); // Status code should be 302
                System.out.println("Response for Authentication Request: \n" + strResponse); // Should be blank string
                System.out.println("================================================================\n");

                // 3. Request the secured resource again, now authenticated
                httpResponse = client.execute(securedResource);
                responseEntity = httpResponse.getEntity();
                strResponse = EntityUtils.toString(responseEntity);
                statusCode = httpResponse.getStatusLine().getStatusCode();
                EntityUtils.consume(responseEntity);
                System.out.println("Http status code for Authenticated Request: " + statusCode); // Status code should be 200
                System.out.println("Response for Authenticated Request: \n" + strResponse); // Should be actual page
                System.out.println("================================================================\n");
            }
            catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }
I have the following questions (the line numbers I refer to are those in the linked article, since Stack Overflow doesn't allow line numbers in code):
What exactly is "/j_security_check" (line 41)? And how did the author know that he had to use "j_security_check" instead of the name of the secured resource?
Why is the string assigned by "strResponse = EntityUtils.toString(responseEntity);" (line 49), two lines after "httpResponse = client.execute(authpost);" (line 47), different from the string assigned by "strResponse = EntityUtils.toString(responseEntity);" (line 59), two lines after "httpResponse = client.execute(securedResource);" (line 57)?
Basically, what changes in "client" between lines 47 and 57?
Thank you
/j_security_check is a form action. It tells the container that the request is an authentication request, and the container handles it. It is a URL for submitting authentication forms that is specific to Enterprise Java application servers.
j_username and j_password are the names of the request parameters that carry the username and password. All three names must be used exactly as given (j_security_check, j_username, and j_password) so that the container treats the request as an authentication request and can retrieve the required information (the username and password) from it.
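To make the login POST described above concrete, here is a minimal sketch of roughly what goes over the wire when the form is submitted. It uses only the JDK rather than HttpClient's UrlEncodedFormEntity, and the class name FormBodySketch, the helper encodeForm, and the sample password are made up for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of HttpClient or the original program.
class FormBodySketch {

    /** Joins the fields as name=value pairs separated by '&', URL-encoding each part. */
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("j_username", "yourusername");
        fields.put("j_password", "p@ss word"); // sample value with characters that need encoding

        // The request line and header a container expects for form-based login.
        System.out.println("POST /j_security_check");
        System.out.println("Content-Type: application/x-www-form-urlencoded");
        System.out.println();
        System.out.println(encodeForm(fields));
    }
}
```

The container only looks for the two parameter names in the body, so a differently named field (say, "username") would simply not be seen as a credential.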
The author knew to use /j_security_check because he or she is assuming that the target is a J2EE app server. This is not a great assumption. Notice that the port is set to 8080? That is the port typically used by Java servers like Tomcat so they don't collide with port 80 on an HTTP server.
strResponse after the request on line 47 contains the body of the response to the login request (which is empty), and strResponse after the request on line 57 contains the content of the secured page. This is the breakdown:
The following would happen if you were doing this in a web browser.
Line 31 is the initial page access without authentication.
Lines 38-39 display the login form.
Lines 41-45 are the equivalent of typing your username and password into a form.
Line 47 is like hitting the Submit button.
Line 49 shows what the server sent in response. Notice that the comment in line 54 is "Should be blank string". When you submit the username and password, what you care about most in the response is the HTTP status. The comment on the line that prints the status code says "Status code should be 302". 302 is the HTTP status that tells the browser to redirect. The response headers contain the address to redirect to (in the Location header). The response headers also contain the authentication cookie. It would be nice if those headers were printed out too, since that would help with understanding how this all works. The code does the redirect manually on line 57, but it assumes that it will be redirected to the secured page it tried to access on line 31, rather than retrieving that address from the HTTP response headers.
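As a rough sketch of reading the redirect address from the response instead of assuming it, the helper below takes the status code and the Location header value and works out the next URL. It uses only the JDK. The class RedirectSketch and the method redirectTarget are hypothetical names, and the sample header value is an assumption about what the server might send. In the program above, the header value could be obtained with something like httpResponse.getFirstHeader("Location").

```java
import java.net.URI;
import java.util.Optional;

// Hypothetical helper class, not part of HttpClient or the original program.
class RedirectSketch {

    /**
     * Returns the URL to request next if the response is a redirect that
     * carries a Location header, or an empty Optional otherwise.
     */
    static Optional<URI> redirectTarget(URI requestUri, int statusCode, String locationHeader) {
        boolean isRedirect = statusCode == 301 || statusCode == 302 || statusCode == 303
                || statusCode == 307 || statusCode == 308;
        if (!isRedirect || locationHeader == null || locationHeader.isBlank()) {
            return Optional.empty();
        }
        // Location may be absolute or relative; resolve() handles both cases.
        return Optional.of(requestUri.resolve(locationHeader.trim()));
    }

    public static void main(String[] args) {
        URI loginUri = URI.create("http://yourhostname.com:8080/j_security_check");
        // Assumed sample: a 302 whose Location points back at the secured page.
        Optional<URI> next = redirectTarget(loginUri, 302, "/secured/index.jsp");
        System.out.println(next.map(URI::toString).orElse("no redirect"));
    }
}
```

If the login fails, many containers redirect to an error page instead, so checking the resolved address is also a cheap way to tell success from failure.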
The biggest change to client is that by line 57 client has the authentication cookie, similar to the browser operation. DefaultHttpClient handles all that for you under the hood.
The authentication cookie comes from the server in the form of a Set-Cookie HTTP header. This tells the client to store the cookie. Then, when making a request, the client sends a Cookie HTTP header, along with the cookie data.
The client initially receives the cookie on the response that contains the login form, which it stores. When the client sends back the filled-in form, that cookie is also included in the request, and in every request to the server thereafter. So once you've authenticated, the server stores that information and associates it with the cookie. Then, when subsequent requests come from the client, the server sees the cookie and remembers that you already authenticated. The client does all the same things a browser does to manage cookie data transfer with the server.
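The cookie round trip described above can be sketched with the JDK's own java.net.CookieManager, which plays the same role here as the cookie store inside DefaultHttpClient. This is an illustration under assumptions, not what HttpClient does internally: the class CookieRoundTripSketch and its method are hypothetical, and the cookie name JSESSIONID and its value are sample data (JSESSIONID is the name servlet containers commonly use for the session cookie).

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of HttpClient or the original program.
class CookieRoundTripSketch {

    /**
     * Stores the Set-Cookie value received from responseUri, then returns the
     * Cookie header value(s) the client would attach to a request for nextRequestUri.
     */
    static List<String> cookiesSentAfter(URI responseUri, String setCookieValue, URI nextRequestUri)
            throws IOException {
        CookieManager manager = new CookieManager();
        // Step 1: the server's response carries Set-Cookie, and the client stores it.
        manager.put(responseUri, Map.of("Set-Cookie", List.of(setCookieValue)));
        // Step 2: for a later request, the client looks up the cookies that match the URL.
        Map<String, List<String>> requestHeaders = manager.get(nextRequestUri, Collections.emptyMap());
        return requestHeaders.getOrDefault("Cookie", List.of());
    }

    public static void main(String[] args) throws IOException {
        URI loginPage = URI.create("http://yourhostname.com:8080/secured/index.jsp");
        URI loginPost = URI.create("http://yourhostname.com:8080/j_security_check");
        // Assumed sample header value; a real container generates its own session id.
        List<String> sent = cookiesSentAfter(loginPage, "JSESSIONID=ABC123; Path=/; HttpOnly", loginPost);
        System.out.println("Cookie: " + String.join("; ", sent));
    }
}
```

A request to a different host would get no Cookie header from the same manager, which is why the session only "sticks" for the server that issued it.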