Generate PDF Behind Authentication Wall

Chords · Apr 23, 2012 · Viewed 11.3k times

I'm trying to use wkhtmltopdf to generate a PDF of a page that requires me to log in first. There's some material on this on the internet already, but I can't seem to get mine working. I'm in Terminal - nothing fancy.

I've tried (among a whole lot of other stuff):

/usr/bin/wkhtmltopdf --post username=myusername --post password=mypassword "URL to Generate" test.pdf

/usr/bin/wkhtmltopdf --username myusername --password mypassword "URL to Generate" test.pdf

/usr/bin/wkhtmltopdf --cookie-jar my.jar --post username=myusername --post password=mypassword "URL to Generate Cookie For"

username and password are both the id and the name of the input fields on the form. I am getting the my.jar file to show up, but nothing is written to it.

Specific questions:

  1. Should I be specifying the login page and/or form action anywhere?
  2. The --cookie-jar parameter has been mentioned in various places (both as being needed and otherwise). If it is necessary, how does it work? I've created the my.jar file, but how do I use it again? Referencing:

http://code.google.com/p/wkhtmltopdf/issues/detail?id=356


EDIT:

Surely someone has done this successfully? A good way to showcase an example might be for someone to get it working on some popular website that requires login credentials, to eliminate a potential variable.

Answer

hsanders · May 1, 2012

Every site's login form is different. What you're going to want to do is work out everything you need to pass to that login form's target by reading the HTML on the page (which you're probably aware of). It may take an additional hidden field on top of the username/password fields, used to prevent cross-site request forgery.
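
One rough way to read the form is to save the login page to a local file and list its form and input tags. This is only a sketch: list_form_fields is a made-up helper name, and it assumes each tag sits on a single line and that your grep supports -o (GNU and BSD grep do).

```shell
#!/bin/sh
# Hypothetical helper: print the <form> and <input> tags found in a saved
# copy of the login page, so the ACTION URL and the field names are visible.
# Assumption: each tag sits on one line; a tag split across lines is missed.
list_form_fields() {
    grep -i -o -E '<(form|input)[^>]*>' "$1"
}
```

For example, with the login page saved as login.html (a placeholder name), `list_form_fields login.html` prints one tag per line. Any hidden input in that list is a candidate for an extra --post field. Keep in mind that a token value is often tied to the session, so a copied value may not stay valid.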

The --cookie-jar parameter names a file in which wkhtmltopdf stores the cookies it gets back from the web server. You need to specify it in the first request to the login form, and again in subsequent requests, so that they keep using the cookie/session information the web server gave you after logging in.

So to sum it up:

  1. Look and see if there are any additional parameters on the page required.
  2. Make sure the URL you are submitting to is the same as the ACTION attribute of the form element on that page.
  3. Use the --cookie-jar parameter in both the login request and the second content request.
  4. The syntax for the --post parameter is --post username user_name_value --post password password_value, with the field name and its value as separate arguments rather than name=value.
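
Put together, the steps above might look something like the sketch below. The URLs, the login-result.pdf file name, and the login/render function names are placeholders rather than anything from the question. The field names username and password are the ones the question mentions, and a real form may need more. Note that wkhtmltopdf expects an output file as its last argument, so the login step writes a throwaway PDF.

```shell
#!/bin/sh
# Sketch only: every URL and file name here is a placeholder.
LOGIN_ACTION_URL="https://example.com/login"   # hypothetical: the form's ACTION target
TARGET_URL="https://example.com/report"        # hypothetical: the page to render
COOKIE_JAR="my.jar"
WKHTMLTOPDF="/usr/bin/wkhtmltopdf"

# Step 1: POST the credentials to the form's ACTION URL so the session
# cookie is written to the jar. The PDF produced here is a throwaway.
login() {
    "$WKHTMLTOPDF" --cookie-jar "$COOKIE_JAR" \
        --post username "$1" \
        --post password "$2" \
        "$LOGIN_ACTION_URL" login-result.pdf
}

# Step 2: request the protected page with the same jar, so the session
# cookie stored in step 1 is sent along with the request.
render() {
    "$WKHTMLTOPDF" --cookie-jar "$COOKIE_JAR" "$TARGET_URL" "$1"
}
```

Running `login myusername mypassword && render test.pdf` would then perform both requests in order. If my.jar is still empty after the login step, the form most likely expects a different ACTION URL or an additional field.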