There are nice projects that generate PDFs from HTML/CSS/JS files.
I want to programmatically control the Chrome or Firefox browser (because both are cross-platform) to make it load a web page, run its scripts, style the page, and generate a PDF file for printing.
But how do I start controlling the browser in an automated way, so that I can run something like:
render-to-pdf file-to-render.html out.pdf
I can easily do this manually by opening the page and printing it to PDF, and I get an accurate, 100% spec-compliant rendering of the HTML/CSS/JS page in a PDF file. The URL headers can even be omitted from the PDF through the browser's configuration options. But again, how do I start automating this process?
I want to automate, on the server side, opening the browser, navigating to a page, and generating the PDF from the page as the browser rendered it.
I have done a lot of research, but I just don't know how to ask the right question. I want to programmatically control the browser, maybe like Selenium does, but to the point where I can export a web page as a PDF (hence using the browser's rendering capabilities to produce good PDFs).
I'm not an expert, but PhantomJS seems to be the right tool for the job. Note that it is a headless browser built on WebKit (via Qt), not on Chrome/Chromium, so its output may differ slightly from what those browsers print.
var page = require('webpage').create();

page.open('http://github.com/', function(status) {
    // Bail out if the page could not be loaded
    if (status !== 'success') {
        console.log('Unable to load the page');
        phantom.exit(1);
        return;
    }

    // Measure the full document inside the page context
    var s = page.evaluate(function() {
        var body = document.body,
            html = document.documentElement;

        var height = Math.max(body.scrollHeight, body.offsetHeight,
            html.clientHeight, html.scrollHeight, html.offsetHeight);

        var width = Math.max(body.scrollWidth, body.offsetWidth,
            html.clientWidth, html.scrollWidth, html.offsetWidth);

        return {width: width, height: height};
    });

    console.log(JSON.stringify(s));

    // Make the paper as tall as the document so it fits on a single page
    page.paperSize = {
        width: '1980px',
        height: s.height + 'px',
        margin: {
            top: '50px',
            left: '20px'
        }
    };

    page.render('github.pdf');
    phantom.exit();
});
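To get closer to the render-to-pdf file-to-render.html out.pdf invocation from the question, the script could read its input and output from the command line through PhantomJS's system module. This is only a minimal sketch: the file name render-to-pdf.js and the helper parseArgs are made up for illustration, the A4 paper settings are an assumption, and depending on your setup a local file may need to be passed as a file:// URL.

```javascript
// render-to-pdf.js (hypothetical name)
// Intended usage: phantomjs render-to-pdf.js file-to-render.html out.pdf

// Illustrative helper: args[0] is the script name, followed by input and output.
function parseArgs(args) {
    if (args.length !== 3) {
        return null;
    }
    return { input: args[1], output: args[2] };
}

// Only touch the PhantomJS API when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
    var system = require('system');
    var opts = parseArgs(system.args);

    if (!opts) {
        console.log('Usage: phantomjs render-to-pdf.js <input> <output.pdf>');
        phantom.exit(1);
    } else {
        var page = require('webpage').create();

        // Assumed paper settings; adjust to taste.
        page.paperSize = { format: 'A4', orientation: 'portrait', margin: '1cm' };

        page.open(opts.input, function (status) {
            if (status !== 'success') {
                console.log('Unable to load ' + opts.input);
                phantom.exit(1);
                return;
            }
            // The .pdf extension of the output name selects PDF output.
            page.render(opts.output);
            phantom.exit(0);
        });
    }
}
```

A small shell wrapper named render-to-pdf that calls phantomjs render-to-pdf.js "$@" would then give the exact command line from the question.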
Hope it helps.