PDF to HTML with PHP

silvia · Jul 9, 2015 · Viewed 11.4k times

I need to convert some PDF files to HTML. I downloaded pdftohtml for PHP, but I don't know how to use it. I am trying to run it with this code:

<?php  
    include 'pdf-to-html-master/src/Gufy/PdfToHtml.php';
    $pdf = new \Gufy\PdfToHtml;
    $pdf->open('1400.pdf');
    $pdf->generate();
?>

This results in a blank web page.

What do I need to modify? What is the correct code to run this script?

Answer

varunsinghal · Jul 9, 2015

The first option is to use poppler-utils:

<?php
// If you are using Composer, load its autoloader:
include 'vendor/autoload.php';
// If not, include the class file directly instead:
// include 'src/Gufy/PdfToHtml.php';

// Create the converter
$pdf = new \Gufy\PdfToHtml;
// Open the PDF file
$pdf->open('file.pdf');
// Optional: set a different output directory for the generated HTML files.
// Skip this call to convert in the same directory as file.pdf.
$pdf->setOutputDirectory('/your/absolute/directory/path');
// Run the conversion
$pdf->generate();
// Optional: remove all generated files once you no longer need them
$pdf->clearOutputDirectory();
?>
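
If the page is still blank, it may help to rule out the environment first. Judging by the output directory setting above, generate() seems to write HTML files to disk rather than print anything, so a blank page does not by itself mean the conversion failed. The sketch below is not part of the library: it turns on error display and calls poppler's pdftohtml command directly, assuming PHP 7.1 or later, that exec() is enabled, and that pdftohtml is on the PATH. The helper names buildPdfToHtmlCommand and convertPdf are made up for illustration.

```php
<?php
// Show PHP errors in the browser instead of a blank page (debugging only).
ini_set('display_errors', '1');
error_reporting(E_ALL);

// Hypothetical helper: build the shell command for one PDF.
// -noframes asks pdftohtml for plain HTML without a frameset.
function buildPdfToHtmlCommand(string $pdfPath, string $outputBase): string
{
    return 'pdftohtml -noframes '
        . escapeshellarg($pdfPath) . ' '
        . escapeshellarg($outputBase) . ' 2>&1';
}

// Hypothetical helper: run the conversion and return [success, log text].
function convertPdf(string $pdfPath, string $outputBase): array
{
    if (!is_readable($pdfPath)) {
        return [false, "Cannot read $pdfPath"];
    }
    exec(buildPdfToHtmlCommand($pdfPath, $outputBase), $lines, $exitCode);
    return [$exitCode === 0, implode("\n", $lines)];
}

// Usage: convert 1400.pdf and report what happened.
$outputBase = __DIR__ . '/output';
[$ok, $log] = convertPdf('1400.pdf', $outputBase);
if ($ok) {
    // List whatever HTML files were written with the chosen base name.
    $files = glob($outputBase . '*.html') ?: [];
    echo 'Converted: ' . htmlspecialchars(implode(', ', $files));
} else {
    echo 'Failed: ' . htmlspecialchars($log);
}
```

If this prints a "command not found" style message, poppler-utils is probably not installed or not visible to the web server user, which would also stop the library from producing output.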

Download the library from here.

The second option is to use pdf.js:

PDFJS.getDocument('helloworld.pdf')