I'm trying to convert PDF to HTML programmatically. So far I've been using pdftohtml, but our users are not happy with the results.
Here's what I need:
I'm using Ruby on Rails, but any tool that runs on Unix would do, since I can call it from the command line. A nice gem or plugin would of course be ideal.
I'd prefer it to be open source.
It needs to be able to handle images.
It would be nice if there was an option to discard images if needed
It needs to be stable
It needs to return HTML with a layout close to the original PDF (I've tried pdftohtml and the result is not that good in a lot of cases).
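
For reference, here is a minimal sketch of how a command-line converter can be called from Ruby with images as an optional extra. It assumes Poppler's pdftohtml is on the PATH and uses its `-c` (complex output, which tries to preserve layout) and `-i` (ignore images) flags. The method names `pdftohtml_command` and `convert_pdf_to_html` are made up for illustration, and the same pattern should apply to whichever tool replaces pdftohtml.

```ruby
require "open3"

# Hypothetical helper: builds the argument list for Poppler's pdftohtml.
# -c asks for "complex" output that tries to keep the original layout.
# -i tells pdftohtml to ignore images.
def pdftohtml_command(pdf_path, out_base, keep_images: true)
  cmd = ["pdftohtml", "-c"]
  cmd << "-i" unless keep_images
  cmd + [pdf_path, out_base]
end

# Hypothetical wrapper: runs the converter and raises if it fails.
# Passing the command as separate arguments avoids shell quoting problems
# with file names that contain spaces.
def convert_pdf_to_html(pdf_path, out_base, keep_images: true)
  cmd = pdftohtml_command(pdf_path, out_base, keep_images: keep_images)
  _stdout, stderr, status = Open3.capture3(*cmd)
  raise "pdftohtml failed: #{stderr}" unless status.success?
  true
end
```

Calling `convert_pdf_to_html("report.pdf", "report", keep_images: false)` would then run the conversion without images. The names of the generated files depend on the flags used, so the sketch does not try to guess them.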
Here are a couple more alternatives to pdftohtml/xpdf: