Top "Pdf-extraction" questions

Extracting text and other data from a PDF document, regardless of the libraries used to achieve this.

How to extract text from pdf in Python 3.7

I am trying to extract text from a PDF file using Python. My main goal is I am trying to …

python pdf python-3.7 pypdf2 pdf-extraction
How to export pdf form fields to xml automatically

I have a PDF file including form fields and need to export the data into an XML file AUTOMATICALLY. Here …

java xml python-2.7 acrobat pdf-extraction
If identifying text structure in PDF documents is so difficult, how do PDF readers do it so well?

I have been trying to write a simple console application or PowerShell script to extract the text from a large …

pdf itext pdf-extraction
How to check if PDF is scanned image or contains text

I have a large number of files, some of them are images scanned into PDF and some are full/partial …

python python-3.x pypdf2 pdfminer pdf-extraction
How to extract the contents of a table in pdf file?

I want to extract the contents of a table in a PDF like this: I wrote this Java program using …

java pdf itext text-extraction pdf-extraction
iText - Get Font size and family of a text segment

I'm currently trying to automatically extract important keywords from a PDF file. I am able to get the text information …

java pdf itext text-extraction pdf-extraction
Extracting Text from a PDF with CID fonts

I'm writing a web app that extracts a line at the top of each page in a PDF. The PDFs …

pdf fonts itextsharp pdf-extraction