I have a requirement to split a large PDF document into smaller files based on its content. We use BCL easyPDF to manipulate PDF files. easyPDF can split a PDF document at a given page number, but it cannot split it based on the content. As far as I can tell, it also has no search function for determining where a piece of content is located (if I am wrong, please let me know).
Can someone tell me how to find the location of text in a PDF file using .NET?
Thanks
You might try the Docotic.Pdf library for your task.
The library can extract text from PDFs (with or without formatting).
Alternatively, you could retrieve a collection of words with their bounding rectangles from a PDF. This should help you find the location of the text in a file.
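Once you have the text of each page, deciding where to split does not depend on any particular PDF library. Here is a minimal sketch of that step, written in Python for brevity (it should port directly to C#). It assumes you have already extracted the text page by page into a list, and that a new chunk should start on every page containing some marker string. The names `split_ranges`, `page_texts` and `marker` are made up for illustration and are not part of any library.

```python
def split_ranges(page_texts, marker):
    """Return 1-based, inclusive (first_page, last_page) ranges.

    A new range starts on every page whose text contains `marker`.
    Pages before the first match form their own leading range.
    """
    if not page_texts:
        return []

    # 1-based numbers of the pages on which a new chunk begins.
    starts = [
        number
        for number, text in enumerate(page_texts, start=1)
        if marker in text
    ]

    # Make sure page 1 always opens a chunk, even without the marker.
    if not starts or starts[0] != 1:
        starts.insert(0, 1)

    # Each chunk ends on the page just before the next chunk starts;
    # the final chunk runs to the last page of the document.
    ends = [start - 1 for start in starts[1:]] + [len(page_texts)]

    return list(zip(starts, ends))


# Hypothetical per-page text, standing in for real extraction output.
pages = ["Cover sheet", "Invoice 1001", "line items", "Invoice 1002", "line items"]
ranges = split_ranges(pages, "Invoice")
```

Each resulting page range can then be passed to whatever page-number-based split you already use, such as the one in easyPDF.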
Disclaimer: I work for the vendor of the library.