Extract Image from PDF using Java

Nick Lam · Aug 15, 2011 · Viewed 25.3k times

I need to extract only a barcode from a PDF (by specifying a rectangle), without converting the whole PDF into an image.

The output image format can be JPG or PNG.

Answer

zawhtut · Aug 15, 2011

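You can use PDFBox. The snippet below uses the PDFBox 1.x API to walk every page and write each embedded image to a file. Note that it extracts whole embedded images rather than a rectangular region: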

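import java.util.Iterator;
import java.util.List;
import java.util.Map;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage;

// `document` is an already loaded org.apache.pdfbox.pdmodel.PDDocument.
List pages = document.getDocumentCatalog().getAllPages();
Iterator iter = pages.iterator();
while( iter.hasNext() )
{
    PDPage page = (PDPage)iter.next();
    PDResources resources = page.getResources();
    // Image XObjects of this page, keyed by their resource name.
    Map images = resources.getImages();
    if( images != null )
    {
        Iterator imageIter = images.keySet().iterator();
        while( imageIter.hasNext() )
        {
            String key = (String)imageIter.next();
            PDXObjectImage image = (PDXObjectImage)images.get( key );
            // getUniqueFileName is a helper that is not shown here; it is not a PDFBox API method.
            String name = getUniqueFileName( key, image.getSuffix() );
            System.out.println( "Writing image:" + name );
            // write2file appends "." plus the image suffix to the given name.
            image.write2file( name );
        }
    }
}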
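
The snippet calls getUniqueFileName without defining it. A minimal sketch of such a helper follows, assuming it should return a base name (without the suffix) for which no file with that suffix exists yet in the working directory. The class name UniqueFileNames and the "prefix-counter" naming scheme are assumptions made for illustration, not something the answer specifies.

```java
import java.io.File;

final class UniqueFileNames {
    // Shared counter so successive calls produce increasing numbers.
    private static int imageCounter = 1;

    private UniqueFileNames() {
    }

    /**
     * Returns a base name such that baseName + "." + suffix does not
     * currently exist in the working directory.
     */
    static String getUniqueFileName(String prefix, String suffix) {
        String uniqueName;
        File candidate;
        do {
            uniqueName = prefix + "-" + imageCounter;
            candidate = new File(uniqueName + "." + suffix);
            imageCounter++;
        } while (candidate.exists());
        return uniqueName;
    }

    public static void main(String[] args) {
        // Two calls with the same prefix yield two different base names.
        System.out.println(getUniqueFileName("Im0", "jpg"));
        System.out.println(getUniqueFileName("Im0", "jpg"));
    }
}
```

The suffix is left off the returned name because write2file adds it when the file is written.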

Reference source code
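
To keep only the barcode rectangle, one option is to crop the extracted image with the standard java.awt classes before saving it. The sketch below assumes a BufferedImage is already available (in PDFBox 1.x, PDXObjectImage.getRGBImage() should provide one) and that the rectangle is given in pixel coordinates of that image. The class name BarcodeCrop, the sample coordinates, and the synthetic test image are hypothetical.

```java
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

final class BarcodeCrop {
    private BarcodeCrop() {
    }

    /** Returns an independent copy of the given region, clamped to the image bounds. */
    static BufferedImage crop(BufferedImage source, Rectangle region) {
        Rectangle bounds = new Rectangle(0, 0, source.getWidth(), source.getHeight());
        Rectangle clipped = region.intersection(bounds);
        if (clipped.isEmpty()) {
            throw new IllegalArgumentException("Region lies outside the image: " + region);
        }
        // getSubimage shares pixel data with the source, so draw it into a new image.
        BufferedImage view = source.getSubimage(clipped.x, clipped.y, clipped.width, clipped.height);
        // TYPE_INT_RGB has no alpha channel, which keeps the result writable as JPG as well as PNG.
        BufferedImage copy = new BufferedImage(clipped.width, clipped.height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = copy.createGraphics();
        try {
            g.drawImage(view, 0, 0, null);
        } finally {
            g.dispose();
        }
        return copy;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an image extracted from the PDF.
        BufferedImage extracted = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        // Hypothetical barcode location inside that image, in pixels.
        Rectangle barcodeArea = new Rectangle(50, 20, 100, 40);
        BufferedImage barcode = crop(extracted, barcodeArea);

        File out = File.createTempFile("barcode", ".png");
        if (!ImageIO.write(barcode, "png", out)) {
            throw new IOException("No PNG writer available");
        }
        System.out.println("Wrote " + out.getAbsolutePath());
    }
}
```

Passing "jpg" instead of "png" to ImageIO.write should produce a JPEG. This only helps when the barcode is stored as a raster image in the PDF; the rectangle has to be expressed in the pixel space of that image, not in page coordinates.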