Error While Reading Large Excel Files (xlsx) Via Apache POI

jamesT picture jamesT · Oct 23, 2012 · Viewed 24.5k times · Source

I am trying to read large excel files xlsx via Apache POI, say 40-50 MB. I am getting out of memory exception. The current heap memory is 3GB.

I can read smaller excel files without any issues. I need a way to read large excel files and then them back as response via Spring excel view.

public class FetchExcel extends AbstractView {


    @Override
    protected void renderMergedOutputModel(
            Map model, HttpServletRequest request, HttpServletResponse response) 
    throws Exception {

    String fileName = "SomeExcel.xlsx";

    response.setContentType("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");

    OPCPackage pkg = OPCPackage.open("/someDir/SomeExcel.xlsx");

    XSSFWorkbook workbook = new XSSFWorkbook(pkg);

    ServletOutputStream respOut = response.getOutputStream();

    pkg.close();
    workbook.write(respOut);
    respOut.flush();

    workbook = null;                    

    response.setHeader("Content-disposition", "attachment;filename=\"" +fileName+ "\"");


    }    

}

I first started off using XSSFWorkbook workbook = new XSSFWorkbook(FileInputStream in); but that was costly per Apache POI API, so I switched to OPC package way but still the same effect. I don't need to parse or process the file, just read it and return it.

Answer

O.C. picture O.C. · Nov 1, 2012

Here is an example to read a large xls file using sax parser.

public void parseExcel(File file) throws IOException {

        OPCPackage container;
        try {
            container = OPCPackage.open(file.getAbsolutePath());
            ReadOnlySharedStringsTable strings = new ReadOnlySharedStringsTable(container);
            XSSFReader xssfReader = new XSSFReader(container);
            StylesTable styles = xssfReader.getStylesTable();
            XSSFReader.SheetIterator iter = (XSSFReader.SheetIterator) xssfReader.getSheetsData();
            while (iter.hasNext()) {
                InputStream stream = iter.next();

                processSheet(styles, strings, stream);
                stream.close();
            }
        } catch (InvalidFormatException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (OpenXML4JException e) {
            e.printStackTrace();
        }

}

protected void processSheet(StylesTable styles, ReadOnlySharedStringsTable strings, InputStream sheetInputStream) throws IOException, SAXException {

        InputSource sheetSource = new InputSource(sheetInputStream);
        SAXParserFactory saxFactory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = saxFactory.newSAXParser();
            XMLReader sheetParser = saxParser.getXMLReader();
            ContentHandler handler = new XSSFSheetXMLHandler(styles, strings, new SheetContentsHandler() {

            @Override
                public void startRow(int rowNum) {
                }
                @Override
                public void endRow() {
                }
                @Override
                public void cell(String cellReference, String formattedValue) {
                }
                @Override
                public void headerFooter(String text, boolean isHeader, String tagName) {

                }

            }, 
            false//means result instead of formula
            );
            sheetParser.setContentHandler(handler);
            sheetParser.parse(sheetSource);
        } catch (ParserConfigurationException e) {
            throw new RuntimeException("SAX parser appears to be broken - " + e.getMessage());
}