Determine MS Excel file type with Apache POI

Alexey Berezkin · Jan 25, 2013

Is there a way to determine the type of an MS Office Excel file with Apache POI? I need to know which format the file is in: Excel '97(-2007) (.xls) or Excel 2007 OOXML (.xlsx).

I'm hoping for something along these lines:

int type = PoiTypeHelper.getType(file);
switch (type) {
case PoiType.EXCEL_1997_2007:
   ...
   break;
case PoiType.EXCEL_2007:
   ...
   break;
default:
   ...
}

Thanks.

Answer

Gagravarr · Jan 25, 2013

Promoting a comment to an answer...

If you're going to be doing something special with the files, then rjokelai's answer is the way to do it.

However, if you're just going to use the HSSF / XSSF / Common SS usermodel, it's much simpler to let POI do it for you: WorkbookFactory detects the type and opens the file. You'd do something like:

 Workbook wb = WorkbookFactory.create(new File("something.xls"));

or

 Workbook wb = WorkbookFactory.create(request.getInputStream());

Then, if you need to do something special, check with instanceof whether wb is an HSSFWorkbook or an XSSFWorkbook. When opening the file, use a File rather than an InputStream if possible; it's faster and uses less memory.

If you don't know what your file is at all, use Apache Tika to do the detection; it can detect a huge number of different file formats.
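
To illustrate what this kind of detection looks at: an .xls file is an OLE2 compound document and an .xlsx file is a ZIP package, so the two differ in their first few bytes. Below is a minimal stdlib-only sketch of that check. The class ExcelSniffer and its Container enum are hypothetical names made up for this example, not part of POI or Tika, and the check only identifies the container: a Word .doc also starts with the OLE2 signature, and any ZIP archive starts with PK. For real use, WorkbookFactory or Tika is the safer choice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Apache POI or Tika.
final class ExcelSniffer {

    /** Container formats this sketch can tell apart. */
    public enum Container { OLE2, ZIP, UNKNOWN }

    // First 8 bytes of an OLE2 compound document (.xls, but also .doc, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Local file header signature of a ZIP archive (.xlsx, but also .docx, .jar).
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    /** Classifies a file by the leading bytes passed in. */
    public static Container sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Container.OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Container.ZIP;
        }
        return Container.UNKNOWN;
    }

    /** Reads up to 8 bytes from the start of the file and classifies them. */
    public static Container sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[OLE2_MAGIC.length];
            int n = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (n < buf.length) {
                int r = in.read(buf, n, buf.length - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
            return sniff(Arrays.copyOf(buf, n));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Calling ExcelSniffer.sniff(Paths.get("something.xls")) would return OLE2 for a genuine .xls file regardless of what the extension says, which is the main reason to look at content rather than file names.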