I am reading a delimiter-separated text file. Here is an example of its content:
Avc def efg jksjd
1 2 3 5
3 4 6 0
I read it line by line and hold it in memory in a HashMap, with the line number as an Integer key and each line of the text file as a List of tokens.
So my map would store information like this:

Integer    List
1          [Avc, def, efg, jksjd]
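To give the full picture, the population code looks roughly like this (a minimal sketch; the file name input.txt and the whitespace delimiter are illustrative, since my actual delimiters vary):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

Map<Integer, List<Object>> excelDataHolder = new LinkedHashMap<Integer, List<Object>>();
try (BufferedReader reader = new BufferedReader(new FileReader("input.txt"))) {
    String line;
    int lineNumber = 1;
    while ((line = reader.readLine()) != null) {
        // split on the delimiter and store the tokens against the line number
        excelDataHolder.put(lineNumber++, Arrays.asList((Object[]) line.split("\\s+")));
    }
} catch (IOException e) {
    e.printStackTrace();
}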
I am using Apache POI to write to Excel, following this approach; here is my code snippet:
import java.util.List;

import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

// excelDataHolder is the Map<Integer, List<Object>> populated above
HSSFWorkbook workbook = new HSSFWorkbook();
HSSFSheet sheet = workbook.createSheet("Sample sheet");
int rownum = 0;
for (Integer key : excelDataHolder.keySet()) {
    Row row = sheet.createRow(rownum++);
    List<Object> objList = excelDataHolder.get(key);
    int cellnum = 0;
    for (Object obj : objList) {
        Cell cell = row.createCell(cellnum++);
        cell.setCellValue(obj.toString()); // the tokens are text, not Date values
    }
}
This works quite well when the number of lines/records to be written to Excel is small. But imagine the records number in the billions, or the text file has 100,000 or more lines; I think my approach fails, because createRow and createCell would create more than 100,000 objects on the heap. Whatever Java-to-Excel API I use, writing seems to be based on the same approach, i.e., iterating over the collection as shown above. I tried some examples with Aspose as well, and as far as I can tell Aspose has the same problem.
A recent version of Apache POI has SXSSF. Shameless copy from the website:
SXSSF (package: org.apache.poi.xssf.streaming) is an API-compatible streaming extension of XSSF to be used when very large spreadsheets have to be produced, and heap space is limited. SXSSF achieves its low memory footprint by limiting access to the rows that are within a sliding window, while XSSF gives access to all rows in the document. Older rows that are no longer in the window become inaccessible, as they are written to the disk.
I have used it to create a spreadsheet with 1.5 million rows.
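As a rough illustration, here is how the write loop from the question could be moved onto SXSSF; the window size of 100, the output file name, and the class name are my own choices, not requirements:

import java.io.FileOutputStream;
import java.util.List;
import java.util.Map;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

public class StreamingWriter { // illustrative helper class

    public static void write(Map<Integer, List<Object>> excelDataHolder) throws Exception {
        // keep only the last 100 rows in memory; earlier rows are flushed to disk
        SXSSFWorkbook workbook = new SXSSFWorkbook(100);
        try {
            Sheet sheet = workbook.createSheet("Sample sheet");
            int rownum = 0;
            for (Integer key : excelDataHolder.keySet()) {
                Row row = sheet.createRow(rownum++);
                int cellnum = 0;
                for (Object obj : excelDataHolder.get(key)) {
                    Cell cell = row.createCell(cellnum++);
                    cell.setCellValue(obj.toString());
                }
            }
            try (FileOutputStream out = new FileOutputStream("output.xlsx")) {
                workbook.write(out);
            }
        } finally {
            workbook.dispose(); // delete the temporary files backing the flushed rows
        }
    }
}

Note that SXSSF streams on top of XSSF, so it produces an .xlsx file rather than the .xls that HSSFWorkbook writes, and dispose() should be called afterwards to clean up the temporary files holding the flushed rows.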