Hi, I am creating an app that can read files such as PDF, DOC, DOCX, XLS and PPT and display them to the user. I have read that if a document contains images and a table, Apache POI can't help because it can't create borders for the table. Going with Aspose is not a problem, but I should have a strong reason to use Aspose instead of Apache POI, which is open source.
Can anyone suggest which one I should go with? And what are the limitations of Apache POI and Aspose?
We have evaluated both tools and written up a review. It is mainly about Aspose.Words, because that works better for our needs, but it also covers Apache POI. I'm pasting the review here for your reference.
We are a company that develops an online word processor. One big challenge is converting Microsoft Word DOC, DOCX and RTF content to and from our proprietary data model. Due to the limitations of the thin client and the complex nature of Microsoft Word documents, we must handle the conversion on the server side.
Our server-side technology is Java/Spring/Hibernate. We realized that there aren’t many options in the Java space that deal with DOC(X) processing, and we only look for proven and mature products. We evaluated Apache POI, which is open source. One main problem we found with Apache POI is that there are many seemingly independent components under the hood, and we must use two different components to handle DOC and DOCX (HWPF for DOC, XWPF for DOCX). The POI component that handles DOCX is fairly new and doesn’t have many features yet. As far as RTF is concerned, Apache POI simply doesn’t support it.
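To make the "two components" point concrete, here is a minimal sketch of the kind of format dispatch a POI-based backend ends up writing. It only sniffs the leading bytes of a file and reports which route would apply. `WordFormatSniffer`, `WordFormat`, `detect` and `handlerFor` are hypothetical names made up for this illustration and are not part of Apache POI or Aspose.Words. It also assumes the input is already known to be a Word-family file, because the same signatures are shared by other Office formats.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of Apache POI or Aspose.Words. */
final class WordFormatSniffer {

    enum WordFormat { DOC, DOCX, RTF, UNKNOWN }

    // OLE2 compound file signature. Binary .doc uses it, but so do .xls and .ppt.
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature. .docx is a ZIP package, as are .xlsx and .pptx.
    private static final byte[] ZIP = { 0x50, 0x4B, 0x03, 0x04 };
    // RTF files start with the ASCII text "{\rtf".
    private static final byte[] RTF = "{\\rtf".getBytes(StandardCharsets.US_ASCII);

    /** Guesses the format from the first bytes of the file. */
    static WordFormat detect(byte[] header) {
        if (startsWith(header, OLE2)) return WordFormat.DOC;
        if (startsWith(header, ZIP)) return WordFormat.DOCX;
        if (startsWith(header, RTF)) return WordFormat.RTF;
        return WordFormat.UNKNOWN;
    }

    /** Names the POI component that would have to handle each format. */
    static String handlerFor(WordFormat format) {
        switch (format) {
            case DOC:  return "POI HWPF";
            case DOCX: return "POI XWPF";
            case RTF:  return "no POI component";
            default:   return "unsupported";
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) return false;
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

In a real backend each branch would lead into a separate code path with its own object model, which is the duplication described above.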
Knowing that Apache POI isn’t a good choice for our application, we checked out Aspose.Words for Java. In fact, it’s the only commercial product in this space, as far as our search goes. The evaluation was very smooth. We easily created a Maven artifact for the Aspose library and integrated it into our backend web application. Based on our experience, we believe Aspose.Words for Java is the top product in this space and is far superior to the other solutions we looked at. Due to space limitations, we can only share the two features that are most valuable to us from a technology perspective.
First, Aspose.Words uses a consistent, intuitive and well-documented DOM as its underlying document structure. This DOM is straightforward and easy to understand, and it turns out to be quite expressive and powerful. It is different from OOXML’s DOM, and we like Aspose’s a lot better. It reminds us of the difference between JDOM and the W3C model for XML, where JDOM’s model is much simpler and more intuitive, yet powerful enough for most manipulations a business application ever needs. To our surprise, one single DOM is used across all formats supported by Aspose.Words, including but not limited to DOC, DOCX and RTF. This design greatly lowers the level of effort on our side, because we only need to develop one code base to handle all three formats our application currently needs, as well as other formats (such as PostScript) that may be needed in the future. We found this architecture to be the key technical strength of Aspose.Words, in addition to its rich features and APIs.
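The benefit of a single model can be sketched in a few lines of plain Java. This is a toy illustration of the design idea only, not Aspose's actual DOM or API: `SingleModelSketch`, `Node`, `loadLines`, `loadPiped` and `plainText` are hypothetical names, and the two "formats" are made-up stand-ins (one paragraph per line, and paragraphs separated by `|`). The point is that `plainText` is written once against the shared tree and never needs to know which loader produced it.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of "one in-memory model, many formats"; not Aspose's API. */
final class SingleModelSketch {

    /** The single node type shared by every loader: a kind, optional text, children. */
    static final class Node {
        final String kind;
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Toy loader for made-up format A: one paragraph per line. */
    static Node loadLines(String raw) {
        return build(raw.split("\n"));
    }

    /** Toy loader for made-up format B: paragraphs separated by '|'. */
    static Node loadPiped(String raw) {
        return build(raw.split("\\|"));
    }

    // Both loaders produce the same tree shape: document > paragraph > run.
    private static Node build(String[] paragraphs) {
        Node doc = new Node("document", null);
        for (String p : paragraphs) {
            doc.add(new Node("paragraph", null).add(new Node("run", p.trim())));
        }
        return doc;
    }

    /** Format-independent application code: extracts text, one line per paragraph. */
    static String plainText(Node root) {
        StringBuilder sb = new StringBuilder();
        collect(root, sb);
        return sb.toString();
    }

    private static void collect(Node node, StringBuilder sb) {
        if ("run".equals(node.kind)) sb.append(node.text);
        for (Node child : node.children) collect(child, sb);
        if ("paragraph".equals(node.kind)) sb.append('\n');
    }
}
```

Supporting another format in this sketch means adding one more loader; everything written against `Node` stays untouched, which is the effort saving described above.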
Second, Aspose.Words is able to preserve all OLE components of the original Word document in an open/close round trip. That is, we have Aspose.Words load an existing Word document into its DOM in memory and immediately export the DOM back to a Word document. Compared to the original, the copy that Aspose.Words generates is lossless. This feature is crucial to our application, and as far as we know no other product, commercial or open source, claims to provide it.
We would like to share two screenshots to conclude this review. One screenshot (http://s26.postimg.org/lfc1skz8n/screenshot_rtf.jpg) is a complex table generated by Aspose.Words for us. The other (http://s26.postimg.org/5v4o21p47/screenshot_converted.jpg) shows some content, converted from a Word document by Aspose.Words, displayed in our online editor.