The Apache Tika™ toolkit detects and extracts metadata and structured text content from various documents using existing parser libraries.
I am trying to create a simple Java program which reads and extracts the content from the file(s) inside …
java zip extract apache-tika

Can anyone point me to a tutorial? My main experience with Solr is indexing CSV files. But I cannot find …
solr full-text-search solrj apache-tika solr-cell

I'm running Solr 1.4 on Ubuntu 10.04 (installed via apt-get solr-tomcat) and it seems to be working fine. I'm having some difficulty …
solr full-text-search apache-tika solr-cell

I am uploading files to an Amazon S3 bucket and have access to the InputStream and a String containing the …
java amazon-s3 apache-tika

I'm using Apache Tika, and I have files (without extension) of particular content type that need to be renamed to …
java content-type apache-tika

I need to get the iana.org MediaType rather than application/zip or application/x-tika-msoffice for documents like odt, ppt, …
java mime-types detection apache-tika

For this link http://bits.blogs.nytimes.com/2014/09/02/uber-banned-across-germany-by-frankfurt-court/?partner=rss&emc=rss this code doesn't work but …
java url apache-tika

On Tika's website it says (concerning tika-app-1.2.jar) it can be used in server mode. Does anyone know how to …
apache-tika

I downloaded the tika-core and tika-parser libraries, but I could not find example code to parse HTML documents to a string. …
java html apache apache-tika

I tried converting .doc to HTML by using WordToHtmlConverter and it worked perfectly. But when I tried to convert .docx …
java apache-tika