Apache Tika core

Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.

'org.apache.tika:tika-core:0.4'
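
The coordinate string above reads as group:artifact:version. As a minimal sketch, assuming a standard Maven `pom.xml` (the enclosing `<dependencies>` element belongs to your own project and is not part of this listing), the same artifact could be declared like this:

```xml
<!-- Sketch only: goes inside the <dependencies> section of your own pom.xml -->
<dependency>
  <groupId>org.apache.tika</groupId>   <!-- group part of the coordinates -->
  <artifactId>tika-core</artifactId>   <!-- artifact part -->
  <version>0.4</version>               <!-- version part -->
</dependency>
```

In Gradle-style build files the quoted single-string form shown above is typically used directly as the dependency notation.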

Dependencies

Test dependencies