aliakbarian

Reputation: 769

How do I integrate the org.apache.tika source into my project?

I downloaded the Apache Tika source and installed Maven. I then built and installed Tika from the command line with mvn install:

[INFO] Reactor Summary:
[INFO] ------------------------------------------------------------------------
[INFO] Apache Tika parent .................................... SUCCESS [4:20.656s]
[INFO] Apache Tika core ...................................... SUCCESS [2:26.466s]
[INFO] Apache Tika parsers ................................... SUCCESS [3:27.711s]
[INFO] Apache Tika application ............................... SUCCESS [1:23.548s]
[INFO] Apache Tika OSGi bundle ............................... SUCCESS [3:34.223s]
[INFO] Apache Tika ........................................... SUCCESS [6.217s]
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL

But now I don't know what the next step is. How do I actually add Tika to my project?

I used the Tika source instead of the jar file because I wanted to add a farsi.ngp file to Tika's language identification. I have added farsi.ngp and built Tika with Maven. What exactly must I add to my project?

Upvotes: 1

Views: 2471

Answers (1)

Tejas Patil

Reputation: 6169

I read this page, and the following is my suggestion:

After you have modified the code or added the .ngp file and built the code, you should get these build artifacts:

 tika-core/target/tika-core-1.0.jar
 tika-parsers/target/tika-parsers-1.0.jar

Wherever you want to use Tika in your application, add the two Tika jars and the jars they depend on to the classpath. For example, if your application is built with Ant, add this to the build file:

<classpath>
  ... <!-- your other classpath entries -->
  <pathelement location="path/to/tika-core-1.0.jar"/>
  <pathelement location="path/to/tika-parsers-1.0.jar"/>
  <pathelement location="path/to/commons-logging-1.1.1.jar"/>
  <pathelement location="path/to/commons-compress-1.0.jar"/>
  <pathelement location="path/to/pdfbox-1.0.0-incubating.jar"/>
  <pathelement location="path/to/fontbox-1.0.0-incubator.jar"/>
  <pathelement location="path/to/jempbox-1.0.0-incubator.jar"/>
  <pathelement location="path/to/poi-3.6.jar"/>
  <pathelement location="path/to/poi-scratchpad-3.6.jar"/>
  <pathelement location="path/to/poi-ooxml-3.6.jar"/>
  <pathelement location="path/to/poi-ooxml-schemas-3.6.jar"/>
  <pathelement location="path/to/xmlbeans-2.3.0.jar"/>
  <pathelement location="path/to/dom4j-1.6.1.jar"/>
  <pathelement location="path/to/xml-apis-1.0.b2.jar"/>
  <pathelement location="path/to/geronimo-stax-api_1.0_spec-1.0.jar"/>
  <pathelement location="path/to/tagsoup-1.2.jar"/>
  <pathelement location="path/to/asm-3.1.jar"/>
  <pathelement location="path/to/log4j-1.2.14.jar"/>
  <pathelement location="path/to/metadata-extractor-2.4.0-beta-1.jar"/>
</classpath>
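
If your own project is built with Maven rather than Ant, a dependency entry may be simpler, because mvn install has already copied your modified build into your local Maven repository. The following is only a sketch: the version 1.0 is assumed from the jar names above and should match whatever version your Tika checkout declares. The tika-parsers artifact pulls in tika-core and the parser libraries transitively, so they do not need to be listed one by one.

```xml
<dependencies>
  <!-- Resolved from the local repository that "mvn install" populated.
       The version is an assumption taken from the jar names above. -->
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <version>1.0</version>
  </dependency>
</dependencies>
```

This block goes inside the project element of your application's pom.xml.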

Hope this helps you.

Upvotes: 2
