Reputation: 1115
I am using tika to extract text from PDFs in Python, but it downloads the server .jar on every run, which is time-consuming.
[MainThread ] [INFO ] Retrieving http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server/1.19/tika-server-1.19.jar to /tmp/tika-server.jar.
This happens every time I run the code. Is there a way to download it manually once and stop tika from doing it every time?
Upvotes: 1
Views: 1584
Reputation: 56
I know it's been a while and you have probably figured something out already, but for others like me who are still looking for a solution, I would like to suggest another topic in which the person asking the question presents his own working approach.
Moreover, I noticed that tika needs internet access only on the very first run, so if you manage to deny it internet access after setting everything up, it won't waste time downloading new files.
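For anyone who wants to do the one-time setup by hand, here is a rough sketch of how it could look. It relies on the `TIKA_SERVER_JAR` and `TIKA_PATH` environment variables described in the tika-python README. The helper names `configure_local_tika` and `extract_text` and the `/opt/tika` directory are my own placeholders, not part of the library. As far as I can tell the variables are read when `tika` is imported, so they have to be set first, and the library may also expect a matching `tika-server.jar.md5` file next to the jar.

```python
import os
from pathlib import Path


def configure_local_tika(jar_path, env=os.environ):
    """Point tika-python at a jar that is already on disk.

    Hypothetical helper: it only sets environment variables and
    does not download or verify anything itself.
    """
    jar = Path(jar_path).expanduser().resolve()
    # TIKA_SERVER_JAR accepts a URL, so a file:// URI avoids the remote fetch.
    env["TIKA_SERVER_JAR"] = jar.as_uri()
    # TIKA_PATH is the directory where tika keeps its server jar and logs.
    env["TIKA_PATH"] = str(jar.parent)
    return env["TIKA_SERVER_JAR"]


def extract_text(pdf_path):
    """Parse a PDF and return its text content."""
    # Import only after the environment has been configured.
    from tika import parser

    parsed = parser.from_file(pdf_path)
    return parsed.get("content")


if __name__ == "__main__":
    # Assumed location of the manually downloaded jar; adjust to your setup.
    configure_local_tika("/opt/tika/tika-server-1.19.jar")
    # print(extract_text("example.pdf"))
```

Keeping the jar outside `/tmp` also matters, since many systems clear that directory on reboot, which would trigger a fresh download.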
Upvotes: 4