Madhusudan

Reputation: 465

Can we index and search .txt files in the Solr search engine?

I am using the Solr search engine for document retrieval in my project. My dataset is in .txt format, but Solr only seems to offer options for JSON, XML, PDF and some other file formats. There is no option for text files.
Do I need to modify Solr in order to use .txt files as my dataset?

Upvotes: 1

Views: 1410

Answers (5)

Nate

Reputation: 2193

I found a very useful line in the quickstart guide https://lucene.apache.org/solr/5_3_1/quickstart.html

java -classpath /solr-5.0.0/dist/solr-core-5.0.0.jar -Dauto=yes -Dc=gettingstarted -Ddata=files -Drecursive=yes org.apache.solr.util.SimplePostTool docs/

The part that is especially useful for me is -Dauto=yes. When this option is turned on, Solr can handle many types of files (don't ask me why). The tool prints:

Entering auto mode. File endings considered are xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log

All I know is that I turned that option on, and now my instance will accept pdf, xml and txt files.
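
As far as I can tell, auto mode guesses a content type from the file extension, sends xml, json and csv to the regular update handler, and sends everything else (including txt and log) to the extracting handler at /update/extract. The Python below is only a rough sketch of that routing idea. The function name `route` and the mapping table are my own simplification covering a few of the extensions, not the tool's actual code.

```python
import os

# Simplified, partial mapping from file extension to content type.
# Assumption: this mirrors the idea behind auto mode, not its exact table.
CONTENT_TYPES = {
    "xml": "application/xml",
    "json": "application/json",
    "csv": "text/csv",
    "pdf": "application/pdf",
    "html": "text/html",
    "txt": "text/plain",
    "log": "text/plain",
}

# Content types that the plain update handler can parse directly.
STRUCTURED = {"application/xml", "application/json", "text/csv"}


def route(filename, base="/solr/gettingstarted"):
    """Return (content_type, handler_path), or None for an unknown extension."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    ctype = CONTENT_TYPES.get(ext)
    if ctype is None:
        return None  # unknown endings are skipped in this sketch
    handler = "/update" if ctype in STRUCTURED else "/update/extract"
    return ctype, base + handler


for name in ["docs/books.json", "docs/notes.txt"]:
    print(name, "->", route(name))
```

So in this picture a .txt file is not parsed as structured data at all; it is handed to the extraction handler as plain text.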

Upvotes: 0

Marty

Reputation: 73

You can use the CSV request handler to take care of this. See https://wiki.apache.org/solr/UpdateCSV for the details; there you can configure the delimiter and escape characters. For example, if you have a "|"-delimited file, you can specify "&separator=|".

The following indexes a tab-delimited text file:

curl 'http://localhost:8983/solr/update/csv?commit=true&separator=%09&escape=\&stream.file=/tmp/result.txt'
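
If you build that URL from a script, it is easy to get the percent-encoding of the tab and backslash wrong. Here is a minimal Python sketch that assembles the same query string with the standard library. The helper name `csv_update_url` is made up for illustration, and it assumes the same host, handler path and file path as the curl command. Note that stream.file only works when remote streaming is enabled on the Solr side.

```python
from urllib.parse import urlencode


def csv_update_url(path, separator="\t", escape="\\",
                   base="http://localhost:8983/solr/update/csv"):
    """Build a CSV update URL; urlencode percent-encodes each value."""
    params = {
        "commit": "true",
        "separator": separator,   # tab is encoded as %09
        "escape": escape,         # backslash is encoded as %5C
        "stream.file": path,      # file path as seen by the Solr server
    }
    return base + "?" + urlencode(params)


print(csv_update_url("/tmp/result.txt"))
print(csv_update_url("/tmp/result.txt", separator="|"))
```

The second call shows the "|"-delimited case; only the separator value changes.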

Upvotes: 0

Jayesh Chandrapal

Reputation: 684

Apart from txt files, Solr can also index several other document formats. Take a look at Apache Tika for details.

Upvotes: 0

javacreed

Reputation: 968

Most probably your .txt files contain space-separated documents. To index them, you can write a Python script that streams your documents to Solr and then performs a commit.
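
A minimal sketch of such a script, using only the standard library. It assumes a core named "gettingstarted" and a text field called "content_txt"; both names are placeholders, so adjust them to your own core and schema. Each .txt file becomes one document whose id is the file path.

```python
import json
import pathlib
import urllib.request

# Assumptions: core name and field name are placeholders, not Solr defaults
# you can rely on. commit=true asks Solr to commit after the update.
SOLR_UPDATE = "http://localhost:8983/solr/gettingstarted/update?commit=true"


def build_docs(folder):
    """Turn every .txt file under folder into one Solr document."""
    docs = []
    for path in sorted(pathlib.Path(folder).rglob("*.txt")):
        docs.append({
            "id": str(path),
            "content_txt": path.read_text(encoding="utf-8", errors="replace"),
        })
    return docs


def post_docs(docs, url=SOLR_UPDATE):
    """POST the documents to Solr as a JSON array and return the HTTP status."""
    req = urllib.request.Request(
        url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

With Solr running, calling post_docs(build_docs("docs")) would send everything under docs/ in one request. For a large dataset you would probably want to post in batches rather than build one big list.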

Upvotes: 0

Mysterion

Reputation: 9320

All you need to do is index your txt file.

For more info and concrete examples, take a look here: http://www.slideshare.net/LucidImagination/indexing-text-and-html-files-with-solr-4063407

Upvotes: 0
