add-semi-colons

Reputation: 18830

Building a Solr index from a large text file

I have a large text file in the following format:

00001,234234|234|235|7345
00005,788|298|234|735

You can treat the value before the comma as a key. What I want is a quick-and-dirty way to query these keys and find the result set for each one. After reading a bit, I found that Solr provides a good framework for this.

Upvotes: 0

Views: 659

Answers (2)

user95338

Reputation: 76

You can index your data directly in Solr using the UpdateCSV handler. Specify the destination field names in the fieldnames parameter of your curl call (or add them as the first line of your file if that is easier). No custom code is needed.

Remember to check that the destination field for the |-separated values splits into tokens on that character.

See https://wiki.apache.org/solr/UpdateCSV for details.
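
If you would rather script the call than type it into curl, here is a minimal sketch of the request in Python using only the standard library. The base URL, the core name (mycore), and the field names (id, values) are assumptions for illustration, not taken from the question. The header, fieldnames, f.<field>.split and f.<field>.separator parameters are the ones described on the UpdateCSV page; check them against your Solr version.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_csv_update_url(base_url, core, key_field="id", values_field="values"):
    """Build a CSV update URL for lines like 00001,234234|234|235|7345.

    base_url, core and both field names are placeholders (assumptions).
    """
    params = {
        "commit": "true",
        "header": "false",  # the file has no header row
        "fieldnames": f"{key_field},{values_field}",
        # Split the second column on "|" so it is indexed as multiple values.
        f"f.{values_field}.split": "true",
        f"f.{values_field}.separator": "|",
    }
    return f"{base_url}/{core}/update/csv?{urlencode(params)}"


def post_csv(url, path):
    """POST the file to Solr. Reads the whole file into memory, so for a
    very large file curl --data-binary may be the simpler choice."""
    with open(path, "rb") as fh:
        body = fh.read()
    req = Request(url, data=body, headers={"Content-Type": "text/csv; charset=utf-8"})
    with urlopen(req) as resp:
        return resp.status
```

Something like post_csv(build_csv_update_url("http://localhost:8983/solr", "mycore"), "data.txt") would then send the file, assuming the values field is declared as multiValued in the schema.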

Upvotes: 1

ajaanbaahu

Reputation: 344

You can definitely do that using pysolr, which is a Python library. If the data is in key-value form, you can read it in Python as shown here: https://pypi.python.org/pypi/pysolr/3.1.0

To have more control over search, you need to modify the schema.xml file so that it defines fields for the keys you have in your text file.

Once the data is ingested into Solr, you can follow the link above to perform searches.
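
A rough sketch of what that could look like for the file format in the question. The field names (id, values), the batch size, and the Solr URL you pass in are assumptions; the values field would need to be multiValued in the schema. Only the parsing part is plain Python; index_file needs pysolr installed and a running Solr.

```python
def parse_line(line, key_field="id", values_field="values"):
    """Turn '00001,234234|234|235|7345' into a Solr document dict.

    The field names are hypothetical; use whatever your schema defines.
    """
    key, _, rest = line.strip().partition(",")
    return {key_field: key, values_field: [v for v in rest.split("|") if v]}


def iter_docs(path):
    """Yield one document per non-empty line, without loading the whole file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield parse_line(line)


def index_file(path, solr_url, batch_size=1000):
    """Send the documents to Solr in batches and commit once at the end."""
    import pysolr  # third-party: pip install pysolr

    solr = pysolr.Solr(solr_url, timeout=30)
    batch = []
    for doc in iter_docs(path):
        batch.append(doc)
        if len(batch) >= batch_size:
            solr.add(batch, commit=False)
            batch = []
    if batch:
        solr.add(batch, commit=False)
    solr.commit()
    return solr
```

After indexing, a lookup by key would be along the lines of solr.search("id:00001"), which returns an iterable of matching documents.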

Upvotes: 1
