Reputation: 791
How can I index a Solr server with the content of a web service?
My web service output looks like this:
Now I want to index the Solr server with the content under xml, as shown above.
How can I index this into Apache Solr?
Upvotes: 1
Views: 1399
Reputation: 23098
Write a script in your favorite scripting language (Python for me). I did something similar with databases, and I hope a similar solution will work for you.
With Python:
Then run this script periodically, for example as a cron job.
You will need two pieces of code: one to query your RESTful service and acquire the body of the response; the other to upload a formatted document to Solr.
This piece of code uploads a Python object request_obj to the given request_url and returns Solr's response as a Python object. A native Python object (composed of dictionaries (associative arrays), lists, strings, and numbers) translates to JSON easily (with one or two caveats).
Use this only as a reference. I guarantee no suitability for your purpose.
Don't forget to use /update/json?wt=python, which is available from Solr 3.3 onwards. You need the MultipartPostHandler library.
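    # Python 2 code; MultipartPostHandler is a third-party module.
    import ast
    import json
    import urllib2
    import MultipartPostHandler

    def solr_interface(self, request_url, request_obj):
        request = json.dumps(request_obj, indent=4, encoding="cp1252")
        opener = urllib2.build_opener(MultipartPostHandler.MultipartPostHandler)
        urllib2.install_opener(opener)
        req = urllib2.Request(request_url, request)
        req.add_header("Content-Type", "application/json")
        text_response = urllib2.urlopen(req).read().strip()
        # With wt=python, Solr replies with a Python literal; parse it safely.
        return ast.literal_eval(text_response)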
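The other piece, querying the web service and getting the body of the response, is not shown above. A minimal sketch of it could look like the following. Note the assumptions: it is written for Python 3 (urllib.request rather than the urllib2 used above), the function name fetch_service_body is made up for illustration, and service_url stands in for whatever your RESTful endpoint actually is.

```python
# Python 3 sketch; the upload code above targets Python 2 (urllib2).
import urllib.request


def fetch_service_body(service_url, timeout=30):
    """Return the raw response body of the web service as bytes.

    fetch_service_body is a hypothetical helper name, and service_url is a
    placeholder for your own endpoint.
    """
    with urllib.request.urlopen(service_url, timeout=timeout) as response:
        return response.read()
```

The returned bytes can be passed straight to an XML parser, so there is no need to decode them first.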
As for parsing (and composing) XML in Python, use these excellent tutorials http://www.learningpython.com/2008/05/07/elegant-xml-parsing-using-the-elementtree-module/ and http://effbot.org/zone/element.htm
This is a commandline sample.
    >>> from xml.etree import ElementTree as ET
    >>> elem = ET.fromstring("<doc><p>This is a block</p><p>This is another block</p></doc>")
    >>> for subelement in elem:
    ...     print subelement.text
    ...
    This is a block
    This is another block
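To connect the parsing step to the upload step, the parsed elements have to be turned into plain Python dictionaries that can be serialized to JSON. Here is a rough sketch of that, written for Python 3. The helper name xml_to_solr_docs and the field names "id" and "text" are assumptions for illustration only; the real field names must come from your Solr schema, and the exact payload shape your Solr version expects is worth checking against its documentation.

```python
import json
from xml.etree import ElementTree as ET


def xml_to_solr_docs(xml_text, id_prefix="block"):
    """Turn each child element of the XML root into one dictionary.

    The "id" and "text" field names are hypothetical; replace them with
    fields defined in your own Solr schema.
    """
    root = ET.fromstring(xml_text)
    docs = []
    for index, child in enumerate(root):
        docs.append({"id": "%s-%d" % (id_prefix, index), "text": child.text})
    return docs


sample = "<doc><p>This is a block</p><p>This is another block</p></doc>"
docs = xml_to_solr_docs(sample)
# The list contains only dicts and strings, so it serializes to JSON directly.
print(json.dumps(docs, indent=2))
```

A list like docs is the kind of native Python object that could be handed to the upload function as request_obj.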
Upvotes: 1
Reputation: 22555
You will need to loosely follow the steps below to index your data.
Upvotes: 1