Reputation: 705
I have to write some logic in Java which should retrieve all the index data from Solr.
At the moment I am doing it like this:
import java.io.*;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

String confSolrUrl = "http://localhost/solr/master/select?q=*%3A*&wt=json&indent=true";
LOG.info(confSolrUrl);
URL url = new URL(confSolrUrl);
URLConnection conn = url.openConnection();
// save to this filename
String fileName = "/qwertyuiop.html";
File file = new File(fileName);
// try-with-resources closes both streams even if an exception is thrown;
// FileWriter creates the file if it does not exist yet
try (BufferedReader br = new BufferedReader(
         new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8));
     BufferedWriter bw = new BufferedWriter(new FileWriter(file.getAbsoluteFile()))) {
    String inputLine;
    while ((inputLine = br.readLine()) != null) {
        bw.write(inputLine);
        bw.newLine(); // readLine() strips the line terminator, so add it back
    }
}
System.out.println("Done");
In my file I get the whole HTML file, which I can then parse to extract my JSON.
Is there a better way to do this than fetching the resource from the URL and parsing it?
Upvotes: 1
Views: 3827
Reputation: 30037
I just wrote an application to do this; take a look at it on GitHub: https://github.com/freedev/solr-import-export-json
If you want to read all the data from a Solr collection, the first problem you face is pagination; in this case we are talking about deep paging.
A direct HTTP request like yours returns only a relatively small number of documents (the rows parameter defaults to 10), while a Solr collection can hold millions or even billions of documents. So you should use the proper API, i.e. SolrJ.
That is what I did in my project.
I would also suggest reading this: https://lucidworks.com/blog/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
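To make the cursor idea from that post more concrete, here is a minimal sketch of the paging loop. It is not the code from my project. The class SolrCursorPaging, its Page type and both methods are hypothetical names made up for illustration, and documents are represented as plain strings to keep it short. The parts that come from Solr itself are the cursorMark request parameter, its start value "*", the nextCursorMark value in each response, and the rule that the sort must include the uniqueKey field (assumed here to be called id).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of Solr or SolrJ: it only illustrates the cursor loop.
class SolrCursorPaging {

    /** One page of documents plus the "nextCursorMark" Solr returned with it. */
    static final class Page {
        final List<String> docs;
        final String nextCursorMark;

        Page(List<String> docs, String nextCursorMark) {
            this.docs = docs;
            this.nextCursorMark = nextCursorMark;
        }
    }

    /** Builds the /select URL for one cursor page, sorted on the uniqueKey field. */
    static String buildUrl(String selectUrl, int rows, String uniqueKeyField, String cursorMark) {
        try {
            return selectUrl
                    + "?q=" + URLEncoder.encode("*:*", "UTF-8")
                    + "&wt=json"
                    + "&rows=" + rows
                    + "&sort=" + URLEncoder.encode(uniqueKeyField + " asc", "UTF-8")
                    + "&cursorMark=" + URLEncoder.encode(cursorMark, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every JVM
            throw new IllegalStateException(e);
        }
    }

    /**
     * Follows the cursor from "*" until Solr hands back the same cursor it was given,
     * which is how Solr signals that there are no more documents.
     * The fetcher maps a cursorMark to the page Solr returns for it.
     */
    static List<String> fetchAll(Function<String, Page> fetcher) {
        List<String> all = new ArrayList<>();
        String cursorMark = "*"; // start value defined by Solr
        while (true) {
            Page page = fetcher.apply(cursorMark);
            all.addAll(page.docs);
            if (cursorMark.equals(page.nextCursorMark)) {
                break; // cursor did not advance: everything has been read
            }
            cursorMark = page.nextCursorMark;
        }
        return all;
    }
}
```

With SolrJ the fetcher would typically run a SolrQuery with CursorMarkParams.CURSOR_MARK_PARAM set to the current cursor and read the next one from QueryResponse.getNextCursorMark(). With plain HTTP it would request the URL from buildUrl and parse nextCursorMark out of the JSON. Either way each request stays cheap, because Solr does not have to skip over all the earlier documents the way it does with a large start offset.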
Upvotes: 3