OEH

Reputation: 705

How to get all the data from Solr

I have to write some logic in Java which should retrieve all the index data from Solr.

As of now I am doing it like this:

        // needs: java.io.*  and  java.net.URL / java.net.URLConnection
        // Query everything (q=*:*) and ask Solr for the JSON response writer
        String confSolrUrl = "http://localhost/solr/master/select?q=*%3A*&wt=json&indent=true";
        LOG.info(confSolrUrl);
        URL url = new URL(confSolrUrl);
        URLConnection conn = url.openConnection();

        BufferedReader br = new BufferedReader(new InputStreamReader(conn.getInputStream()));

        String inputLine;

        // save to this filename
        String fileName = "/qwertyuiop.html";
        File file = new File(fileName);

        if (!file.exists())
        {
            file.createNewFile();
        }

        FileWriter fw = new FileWriter(file.getAbsoluteFile());
        BufferedWriter bw = new BufferedWriter(fw);

        // copy the response to the file line by line
        while ((inputLine = br.readLine()) != null) {
            bw.write(inputLine);
            bw.newLine();
        }

        bw.close();
        br.close();

        System.out.println("Done");

In the file I get the whole response, which I can then parse to extract my JSON.

Is there a better way to do this than fetching the resource from the URL and parsing it?

Upvotes: 1

Views: 3827

Answers (1)

freedev

Reputation: 30037

I just wrote an application to do exactly this; take a look at GitHub: https://github.com/freedev/solr-import-export-json

If you want to read all the data from a Solr collection, the first problem you face is pagination; in this case we are talking about deep paging.

A direct HTTP request like yours returns a relatively small number of documents, but a Solr collection can hold millions or even billions of them. So you should use the proper API, i.e. SolrJ.

That is exactly what I did in my project.

I would also suggest reading this: https://lucidworks.com/blog/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
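
To give an idea of what that looks like in code, here is a minimal SolrJ sketch that walks an entire collection with a cursor. The base URL, the core name master, the page size, and the assumption that the uniqueKey field is called id come from the question or are placeholders, so adapt them to your setup (written against the SolrJ 6.x API):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.params.CursorMarkParams;

    public class SolrFullExport {
        public static void main(String[] args) throws Exception {
            // base URL of the core from the question (adjust host/port/name)
            HttpSolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/master").build();

            SolrQuery query = new SolrQuery("*:*");
            query.setRows(500);                               // documents fetched per request
            query.setSort(SolrQuery.SortClause.asc("id"));    // cursors require a sort on the uniqueKey

            String cursorMark = CursorMarkParams.CURSOR_MARK_START;
            boolean done = false;
            while (!done) {
                query.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
                QueryResponse response = client.query(query);

                for (SolrDocument doc : response.getResults()) {
                    System.out.println(doc);                  // export/process each document here
                }

                // when the cursor stops moving, every document has been read
                String nextCursorMark = response.getNextCursorMark();
                done = cursorMark.equals(nextCursorMark);
                cursorMark = nextCursorMark;
            }
            client.close();
        }
    }

Each iteration fetches only the next page, so memory use stays bounded no matter how many documents the collection holds.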

Upvotes: 3
