Ankita Gulati

Reputation: 11

How to fetch all data from Solr (about 40k rows) into a CSV?

import pandas as pd
import pysolr
solrcon = pysolr.Solr('...', timeout=10)
results = solrcon.search('*:*')
docs = pd.DataFrame(results.docs)
docs

But I am only able to fetch 10 rows, or 100 rows at most. How can I fetch all rows? I am using pysolr version 3.8.1.

Upvotes: 1

Views: 1398

Answers (1)

EricLavault

Reputation: 16095

Use the rows parameter:

You can use the rows parameter to paginate results from a query. The parameter specifies the maximum number of documents from the complete result set that Solr should return to the client at one time.

The default value is 10. That is, by default, Solr returns 10 documents at a time in response to a query.

With pysolr you can pass additional options to Solr as keyword arguments. The example below also sets fl (the list of fields to include in the response), because you might need to restrict that list to keep a decent response time:

results = solrcon.search('*:*', **{
    'rows': 100000,
    'fl': 'id, title, score'
})
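
If a single very large rows value is too slow or memory-hungry, another option is to page through the results with start and rows and then write everything to CSV. The following is only a minimal sketch, not something taken from pysolr itself: fetch_all_docs is a hypothetical helper, the page size of 1000 is arbitrary, and the sort on an id field assumes that id is your unique key. It relies only on the docs and hits attributes of the object returned by search.

```python
def fetch_all_docs(search, query="*:*", page_size=1000, **params):
    """Collect every document matching `query` by paging with start/rows.

    `search` is any callable shaped like pysolr's `Solr.search`: it takes the
    query plus keyword parameters and returns an object exposing `.docs`
    (the current page) and `.hits` (the total number of matches).
    """
    docs = []
    start = 0
    while True:
        # Merge caller options (fl, sort, ...) with the paging parameters.
        results = search(query, **{**params, "start": start, "rows": page_size})
        page = list(results.docs)
        docs.extend(page)
        start += len(page)
        # Stop on an empty page or once everything Solr reported is collected.
        if not page or start >= results.hits:
            break
    return docs


# Usage with the connection from the question (assumes an `id` unique key):
# docs = fetch_all_docs(solrcon.search, fl="id,title", sort="id asc")
# pd.DataFrame(docs).to_csv("solr_export.csv", index=False)
```

A fixed sort keeps the pages consistent between requests. For much larger result sets, Solr's cursorMark deep-paging feature is probably a better fit than increasing start.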

Upvotes: 4
