Reputation: 788
I have a Solr index with a schema that looks like this:
{
"responseHeader": {
"status": 0,
"QTime": 0,
"params": {
"q": "*:*",
"q.op": "OR",
"_": "1673422604341"
}
},
"response": {
"numFound": 1206,
"start": 0,
"numFoundExact": true,
"docs": [
{
"material_name_s":"MaterialName1",
"company_name_s": "CompanyName1",
"price_per_lb_value_f": 1.11,
"received_date_dt": "2015-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName1",
"company_name_s": "CompanyName2",
"price_per_lb_value_f": 2.22,
"received_date_dt": "2020-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName1",
"company_name_s": "CompanyName3",
"price_per_lb_value_f": 3.33,
"received_date_dt": "2021-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName2",
"company_name_s": "CompanyName1",
"price_per_lb_value_f": 4.44,
"received_date_dt": "2016-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName2",
"company_name_s": "CompanyName2",
"price_per_lb_value_f": 5.55,
"received_date_dt": "2021-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName2",
"company_name_s": "CompanyName3",
"price_per_lb_value_f": 6.66,
"received_date_dt": "2022-01-01T00:00:00Z"
}
]
}
}
These are historical prices for different materials from different companies.
I would like to get the lowest price_per_lb_value_f for each material_name_s in the last 2 years, so the results would look like this:
{
"response": {
"numFound": 2,
"start": 0,
"numFoundExact": true,
"docs": [
{
"material_name_s":"MaterialName1",
"company_name_s": "CompanyName3",
"price_per_lb_value_f": 3.33,
"received_date_dt": "2021-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName2",
"company_name_s": "CompanyName2",
"price_per_lb_value_f": 5.55,
"received_date_dt": "2021-01-01T00:00:00Z"
}
]
}
}
Is this kind of grouping even possible with Solr? I'm a newbie to Solr, so any help would be appreciated.
Upvotes: 0
Views: 138
Reputation: 546
Grouping is possible in Solr. You can get the result you want with either of the following queries:
http://localhost:8983/solr/test/select?indent=true&q.op=OR&q=received_date_dt:[NOW-3YEAR%20TO%20*]&fq={!collapse%20field=material_name_s%20min=price_per_lb_value_f}
q:received_date_dt:[NOW-3YEAR TO *]
// Range query that keeps only the documents received in the last 3 years. I used 3 years rather than 2 because otherwise the documents received on 2021-01-01 would not be returned.
fq:{!collapse field=material_name_s min=price_per_lb_value_f}
// Collapses all documents with the same value of material_name_s into a single document: the one with the minimum price_per_lb_value_f.
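If you build this request from code rather than pasting it into a browser, the brackets, braces and spaces need to be URL-encoded. Here is a minimal Python sketch of that, assuming the same host and core name as in the URL above; the function name build_collapse_url and the years_back parameter are made up for illustration.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Assumed endpoint, copied from the URL in the answer; adjust host and core name.
SOLR_SELECT = "http://localhost:8983/solr/test/select"


def build_collapse_url(years_back=3, base=SOLR_SELECT):
    """Return a select URL that keeps the cheapest document per material."""
    params = {
        # Date window: only documents received in the last `years_back` years.
        "q": f"received_date_dt:[NOW-{years_back}YEAR TO *]",
        # Collapse to one document per material_name_s: the one with the minimum price.
        "fq": "{!collapse field=material_name_s min=price_per_lb_value_f}",
        "indent": "true",
    }
    # urlencode percent-encodes the special characters and turns spaces into '+'.
    return f"{base}?{urlencode(params)}"


url = build_collapse_url()
print(url)

# Round-trip check: decoding the query string gives back the raw parameters.
decoded = parse_qs(urlsplit(url).query)
print(decoded["fq"][0])
```

The resulting URL can be fetched with any HTTP client; the sketch only constructs it and does not contact Solr.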
http://localhost:8983/solr/test/select?indent=true&q.op=OR&q=received_date_dt:[NOW-3YEAR%20TO%20*]&group=true&group.field=material_name_s&group.sort=price_per_lb_value_f%20asc
q:received_date_dt:[NOW-3YEAR TO *]
// same filter as before
group:true
// enable grouping
group.field:material_name_s
// groups by material_name_s
group.sort:price_per_lb_value_f asc
// sorts the documents within each group by price_per_lb_value_f in ascending order
group.limit
// not specified, because the default value is 1; it sets the number of results returned for each group
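Note that the grouped query does not return a flat docs list like the one in the question: the documents come back nested under grouped, one doclist per group. Here is a minimal Python sketch of flattening that, assuming the default grouped response format; the helper name cheapest_per_group is made up, and the sample dictionary is hand-written from the documents in the question, not real Solr output.

```python
def cheapest_per_group(response, field="material_name_s"):
    """Return the first document of every group in a grouped Solr response."""
    groups = response["grouped"][field]["groups"]
    # With group.sort=price_per_lb_value_f asc and the default group.limit of 1,
    # the first document in each doclist is the lowest-priced one.
    return [g["doclist"]["docs"][0] for g in groups if g["doclist"]["docs"]]


# Hand-written sample in the grouped response shape (illustrative only).
sample = {
    "grouped": {
        "material_name_s": {
            "matches": 4,
            "groups": [
                {
                    "groupValue": "MaterialName1",
                    "doclist": {"numFound": 1, "start": 0, "docs": [
                        {"material_name_s": "MaterialName1",
                         "company_name_s": "CompanyName3",
                         "price_per_lb_value_f": 3.33,
                         "received_date_dt": "2021-01-01T00:00:00Z"}]},
                },
                {
                    "groupValue": "MaterialName2",
                    "doclist": {"numFound": 2, "start": 0, "docs": [
                        {"material_name_s": "MaterialName2",
                         "company_name_s": "CompanyName2",
                         "price_per_lb_value_f": 5.55,
                         "received_date_dt": "2021-01-01T00:00:00Z"}]},
                },
            ],
        }
    }
}

result = cheapest_per_group(sample)
for doc in result:
    print(doc["material_name_s"], doc["company_name_s"], doc["price_per_lb_value_f"])
```

If you would rather have Solr return a flat list directly, adding group.main=true to the request should do that, at the cost of losing the per-group counts.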
Upvotes: 1