Reputation: 8961
With a CouchDB reduce function:
function(keys, values, rereduce) {
  // ...
}
That gets called like this:
reduce( [[key1,id1], [key2,id2], [key3,id3]], [value1,value2,value3], false )
Question 1
What is the reason for passing keys to the reduce function? I have only written relatively simple CouchDB views with reduce functions and would like to know what the use case is for receiving a list of [key1, docid], [key2, docid], etc.
Also, is there ever a time when key1 != key2 != keyX when a reduce function executes?
Question 2
CouchDB's implementation of MapReduce allows for rereduce=true, in which case the reduce function is called like this:
reduce(null, [intermediate1,intermediate2,intermediate3], true)
Where the keys argument is null (unlike when rereduce=false). Why is there no use case for a keys argument in this case, if there is one when rereduce=false?
Upvotes: 4
Views: 2856
Reputation: 79516
What is the use case of the keys argument when rereduce = true?
There isn't one. That's why the keys argument is null in this case.
From the documentation (emphasis added):
Reduce and Rereduce Functions
redfun(keys, values[, rereduce])
Arguments:
keys – Array of pairs of key-docid for related map function results. Always null if rereduce is running (has true value).
values – Array of map function result values.
rereduce – Boolean flag to indicate a rereduce run.
Perhaps what you mean to ask is: Why is the same function used for both reduce and rereduce? I expect there's some history involved, but I can also imagine that it's because the same logic can quite often be used for both, and a single function definition avoids duplication. Suppose a simple sum reduce function:
function(keys, values) {
  return sum(values);
}
Here both keys and rereduce can be ignored entirely. Many other (re)reduce functions follow the same pattern. If two functions had to be used, then this identical function would have to be specified twice.
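To see why one definition suffices for a sum, here is a minimal sketch that runs outside CouchDB. The sum helper below is only a stand-in for the one available inside CouchDB's JavaScript view functions, and the name reduceFn, the keys, the document IDs, and the values are all made up for illustration:

```javascript
// Stand-in for the sum() helper available to CouchDB view functions.
function sum(values) {
  return values.reduce((total, v) => total + v, 0);
}

// The same reduce function as above, named so it can be called directly.
function reduceFn(keys, values, rereduce) {
  return sum(values);
}

// First pass: reduce over map output (hypothetical keys and doc IDs).
const partialA = reduceFn([["a", "doc1"], ["b", "doc2"]], [1, 2], false);
const partialB = reduceFn([["c", "doc3"]], [3], false);

// Second pass: rereduce over the intermediate results; keys is null.
const total = reduceFn(null, [partialA, partialB], true);

console.log(partialA, partialB, total); // 3 3 6
```

Neither pass looks at keys or rereduce, which is why the identical function works for both.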
In response to the additional question in comments:
what use cases exist for the keys argument when rereduce=false?
Remember, keys and values can be anything, based on the map function. A common pattern is to emit([foo,bar,baz],null). That is to say, the value may be null if all the data you care about is already present in the key. In such a case, any reduce function more complex than a simple sum would require use of the keys.
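As a rough illustration of that pattern, suppose the map function did emit([category, price], null). The reduce below is a hypothetical sketch (the name maxPrice and the sample rows are invented here) that has to read the price out of the keys, because the values are all null:

```javascript
// Hypothetical reduce for rows emitted as emit([category, price], null):
// returns the highest price seen, reading it from the keys.
function maxPrice(keys, values, rereduce) {
  if (rereduce) {
    // values holds maxima from earlier reduce calls; keys is null here.
    return Math.max(...values);
  }
  // Each entry of keys is [emittedKey, docId]; emittedKey is [category, price].
  return Math.max(...keys.map(([emittedKey]) => emittedKey[1]));
}

// Simulated calls with made-up rows.
const first = maxPrice([[["book", 12], "doc1"], [["book", 30], "doc2"]], [null, null], false);
const second = maxPrice([[["toy", 8], "doc3"]], [null], false);
const overall = maxPrice(null, [first, second], true);

console.log(first, second, overall); // 30 8 30
```

On the rereduce pass the intermediate maxima arrive in values, so keys is no longer needed.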
Further, for grouping operations, using the keys makes sense. Consider a map function with emit(doc.countryCode, ... ), and a possible (incomplete) reduce function:
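function(keys, values, rereduce) {
  const sums = {};
  if (!rereduce) {
    // Each entry of keys is a [key, docid] pair; count occurrences of each key.
    keys.forEach(([key]) => {
      sums[key] = (sums[key] || 0) + 1;
    });
  }
  // The rereduce case (merging partial sums from values) is not handled here.
  return sums;
}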
Then given documents:
{"countryCode": "us", ...}
{"countryCode": "us", ...}
{"countryCode": "br", ...}
You'd get emitted rows (from the map function) of:
["us", ...]
["us", ...]
["br", ...]
You'd get a reduced result of:
{"us": 2, "br": 1}
Upvotes: 2