Reputation: 4085

thrust::reduce_by_key performance with few key repetitions

I have to do keyed reductions of arrays with many different keys that repeat only once in a while:

keys =  {1,2,3,3,4,5,6,7,7, 8, 9, 9,10,11,...}
array = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,...}

// after reduction
result = {1,2,7,5,6,7,17,10,23,13,14}

Using thrust::reduce_by_key (or any other segmented reduction method) is not the fastest option here as most operations are in fact just copies from one array to another.

What would be a better approach to this problem?

Upvotes: 4

Answers (1)

wnbell

Reputation: 1240

Actually, reduce_by_key is the appropriate algorithm to use here. It's just that the current implementation in Thrust is not as fast as it could be. To elaborate, there's nothing to prevent reduce_by_key from executing at memcpy speed, and I believe other implementations already achieve that rate. Our tentative plan for Thrust v1.7 includes making performance improvements to reduce_by_key and other scan-based algorithms using the codes in the related back40computing project.

Note that when the segments are either (1) long or of (2) uniform length then it's possible to do better than reduce_by_key. For example, at some point it's more economical to use an offset-based segment descriptor than keys or head flags. However, when the segments are short (as in your case) or of highly variable length, then an optimal reduce_by_key implementation is really the best tool for the job.

Upvotes: 4

thrust::reduce_by_key performance with few key repetitions

Answers (1)

Related Questions