Grouping an ordered dataset into minimal number of clusters

Question

I have an ordered list of weighted items, weight of each is less-or-equal than N. I need to convert it into a list of clusters. Each cluster should span several consecutive items, and total weight of a cluster has to be less-or-equal than N.

Is there an algorithm which does it while minimizing the total number of clusters and keeping their weights as even as possible?

E.g. list [(a,5),(b,1),(c,2),(d,5)], N=6 should be converted into [([a],5),([b,c],3),([d],5)]

user382751 · Accepted Answer

Since the dataset is ordered, one possible approach is to assign a "badness" score to each possible cluster and use a dynamic program reminiscent of Knuth's word wrapping ( http://en.wikipedia.org/wiki/Word_wrap ) to minimize the sum of the badness scores. The badness function will let you explore tradeoffs between minimizing the number of clusters (larger constant term) and balancing them (larger penalty for deviating from the average number of items).

Grouping an ordered dataset into minimal number of clusters

Answers (2)

Related Questions