dave
dave

Reputation: 209

Selecting K nodes out of a tree with N nodes, with minimum cost

There's an tree with N nodes and N-1 edges (making it a tree). Each node has a weight W(i). How can you select a subtree of size K nodes that still includes the root of the original tree? I have to do this so it minimizes the "cost" of selecting this "subtree", where cost is defined as the sum of the weights of all edges kept.

I think I've done a problem like this before, and it seems like DP/recursion. However, I know how to deal with it when it's limited to 2 children per node. You'd defined a function cost(n, i) which means the minimum cost of keeping i nodes starting at node n. You'd iterate from i = 0 to n in one of the children, and give the rest to the other child. However, since each node can have an unlimited amount of children, is there a way to deal with this?

Thanks

Upvotes: 1

Views: 1219

Answers (2)

nhahtdh
nhahtdh

Reputation: 56819

DP solution (may or may not be similar to what mcdowella described):

Let the root node be numbered 0. The rest of the nodes can be numbered arbitrarily (but the best way is to number the node one by one by level).

f(i, j) denotes the minimum cost when we have considered all sibling nodes that has index larger than i plus any children nodes if applicable, and we pick j nodes out of valid nodes.

f(i, 0) = 0            // i can be anything, even NOT_FOUND
f(NOT_FOUND, j) = +Inf // j > 0
f(i, j) = min(f(x, j), // Node i not chosen
              min[r = 0 to j - 1] (f(x, j - 1 - r) + f(y, r)) + cost(i))
              // Node i is chosen, then we can pick some elements from 
              // children node, or from untouched sibling nodes + descendants

where

  • x is the smallest index that is larger than i, and node x and i are sibling nodes
  • y is the smallest index, where node y is children of node i.
  • NOT_FOUND is assigned to x and y if index not found.

Result will be in f(0, K).

Upvotes: 0

mcdowella
mcdowella

Reputation: 19611

For a given node, you want to calculate cost(n, i) using cost(n, i) for its children. Number the children from 0..K and you have another dynamic program. At stage j you want to work out the best possible set of cost(n, i) using only children 0..j.

I've been thinking about coding something like this while mucking around with machine learning (Just for fun I want to write programs that look for anomalies by fitting two Weka classifiers instead of one). Is that anything like your purpose?

Upvotes: 2

Related Questions