Number of binary search trees over n elements

Question

I need to find a number of different binary search trees over n elements. I know, that the number of binary search trees over n distinct elements is Catalan number. But what if numbers can repeat? In Binary Search Tree elements of left subtree are less than root, and elements of right subtree are equal or greater than root.

This is my code for calculating a number of different binary search trees over n elements. I calculate this number by modulo 123456789, as it can be really big. Can I simply change it to solve my problem?

#include 
#include 
#include 
#include 

using std::vector;
using std::cin;
using std::cout;
using std::endl;


long long countTreeDP(int nodeNumber) 
{
    vector  > cNumbers;
    for (int i = 0; i < nodeNumber + 2; ++i)
    {
        vector cur(nodeNumber + 2, 0);
        cNumbers.push_back(cur);
    }
    for (int i = 0; i < nodeNumber + 2; ++i)
    {
        cNumbers[i][0] = 1;
    }

    long long sum = 0;
    for (int i = 1; i < nodeNumber + 1; ++i) 
    {
        sum = 0;
        for (int j = 1; j <= i; ++j) 
        {
            long long numFirst = cNumbers[nodeNumber + 1][i - j] % 123456789;
            long long numSecond = cNumbers[nodeNumber + 1][j - 1] % 123456789;
            cNumbers[j][i] = (numFirst * numSecond) % 123456789;
            sum += cNumbers[j][i];
            sum = sum % 123456789;
        }
        cNumbers[nodeNumber+1][i] = sum;
    }

    return cNumbers[nodeNumber+1][nodeNumber];
}

int main()
{
    vector nodesValues;
    int size = 0;
    cin >> size;

    for (int i = 0; i < size; ++i)
    {
        long long elem;
        cin >> elem;
        nodesValues.push_back(elem);        
    }

    std::sort(nodesValues.begin(), nodesValues.end());

    int uniqueCount = 0;

    for (int i = 1; i < nodesValues.size(); ++i)
    {
        if (nodesValues[i] != nodesValues[i-1])
            ++uniqueCount;
    }
    cout << countTreeDP(uniqueCount + 1) << endl;
    system("pause");
    return 0;
}

MvG · Accepted Answer

I'm assuming that a[i] for 0 ≤ i < n is your sorted array of node values. Here is what I'd do: create a two-dimensional table T such that entry T[x][y] counts the number of search trees for the subsequence x ≤ i < y. You'd initialize it by setting all T[i][i] for 0 ≤ i ≤ n to 1, since the empty sequence has only one search tree, the empty tree.

Now you'd write three nested loops. The outermost loop iterates over sequence length, the middle iterates over possible starting positions, and the innermost iterates over possible tree nodes. Something like

for (int len = 1; len <= n; ++len) {
  for (int pos = 0; pos <= n - len; ++pos) {
    int left = pos, right = pos + len;
    long long cnt = 0;
    for (int root = left; root < right; ++root) {
      if (root > left && a[root] == a[root - 1])
        continue; // duplicate element may not be root
      cnt += T[left][root]*T[root + 1][right];
    }
  }
}

The above would work without the if for the case of distinct elements. If you can have duplicate elements, then an element can only be considered a root of a subtree (according to your definition) if the element before it is strictly less than the element itself, or if it is the first element of the subtree. So the check above skips invalid roots, so in the end T[0][n] would hold your desired count.

I'll leave the modulo arithemtic to you. I'll also leave it to you to compare this idea to the code you wrote yourself, since I don't feel like reading that much uncommented code just now.

Number of binary search trees over n elements

Answers (1)

Related Questions