Ann  Orlova
Ann Orlova

Reputation: 1348

Number of binary search trees over n elements

I need to find a number of different binary search trees over n elements. I know, that the number of binary search trees over n distinct elements is Catalan number. But what if numbers can repeat? In Binary Search Tree elements of left subtree are less than root, and elements of right subtree are equal or greater than root.

This is my code for calculating a number of different binary search trees over n elements. I calculate this number by modulo 123456789, as it can be really big. Can I simply change it to solve my problem?

#include <algorithm>
#include <vector>
#include <iostream>
#include <limits>

using std::vector;
using std::cin;
using std::cout;
using std::endl;


long long countTreeDP(int nodeNumber) 
{
    vector <vector<long long> > cNumbers;
    for (int i = 0; i < nodeNumber + 2; ++i)
    {
        vector<long long> cur(nodeNumber + 2, 0);
        cNumbers.push_back(cur);
    }
    for (int i = 0; i < nodeNumber + 2; ++i)
    {
        cNumbers[i][0] = 1;
    }

    long long sum = 0;
    for (int i = 1; i < nodeNumber + 1; ++i) 
    {
        sum = 0;
        for (int j = 1; j <= i; ++j) 
        {
            long long numFirst = cNumbers[nodeNumber + 1][i - j] % 123456789;
            long long numSecond = cNumbers[nodeNumber + 1][j - 1] % 123456789;
            cNumbers[j][i] = (numFirst * numSecond) % 123456789;
            sum += cNumbers[j][i];
            sum = sum % 123456789;
        }
        cNumbers[nodeNumber+1][i] = sum;
    }

    return cNumbers[nodeNumber+1][nodeNumber];
}

int main()
{
    vector<long long> nodesValues;
    int size = 0;
    cin >> size;

    for (int i = 0; i < size; ++i)
    {
        long long elem;
        cin >> elem;
        nodesValues.push_back(elem);        
    }

    std::sort(nodesValues.begin(), nodesValues.end());

    int uniqueCount = 0;

    for (int i = 1; i < nodesValues.size(); ++i)
    {
        if (nodesValues[i] != nodesValues[i-1])
            ++uniqueCount;
    }
    cout << countTreeDP(uniqueCount + 1) << endl;
    system("pause");
    return 0;
}

Upvotes: 2

Views: 593

Answers (1)

MvG
MvG

Reputation: 60858

I'm assuming that a[i] for 0 ≤ i < n is your sorted array of node values. Here is what I'd do: create a two-dimensional table T such that entry T[x][y] counts the number of search trees for the subsequence x ≤ i < y. You'd initialize it by setting all T[i][i] for 0 ≤ i ≤ n to 1, since the empty sequence has only one search tree, the empty tree.

Now you'd write three nested loops. The outermost loop iterates over sequence length, the middle iterates over possible starting positions, and the innermost iterates over possible tree nodes. Something like

for (int len = 1; len <= n; ++len) {
  for (int pos = 0; pos <= n - len; ++pos) {
    int left = pos, right = pos + len;
    long long cnt = 0;
    for (int root = left; root < right; ++root) {
      if (root > left && a[root] == a[root - 1])
        continue; // duplicate element may not be root
      cnt += T[left][root]*T[root + 1][right];
    }
  }
}

The above would work without the if for the case of distinct elements. If you can have duplicate elements, then an element can only be considered a root of a subtree (according to your definition) if the element before it is strictly less than the element itself, or if it is the first element of the subtree. So the check above skips invalid roots, so in the end T[0][n] would hold your desired count.

I'll leave the modulo arithemtic to you. I'll also leave it to you to compare this idea to the code you wrote yourself, since I don't feel like reading that much uncommented code just now.

Upvotes: 4

Related Questions