user10207601
user10207601

Reputation:

Appropriate data structure for add and find queries

I have two types of queries.

1 X Y

Add element X ,Y times in the collection.

2 N

Number of queries < 5 * 10^5

X < 10^9

Y < 10^9

Find Nth element in the sorted collection.

I tried STL set but it did not work.

I think we need balanced tree with each node containing two data values.

First value will be element X. And another will be prefix sum of all the Ys of elements smaller than or equal to value.

When we are adding element X find preprocessor of that first value.Add second value associated with preprocessor to Y.

When finding Nth element. Search in tree(second value) for value immediately lower than N.

How to efficiently implement this data structure ?

Upvotes: 2

Views: 188

Answers (2)

Photon
Photon

Reputation: 2777

This can easily be done using segment tree data structure with complexity of O(Q*log(10^9))

  1. We should use so called "sparse" segment tree so that we only create nodes when needed, instead of creating all nodes.
  2. In every node we will save count of elements in range [L, R]
  3. Now additions of some element y times can easily be done by traversing segment tree from root to leaf and updating the values (also creating nodes that do not exist yet). Since the height of segment tree is logarithmic this takes log N time where N is our initial interval length (10^9)
  4. Finding k-th element can easily be done using binary search on segment tree, since on every node we know the count of elements in some range, we can use this information to traverse left or right to the element which contains the k-th

Sample code (C++):

#include <bits/stdc++.h>
using namespace std;
#define ll long long

const int sz = 31*4*5*100000;
ll seg[sz];
int L[sz],R[sz];
int nxt = 2;

void IncNode(int c, int l, int r, int idx, int val)
{
    if(l==r)
    {
        seg[c]+=val;
        return;
    }

    int m = (l+r)/2;

    if(idx <= m)
    {
        if(!L[c])L[c]=nxt++;
        IncNode(L[c],l,m,idx,val);
    }
    else
    {
        if(!R[c])R[c]=nxt++;
        IncNode(R[c],m+1,r,idx,val);
    }

    seg[c] = seg[L[c]] + seg[R[c]];
}

int FindKth(int c, int l, int r, ll k)
{
    if(l==r)return r;

    int m = (l+r)/2;

    if(seg[L[c]] >= k)return FindKth(L[c],l,m,k);
    return FindKth(R[c],m+1,r,k-seg[L[c]]);
}

int main()
{
    ios::sync_with_stdio(0);cin.tie(0);cout.tie(0);
    int Q;
    cin>>Q;
    int L = 0, R = 1e9;

    while(Q--)
    {
        int type;
        cin>>type;

        if(type==1)
        {
            int x,y;
            cin>>x>>y;
            IncNode(1,L,R,x,y);
        }
        else
        {
            int k;
            cin>>k;
            cout<<FindKth(1,L,R,k)<<"\n";
        }
    }
}

Upvotes: 1

Brian Bi
Brian Bi

Reputation: 119572

Maintaining a prefix sum in each node is not practical. It would mean that every time you add a new node, you have to update the prefix sum in every node succeeding it in the tree. Instead, you need to maintain subtree sums: each node should contain the sum of Y-values for its own key and the keys of all descendants. Maintaining subtree sums when the tree is updated should be straightforward.

When you answer a query of type 2, at each node, you would descend into the left subtree if N is less than or equal to the subtree sum value S of the left child (I'm assuming N is 1-indexed). Otherwise, subtract S + 1 from N and descend into the right subtree.

By the way, if the entire set of X values is known in advance, then instead of a balanced BST, you could use a range tree or a binary indexed tree.

Upvotes: 0

Related Questions