Reputation: 367

Efficient data structure for my particular problem

I have a priority queue which contains a sorted matrix A together with its (row,col) location in the matrix. So I have a data structure "queue_element" which has three fields, queue_element.value, queue_element.row and queue_element.col

I need to process the priority queue and for each element that I pop, I need to go back to the location A(row,col) and eliminate all elements lying on the same row# and col#.

So for example, if the top of the priority queue contains element.value = 0.99, element.row=4 and element.col=20, then I need to zero out A(4,:) and A(:,20), where the ':' notation stands for "all" or the entire range from 0 to end. This is MATLAB/Python notation but I'm using C++ for this.

I'm currently using a separate boolean matrix P(x,y) for the elimination. For every (x,y) location that is popped from the priority queue, I just loop over all the row/col and set a false value for everything that lies on the same row/col. Then when I pop an element from the queue, I just check if it is has been set to false in P(x,y).

However, this is quite inefficient. The base case of my algorithm requires just eliminating the entire row/col.

EDIT: Sorry for missing the last part of my question...

The more advanced version of this algorithm requires eliminating everything on the same row/col but except for the nearest 4 neighbors. Another version requires eliminating everything in the upper right and the lower left quadrant of the matrix when an X-Y grid is placed at the (row,col) position.

My question basically, is whether there is an obvious data structure that I could use. It would have to a) allow me to efficiently "eliminate" or zero out the elements that I need and b) enable me to check for set membership very fast.

Any ideas on how I should go about this?

Upvotes: 1

Answers (2)

the guy formerly known as d

Reputation: 614

For your "another version", I would take a different tack. Since there can be at most one element per row, locate the "best" element in each row – the elements that are not going to be eliminated are a subset of these. Now just examine all pairs and figure out which ones survived. If you use columns instead of rows when rows outnumber columns, then this is asymptotically as fast as creating the matrix in the first place.

EDIT: I don't feel as though I'm getting the whole story because the current algorithm for the other cases runs as fast as creating the matrix. Are there incremental updates or something?

Upvotes: 0

user244343

Reputation:

The two quadrant elimination thing gives me an idea. Have you heard of a quadtree?

A quadtree is a tree data structure in which each internal node has exactly four children. Quadtrees are most often used to partition a two dimensional space by recursively subdividing it into four quadrants or regions. The regions may be square or rectangular, or may have arbitrary shapes.

I'm not sure in which order your elements are placed, but if you can figure out a way to set perhaps a priority range for each quadrant (e.g lower-right quadrant is priority 1 through N, lower-left is N+1 through M, etc), and write a tree rebalancing method which lays everything out in that sort of data structure based on priority.. then checking set membership will indeed be fast (you eliminate 3/4 of the search space for every search iteration), and you should be able to prune quadrants from the tree pretty well. A single row or column would be hell to prune in a quadtree though, with elements crossing two of the major quadrants and umpteen minor ones.

What you could do is construct a temporary quadtree using the tree rebalancing method I talked about, and then prune the two quadrants and stuff the remaining elements into a new matrix. And then just use the matrix structure to zero out elements when it comes to a row/column elimination. In both instances, the pruning itself becomes trivial, but the quadtree construction could be so much of an overhead that this is inefficient.

Another way to eliminate a quadrant of a matrix is to just have nested loops running from [row,col] to [row,0], then [row-1,col] to [row-1,0], and zero every element out that way?

EDIT: Okay, so, to construct a quadtree you're going to have to place each element in it, so you're going to need to call a (recursive) function to connect up a bunch of pointers between various elements. To zero out a value here, we have a NULL pointer, and thus no child. I haven't personally worked with a quadtree, so all I should offer is links to other implementations.

I have my own ideas for implementing it, but they're half-formed and due to your need for scalability I'm not going to attempt to design a space-optimal quadtree in this post.

Upvotes: 1

Efficient data structure for my particular problem

Answers (2)

Related Questions