JavaBits
JavaBits

Reputation: 2055

Faster SQL data retrival with Java and search large data

I have a table with over 100 thousand data consisting of number pairs. A sample of which is shown below.

   A          B
   0010     0010 
    0010     0011 
    0010     0019 
    0010     0056
    0011     0010 
    0011     0011 
    0011     0019 
    0011     0040
    0019     0010
    0019     0058 

Here the numbers in Column A has possible pairs present in column B. Explanation : User will have several of these numbers ranging form 10 -100. Now as we can see for 0010 - 0011 and 0019 is present. So if the user has a list containing 0010 along with 0011 a warning will be shown that this pair is not allowed and vice versa.

How to approach this in Java? Loading the hash map with all the data doesnot seem to be a good option although the search will be much faster.

please suggest. Thanks

Testcase:

num = 0010;  //value from list which user will be passing
test(num){
if(num.equals("0019")||num.equals("0011")) //compairing with database 
System.out.println("incompatible pair present");
}

The above example is a very simple pseudo code. The actual problem will me much more complex.

Upvotes: 1

Views: 272

Answers (7)

Fabian Barney
Fabian Barney

Reputation: 14549

To clarify the question: You have a table containing two numbers each record which are declared 'incompatible'. You have a user list of numbers and you want to check if this list contains 'incompatible numbers'. Right?

Here you go with a simple SQL (took your example from comment):

SELECT *
FROM   incompatible
WHERE  A IN (1, 14, 67) AND B IN (1, 14, 67);

This SQL returns all incompatibilities. When the resultset is empty then there are no incompatibilities and everything is fine. If you only want to retrieve this fact then you can write SELECT 1 ... instead. The SQL have to be build dynamically to contain the user's numbers in the IN clauses, of course.

To speed up queries you can create an (unique) index over both columns. So the database can do a index range scan (unique). If this table does not yet contain a primary key then you should create a primary key over both columns.

Upvotes: 0

James Anderson
James Anderson

Reputation: 27478

Simple - definitely not scalable solution -- if your ranges really are 0000 - 9999.

Simply have a byte table with 999999 entries. Each entry consists of a simple 0 for allowed or 1 for not allowed. You find an entry in the table by logically concatenating the two pair numbers (key = first * 1000 + second).

The more scalable database solution is to create a table with a composite primary key (pair1 and pair2) the mere presence of an entry indicating a disallowed pair.

Upvotes: 0

Andrew Lazarus
Andrew Lazarus

Reputation: 19340

If I have understood correctly what you want to do…

Create a unique index on t1(a,b). Put the user's new pair in an INSERT statement inside a try block. Catch key violation exceptions (will be aSQLException, possibly a subclass depending on your RDBMS) and explain to the user that is a forbidden pair.

Upvotes: 0

Fabian Barney
Fabian Barney

Reputation: 14549

Until the question is more clear... Handling large amounts of data which are already stored in a database let me give you a recommendation: Whatever you want to do here, consider solving it with SQL instead of Java. Or at least write a SQL with an resulting ResultSet which is easy to evaluate in Java afterwards.

But until the question is not that clear ...

Upvotes: 2

RonK
RonK

Reputation: 9652

Seems like your problem is limited to a very small domain - why cant you instantiate an two dimensional array of bool and set it to true whenever the indexes of two numbers creates an unsupported combination.

Example for usage:

if (forbiden[10][11] || forbiden[11][10]) 
{
   throw new Exception("pairs of '10' and '11' are not allowed");
}

You can instantiate this array from the database by going over the data once and setting this array. You just need to translate 0010 to 10. You will have junk in Indexes 0-9, but you can eliminate it by "translating" the index by subtracting it from 9.

Does that hit your question?

Upvotes: 0

Peter Lawrey
Peter Lawrey

Reputation: 533660

Are you trying to find entries where A is the same value but B is different?

SELECT t1.a, t1.b, t2.b 
FROM MyTable t1, MyTable t2 
WHERE t1.a = t2.b AND t1.b <> t2.b

Upvotes: 1

qwerty
qwerty

Reputation: 3869

If you're worried of running out of heap space, you could try using a persistent cache like ehcache. I suggest you check the actual memory consumed before going in for this solution though

Upvotes: 0

Related Questions