Ryan Brady
Ryan Brady

Reputation: 147

What is the most efficient way to compare rows in a MySQL table with Java

This is a largely conceptual question so i dont have any code to show. I'll try to explain this as best i can. I am writing a program that is supposed to find common sequences of numbers found in a large table of random combinations.

So for example take this data:

1 5 3 9 6 3 8 8 3 3

6 7 5 5 5 4 9 2 0 1

6 4 4 3 7 8 3 9 5 6 

2 4 2 4 5 5 3 4 7 7 

1 5 6 3 4 9 9 3 3 2

0 2 7 9 4 5 3 9 8 3

These are random combinatinos of the numbers 1-9. For every 3 digit (or more) sequence found more than once i need to put that into another database. So the first row contains "5 3 9" and the 6th row also contains "5 3 9". I would put that sequence in a separate table with the number of times it was found.

I'm still working out the algorithm for actually making these comparisons but i figure i'll have to start with "1 5 3", compare that to every single 3 number trio found, then move on to "5 3 9" then "3 9 6" etc....

MY MAIN PROBLEM RIGHT NOW is that i dont know how to do this if these numbers are stored in a database. My database table has 11 columns. One column for each individual number, and one column for the 10 digit sequence as a whole. Columns are called Sequence, 1stNum, 2ndNum, 3rdNum...10thNum.

Visual: first row in my database for the data above would be this :

|  1 5 3 9 6 3 8 8 3 3  |  1  |  5  |  3  |  9  |  6  |  3  |  8  |  8  |  3  |  3  |

("|" divide columns)

How do i make comparisons efficiently with Java? I'm iterating over every row in the table many times. Once for the initial sequence to be compared, and for every one of those sequences i go through each row. Basically a for loop in a for loop. This sounds like its going to take a ton of queries and could take forever if the table gets to be massive (which it will).

Is it more computationally efficient if i iterate through a database using queries or if i dump the database and iterate through a file?

I tried to explain this as best as i could, its a very confusing process for me. I can clarify anything you need me to. I just need guidance on what the best course of action for this would be.

Upvotes: 3

Views: 1165

Answers (2)

Dici
Dici

Reputation: 25980

Here is what I would do, assuming you have retrieved the sequences in a list :

List<String> sequences = Arrays.asList("1539638833","6755549201","6443783956","2424553477","1563499332","0279453983");
Map<String,Integer> count = new HashMap<>();
for (String seq : sequences) {
    int length = seq.length();
    for (int i=0 ; i<length - 2 ; i++) {
        String sub = seq.substring(i,i + 3);
        count.put(sub,count.containsKey(sub) ? count.get(sub) + 1 : 1);
    }
}
System.out.println(count);   

Ouput :

{920=1, 783=1, 945=1, 332=1, 963=1, 644=1, 156=1, 983=1, 453=1, 153=1, 388=1, 534=1,
 455=1, 245=1, 539=2, 554=1, 242=1, 555=1, 553=1, 437=1, 883=1, 349=1, 755=1, 675=1,
 638=1, 395=1, 201=1, 956=1, 933=1, 499=1, 634=1, 839=1, 794=1, 027=1, 477=1, 833=1,
 347=1, 492=1, 378=1, 279=1, 993=1, 443=1, 396=1, 398=1, 549=1, 563=1, 424=1}

You can then store these values in the database from the Map.

Upvotes: 1

user17130
user17130

Reputation: 261

You can do it in sql with a union clause:

select sum(c), sequence 
from
(   
    select
        count(*) as c, concat(col1 ,col2 , col3) as sequence
        from t
        group by col1, col2, col3
    union
    select
        count(*) as c, concat(col2 ,col3 , col4) as sequence
        from t
        group by col2, col3, col4
    union (... and so on enumerating through the column combinations)
) as tt
group by sequence

I would imagine a pure java implementation would be quicker and have less memory overhead. But if you already have it in the database it may be quick enough.

Upvotes: 0

Related Questions