Reputation: 33
I am working on an algorithm, using SQL and JAVA, concerning big datasets. In SQL I have a table with all the data and I want to use as much of SQL queries as possible before loading it into JAVA.
I generate random datasets (in Java), consisting exclusively of integers between 1 and 40001 and then insert them into a MySQL table. The rows can be of different lengths, with a maximum of 30 items/records (this includes the ID). So normally the amount of columns is 30 (so COL1, COL2, COL3,......COL30) but this amount will also be random at some point
What I want to do is count the occurrence of every distinct item in a table/dataset and put them in a new table with their count. This however is tricky since I want to count it over the entire table, not just one column. How do I do this?
To specify:
Take this table for example (this is a very small one in comparison with my usual tables):
ID | COL1 | COL2 | COL3 | COL4 | COL5 |
---------------------------------------
1 | 8 | 35 | 42 | 12 | 27 |
2 | 22 | 42 | 35 | 8 | NULL |
3 | 18 | 22 | 8 | NULL | NULL |
4 | 42 | 12 | 27 | 35 | 8 |
5 | 18 | 27 | 12 | 22 | NULL |
What I want to extract from this table is this:
Item | Count
-------------
8 | 3
35 | 3
40 | 1
12 | 3
27 | 3
22 | 3
42 | 2
43 | 1
18 | 2
It is also the case that an item can't be in the same row more than once, if that helps.
Can anyone help me? Or can it just simply not be done in SQL? Would it be better to do this in JAVA, performance-wise?
Thanks in advance!
Upvotes: 1
Views: 192
Reputation: 1269873
You can do this by unpivoting the data and then aggregating:
select col, count(*)
from (select col1 as col from t union all
select col2 from t union all
. . .
select col30 from t
) t
group by col;
If you don't have a known set of columns, then you will need to use dynamic SQL.
Upvotes: 4