Reputation: 493
Here is my sql (in mysql table)
select * from(SELECT sample_register.usin,
DATE_FORMAT(sample_register.doc,'%d-%m-%Y') as doc1,
sample_register.location,
sample_register.description,
sample_register.type,
sample_allocation.gamma,
gamma_results.act,
gamma_results.act_sd,
gamma_results.mdl,
gamma_results.bdl,
DATE_FORMAT(count_dt,'%d-%m-%Y') as count_dt
FROM sample_register
LEFT JOIN sample_allocation
ON sample_register.usin=sample_allocation.usin
LEFT JOIN gamma_results
ON gamma_results.usin = sample_register.usin
AND gamma_results.istp='Cs137'
WHERE mid(sample_register.usin,3,1)='F'
AND sample_register.doc BETWEEN '2015-01-01'
AND '2015-03-31'
AND sample_register.category='ter'
AND sample_allocation.gamma='Y'
ORDER BY mid(sample_register.usin,3,1),
sample_register.doc,
sample_register.usin) AS a
LEFT JOIN (SELECT sample_register.usin,
gamma_results.act,
gamma_results.act_sd,
gamma_results.mdl,
gamma_results.bdl
FROM sample_register
LEFT JOIN gamma_results
ON gamma_results.usin = sample_register.usin
AND gamma_results.istp='k40'
WHERE mid(sample_register.usin,3,1)='F'
AND sample_register.doc
BETWEEN '2015-01-01'
AND '2015-03-31'
AND (sample_register.category='ter')
ORDER BY mid(sample_register.usin,3,1),
sample_register.doc,
sample_register.usin) AS b
ON a.usin=b.usin
There are 4 records in the gamma_results table. two records each for 10/04/2015 and 18/04/2015.
USIN istp act count_dt
-----------------------------------------
15FML002 Cs137 0.00769 10/04/15
15FML002 K40 0 10/04/15
15FML002 Cs137 0.00608 18/04/15
15FML002 K40 12.117 18/04/15
Query output data in the following form (some fields I deleted for convenience)
15FML002 0.00769 Y 10/04/15 00
15FML002 0.00769 Y 10/04/15 12.117
15FML002 0.00608 Y 18/04/15 00
15FML002 0.00608 Y 18/04/15 12.117
But I want to get output in two records. That is like this
15FML002 0.00769 Y 10/04/15 00
15FML002 0.00608 Y 18/04/15 12.117
How can I reframe (join or union) the query to get output like this? /// edited on 30/04/2015
I am unable to create an sqlfiddle because of some proplem in their site. Here is the DDL and DML for the two tables sample_register and gamma results. The sample_allocation table can be ignored at this juncture.
CREATE TABLE `sample_register` (
`usin` varchar(11) NOT NULL,
`sample_id` varchar(7) NOT NULL,
`doc` date NOT NULL,
`location` varchar(255) DEFAULT NULL,
`category` varchar(50) DEFAULT NULL,
`type` varchar(255) DEFAULT NULL,
`description` varchar(255) DEFAULT NULL,
`fwt` decimal(10,2) DEFAULT NULL COMMENT 'This filed contains either fwt in gms or volume in ltr for milk or volume of air for particulate',
`dwt` decimal(10,2) DEFAULT NULL,
`ashwt` decimal(10,2) DEFAULT NULL,
`user` varchar(255) DEFAULT NULL,
`to_dt` date DEFAULT NULL COMMENT 'This is for particulate sample filter removal date',
`wc` decimal(10,2) DEFAULT NULL,
`oc` decimal(10,2) DEFAULT NULL,
`ac` decimal(10,2) DEFAULT NULL,
`status` varchar(1) DEFAULT NULL,
`remarks` varchar(255) DEFAULT NULL,
PRIMARY KEY (`usin`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `sample_register` VALUES ('15FML002', 'NIL', '2015-04-09', 'MALLAPUR', 'ter', 'MILK', 'milk', '2000.00', null, null, '1604015', null, null, null, null, null, null);
DROP TABLE IF EXISTS `gamma_results`;
CREATE TABLE `gamma_results` (
`usin` varchar(255) NOT NULL,
`sysid` varchar(255) NOT NULL,
`count_time` decimal(10,0) DEFAULT NULL,
`geo` varchar(255) DEFAULT NULL,
`vol` decimal(10,2) DEFAULT NULL,
`energy` decimal(10,2) DEFAULT NULL,
`istp` varchar(255) DEFAULT NULL,
`bkg` decimal(10,5) DEFAULT NULL,
`eff` decimal(10,3) DEFAULT NULL,
`sigma` decimal(10,5) DEFAULT NULL,
`ncps` decimal(10,5) DEFAULT NULL,
`sd` decimal(10,5) DEFAULT NULL,
`mdl` decimal(10,5) DEFAULT NULL,
`act` decimal(10,5) DEFAULT NULL,
`act_sd` decimal(10,5) DEFAULT NULL,
`bdl` varchar(1) DEFAULT NULL,
`entry_time` datetime DEFAULT NULL,
`entered_by` int(11) DEFAULT NULL,
`count_dt` date DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `gamma_results` VALUES ('15FML002', 'HPGE2', '25000', 'nb', '1000.00', '364.48', 'I131', '0.01000', '3.400', '0.00190', '-0.01000', '0.00041', '0.06882', null, '0.00000', 'Y', '2015-04-13 10:24:11', '1619381', '2015-04-10');
INSERT INTO `gamma_results` VALUES ('15FML002', 'HPGE2', '25000', 'nb', '1000.00', '661.66', 'Cs137', '0.00020', '2.060', '0.00027', '-0.00020', '0.00006', '0.00769', null, '0.00000', 'Y', '2015-04-13 10:24:57', '1619381', '2015-04-10');
INSERT INTO `gamma_results` VALUES ('15FML002', 'HPGE2', '25000', 'nb', '1000.00', '1460.73', 'K40', '0.00500', '0.911', '0.00134', '-0.00450', '0.00032', '1.37855', null, '0.00000', 'Y', '2015-04-13 10:25:37', '1619381', '2015-04-10');
INSERT INTO `gamma_results` VALUES ('15FML002', 'HPGE2', '15000', 'nb50', '2000.00', '661.66', 'Cs137', '0.00020', '3.380', '0.00035', '-0.00020', '0.00006', '0.00608', null, '0.00000', 'Y', '2015-04-20 10:21:48', '1619381', '2015-04-18');
INSERT INTO `gamma_results` VALUES ('15FML002', 'HPGE2', '15000', 'nb50', '2000.00', '1460.73', 'K40', '0.00500', '1.550', '0.00173', '0.04008', '0.00176', '0.52302', '12.11700', '0.53200', 'N', '2015-04-20 10:23:00', '1619381', '2015-04-18');
Upvotes: 20
Views: 1248
Reputation: 610
@masoondreamz, I have created a fiddle for a possible solution. It is - http://sqlfiddle.com/#!9/5bcae/27/0
The essential element of the solution was I grouped by the date in column Count_DT and have added it to the joint condition for the subqueries. I did trim the number of columns to the list you had showed in the sample result.
Upvotes: 0
Reputation: 5600
Every table you write should have a PRIMARY KEY
. This is a vitally important concept in relational databases. While you have a PRIMARY KEY
in sample_register
, you do not have one in gamma_results
.
Assuming you have the following rows:
15FML002 0.00769 Y 10/04/15 00
15FML002 0.00769 Y 10/04/15 12.117
15FML002 0.00608 Y 18/04/15 00
15FML002 0.00608 Y 18/04/15 12.117
And assuming that these are actually duplicates of distinct rows in your database, assigning primary keys will create a new column which will make your data look like this:
1 15FML002 0.00769 Y 10/04/15 00
1 15FML002 0.00769 Y 10/04/15 12.117
2 15FML002 0.00608 Y 18/04/15 00
2 15FML002 0.00608 Y 18/04/15 12.117
This is a bit of an oversimplification for your example, considering that within an explanation inside of your question, you stated that
For each USINs there are two different values for istp on a single count_dt. Hence on each count date there will be two records for a USIN with one for istp='k40' and another for istp=cs137. Since USIN and count date are common , I want to display it in a single row. Then if there are more than one (say' n') count dates for a USIN then there would be nx2 records. But I need only n rows.
I get that you understand how this data fits together, but your SQL queries will become cleaner and neater if you are able to tell the SQL database how the data fits together in the language it understands.
To understand where your Primary Key goes will help you to better logically understand your table structure. If you can't assign a Primary Key, that means the table probably needs to be split into chunks. This one could certainly benefit from a little normalization.
Clearly you have a 1:n relationship on USIN
which is well represented here, but you also have a 1:2 relationship on istp
. This is not reflected in a different table, but instead all in the same table, which means that combinations of USIN
and count_dt
are repeated twice throughout the entire table when they are exactly the same.
I am not suggesting that you can't make this query work. All you need to do is stick a GROUP BY
at the bottom. However, with no indexes to speak of your table is going to be incredibly slow. Even with indexes it's going to be slower than it needs to be! Importantly, without a PRIMARY KEY
your SQL database is going to be doing more work than it needs to by a long shot.
If you split your tables into two, providing a new PRIMARY KEY
for each USIN
/count_dt
pair, then create another table that settles your istp='k40'
and istp=cs137
requirements, you will only have to write one extra line of queries in your already long query, and it will probably more often than not save you a lot of time querying.
Instead of SELECT * FROM tableA LEFT JOIN tableB on tableA.USIN=tableB.USIN...
you would write:
SELECT * FROM tableA
LEFT JOIN tableB on tableA.USIN=tableB.USIN
LEFT JOIN tableX on tableB.id = tableX.tableB_id
....
You should end up with the same dataset. If you just want the unique values, all you need to do is get rid of the second JOIN
.
If for whatever reason you cannot change your table structure, then your alternative would be to create a composite primary key, consisting of USIN
, count_dt
and istp
, as these should be unique and should only show up in a specific combination once in the table. This can quickly balloon out, but should give you the general gist.
To create distinct values using this method, you should have a compound index on USIN
and count_dt
, and then use GROUP BY USIN, count_dt
.
Upvotes: 3
Reputation: 2307
I have seen output like this before.
It is caused by a tiny error in the joining. If I remember correctly I solved this by nesting some of my joins as sub queries so I was adding each table to an existing table along a join. I probably over egged the mix but it did work nicely. (UPDATE: Yeah, I see now that is exactly what you are doing).
At some point in that mix the join has not been unique or the way you have arranged it has not been unique and so SQL has given you all the combinations it could come up with.
(Assuming, of course, that it is not just a missing GROUP BY
that is upsetting things. That's the other source of problems like this.)
I will have a careful look through your SQL and see if I can spot the problem but a little trial and error would be much, much faster.
UPDATE 1: This is either a really impressive hack or an error and I cannot fathom which.
LEFT JOIN gamma_results
ON gamma_results.usin = sample_register.usin
AND gamma_results.istp='Cs137'
I have never seen a conditional as part of a join. I could not tell you if this would even work but my gut reaction is to want to put this in the WHERE
clause. Feel free to take me to school on this as you seem to know what you are doing and I do not have hands on to the actual project (nor do I know everything).
Update 2: Given that your hard coded join must be close to working my feeling is that I would want a GROUP by in the nested SELECT as you are using aggregate functions (count
).
I recall once making a quite complex query that when you get right down to it was effectively saying "for each table_a add table_b". Honestly make friends with GROUP BY
and if that does not solve your problem then I am a small inoffensive monkey.
Upvotes: 8
Reputation: 17289
First of all you should provide same data and sqlfiddle if you need help with query for some specific dataset.
Since I have no data for the fiddle here is my guess:
SELECT sample_register.usin,
DATE_FORMAT(sample_register.doc,'%d-%m-%Y') as doc1,
sample_register.location,
sample_register.description,
sample_register.type,
sample_allocation.gamma,
gr.act,
gr.act_sd,
gr.mdl,
gr.bdl,
gr.count_dt,
grK40.act
FROM sample_register
INNER JOIN sample_allocation
ON sample_register.usin=sample_allocation.usin
AND sample_allocation.gamma='Y'
LEFT JOIN (
SELECT
usin,
act,
act_sd,
mdl,
bdl,
count_dt,
DATE_FORMAT(count_dt,'%d-%m-%Y') as count_dt_formatted
FROM gamma_results
WHERE istp='Cs137'
) gr
ON gr.usin = sample_register.usin
LEFT JOIN gamma_results grK40
ON grK40.usin = gr.usin
AND grK40.istp='k40'
AND grK40.count_dt = gr.count_dt
WHERE mid(sample_register.usin,3,1)='F'
AND sample_register.doc BETWEEN '2015-01-01' AND '2015-03-31'
AND sample_register.category='ter'
ORDER BY mid(sample_register.usin,3,1),
sample_register.doc,
sample_register.usin
But it is just a guess, because it looks very weir to me.
You wrote:
There are 4 records in the gamma_results table. two records each for 10/04/2015 and 18/04/2015.: USIN
,istp
,act
,count_dt
but in your query you use act,act_sd,mdl,bdl,DATE_FORMAT(count_dt,'%d-%m-%Y') as count_dt
so we can assume that you have some other columns there like: act_sd,mdl,bdl
.
Underneath you wrote: Query output data in the following form (some fields I deleted for convenience)
15FML002 0.00769 Y 10/04/15 00
15FML002 0.00769 Y 10/04/15 12.117
Even if some field deleted, what fields are here?
Logically it is: usin,act
, UNKNOWN,count_dt
,UNKNOWN (equal to act
when istp='K40'
). but it is impossible, because you haven't such field in your query request. It seems to me that provided output was get as result of some other query, not that one you show us.
But so far, here is my guess. You are welcome if any questions.
Upvotes: 4