Reputation: 11
Consider the table below:
Col1 Col2 Col3
123 ABC 20/5/2010
123 CDS 21/5/2010
123 VDS 22/5/2010
123 ABC 23/5/2010
123 VDS 24/5/2010
123 CDS 25/5/2010
123 ABC 26/5/2010
I need to fetch the first occurrence of CDS and calculate the time diff between the next row. Similarly I need to find out the next occurrence of CDS and calculate the time diff with the next row. This has to go on until there are no occurrences of CDS left in the table.
Will be grateful if someone can help on this!!
Upvotes: 1
Views: 711
Reputation: 96
Either I don't understand what seems like a simple question or others are over thinking it.
SELECT col1, col2, col3,
(SELECT MIN(a.col3) FROM tab1 a WHERE a.col3 > z.col3) - z.col3 AS DaysDiff
FROM tab1 z
WHERE z.col2 = "CDS"
Upvotes: 0
Reputation: 54292
What is your desired output? Is it something like:
123 ABC 20/5/2010
123 CDS 21/5/2010
123 VDS 22/5/2010 1 day, 0:00:00
123 ABC 23/5/2010
123 VDS 24/5/2010
123 CDS 25/5/2010
123 ABC 26/5/2010 1 day, 0:00:00
If so, then I think the simplest way to get it is: create program in Python or similar language, select your data using SQL and calculate date diff with language of your choice. In "normal" SQL there is no such thing like "next row", where in other languages you can save date of last CDS and use it with next loop iteration.
This output was created with Python:
import time
import datetime
TXT = """123 ABC 20/5/2010
123 CDS 21/5/2010
123 VDS 22/5/2010
123 ABC 23/5/2010
123 VDS 24/5/2010
123 CDS 25/5/2010
123 ABC 26/5/2010"""
def txt2time(ts):
tpl = time.strptime(ts, '%d/%m/%Y')
return time.mktime(tpl)
last_date = ''
for line in TXT.split('\n'):
date_diff = ''
arr = line.split()
if last_date:
date_diff = '%s' % (datetime.timedelta(seconds = (txt2time(arr[2]) - txt2time(last_date))))
last_date = ''
if arr[1] == 'CDS':
last_date = arr[2]
print('%s %s' % (line.strip(), date_diff))
As you see I iterate over text lines, but you can easily change first loop with split('\n')
to loop on recordset:
for row in cursor.fetchall():
if row[0] == 'CDS':
...
(you can find Python/Jython examples on many web pages including my questions and answers on SO).
I think it is possible to find such solution in SQL only. You will need function that return date from next row. And I think this may not be easy to create such function because such function will have to behave just like your select with filtering and ordering.
Upvotes: 1
Reputation: 753525
With the sample data, there is but one pair of rows with the value CDS in Col2, so there is but one row in the output. It is not clear what you'd expect if there were 4 rows with CDS. Your wording might be intended to imply that the first pair would contribute one row and the second pair would contribute a second row. Or it might be that you need to find the differences between consecutive occurrences of CDS, so that the 4 rows of data would produce 3 rows of output. (The question also leaves it open to discussion whether this applies within a single value for Col1, or whether the Col1 is immaterial to the result.)
Since the ambiguity exists, I'll tackle the second option, assuming that the CDS entries all have to be for the same value in Col1 (but that there may be many different values in Col1).
You don't mention which version of Informix you have; I am assuming IDS 11.50. The syntax may not work in earlier versions.
As so often, the table is anonymous in the question - so it is hereby designated Tab1.
SELECT t1a.col1, t1a.col2, t1a.col3, t1b.col3 AS col4,
t1b.col3 - t1a.col3 AS delta
FROM tab1 AS t1a JOIN tab1 AS t1b
ON t1a.col1 = t1b.col1 AND
t1a.col3 < t1b.col3 AND
t1a.col2 = t1b.col2 AND
t1a.col2 = 'CDS' AND
NOT EXISTS(SELECT *
FROM tab1 AS t1c
WHERE t1c.col3 > t1a.col3 AND
t1c.col3 < t1b.col3 AND
t1c.col1 = t1a.col1 AND
t1c.col2 = t1a.col2
);
The '<' join orders the pairs of dates so the 't1a' value is less than the 't1b' value; the NOT EXISTS clause ensures that the are adjacent to each other by asserting that there is no row in tab1 with the same values for col1 and col2 and with a date that comes after the earlier date and before the later date. This is the crucial part of the query.
col1 col2 col3 col4 delta
123 CDS 21/05/2010 25/05/2010 4
123 CDS 14/04/2010 22/04/2010 8
123 CDS 22/04/2010 21/05/2010 29
120 CDS 11/05/2010 16/05/2010 5
121 CDS 21/04/2010 30/04/2010 9
CREATE TABLE tab1 (col1 SMALLINT, col2 CHAR(3), col3 DATE);
Running with DBDATE=DMY4/.
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'ABC', '20/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'CDS', '21/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'VDS', '22/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'ABC', '23/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'VDS', '24/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'CDS', '25/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'ABC', '26/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'ABC', '10/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'CDS', '14/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'VDS', '12/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'ABC', '13/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'VDS', '19/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'CDS', '22/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (123, 'ABC', '16/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (120, 'ABC', '10/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (120, 'CDS', '11/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (120, 'VDS', '12/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (120, 'ABC', '13/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (120, 'VDS', '14/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (120, 'CDS', '16/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (121, 'ABC', '17/5/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (121, 'CDS', '21/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (121, 'ABC', '22/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (121, 'VDS', '23/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (121, 'ABC', '24/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (121, 'VDS', '25/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (121, 'ABC', '26/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (121, 'ABC', '27/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (121, 'ABC', '28/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (121, 'ABC', '29/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (121, 'CDS', '30/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (122, 'ABC', '23/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (122, 'VDS', '24/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (122, 'CDS', '25/4/2010');
INSERT INTO tab1(col1, col2, col3) VALUES (122, 'ABC', '26/4/2010');
Upvotes: 0