Reputation: 347
I have 2 tables:
1) table Masterdates which contains all dates since Jan 1, 1900
2) table Stockdata which contains stock data in the form date, symbol, open, high, low, close, volume (primary key = date, symbol)
This is what I'm looking to retrieve (presented here in CSV format)
MDate,SDate,Symbol,Open,High,...
6/4/2001,6/4/2001,Foo,47,49,...
6/5/2001,null,null,null,null,...
6/6/2001,6/6/2001,Foo,54,56,...
where MDate is from Masterdates and SDate is from Stockdata. I need to have the output start with the first (earliest) extent date for the desired symbol (in this example, Foo, starting on 6/4/2001) in Stockdata, and then include all dates in Masterdates up to and including the last (latest) available date for the desired symbol in Stockdata, outputting nulls where there is no corresponding Stockdata record for a given Masterdate record in the range described.
Is there a way to do this in a single query, a series of queries, and/or by adding auxiliary tables, that will yield fast results? Or will I have to dump out supersets of what I want, and then construct the final output using my (non-SQL) programing language?
TIA
Upvotes: 2
Views: 937
Reputation: 1
I would suggest filling out the missing entries from the Stockdata table and using an inner join. Should be much faster.
Upvotes: 0
Reputation: 28402
Tested in SQLITE3, your DB implementation may differ
SELECT m.date,
s.symbol,
s.open,
s.high,
s.low,
s.close,
s.volume
FROM masterdate AS m LEFT OUTER JOIN
stockdata AS s ON m.date = s.date
AND s.symbol = 'Foo'
WHERE m.date >= (SELECT MIN(date) FROM stockdata WHERE symbol = 'Foo')
AND m.date <= (SELECT MAX(date) FROM stockdata WHERE symbol = 'Foo')
If this does not execute quicky enough then you could pronably improve performance by setting variables for minimum and maximum value in one query and then using those in the main query. This would save you at least one index hit.
So (In SQL Server Syntax)
SET @symbol = 'Foo'
SELECT @mindate = MIN(date),
@maxdate = MAX(date)
FROM stockdata
WHERE stockdata.symbol = @symbol
SELECT m.date,
s.symbol,
s.open,
s.high,
s.low,
s.close,
s.volume
FROM masterdate AS m LEFT OUTER JOIN
stockdata AS s ON m.date = s.date
AND s.symbol = @symbol
WHERE m.date BETWEEN @mindate AND @maxdate
You will also need to be sure that you have an index on masterdate.date and a composite index on stockdata(date, symbol).
Upvotes: 7
Reputation: 74527
This is a classic left join:
SELECT * FROM masterdates
LEFT JOIN stockdata ON masterdates.date = stockdata.date;
Obviously this should be refined to only return the required columns.
Upvotes: 3
Reputation: 181290
If you use SQLServer, you can use a TSQL stored procedure to get your results back. It will have basically two lines:
1) To get the first date for data available 2) A query with an outer join
If you use Oracle, you can use a PL/SQL to write a similar (although a little bit more complicated) stored procedure to compute the answer you want.
Upvotes: 1