user119282
user119282

Reputation: 347

Complex(?) SQL join query

I have 2 tables:

1) table Masterdates which contains all dates since Jan 1, 1900

2) table Stockdata which contains stock data in the form date, symbol, open, high, low, close, volume (primary key = date, symbol)

This is what I'm looking to retrieve (presented here in CSV format)

MDate,SDate,Symbol,Open,High,...

6/4/2001,6/4/2001,Foo,47,49,...

6/5/2001,null,null,null,null,...

6/6/2001,6/6/2001,Foo,54,56,...

where MDate is from Masterdates and SDate is from Stockdata. I need to have the output start with the first (earliest) extent date for the desired symbol (in this example, Foo, starting on 6/4/2001) in Stockdata, and then include all dates in Masterdates up to and including the last (latest) available date for the desired symbol in Stockdata, outputting nulls where there is no corresponding Stockdata record for a given Masterdate record in the range described.

Is there a way to do this in a single query, a series of queries, and/or by adding auxiliary tables, that will yield fast results? Or will I have to dump out supersets of what I want, and then construct the final output using my (non-SQL) programing language?

TIA

Upvotes: 2

Views: 937

Answers (4)

Ahmad
Ahmad

Reputation: 1

I would suggest filling out the missing entries from the Stockdata table and using an inner join. Should be much faster.

Upvotes: 0

Steve Weet
Steve Weet

Reputation: 28402

Tested in SQLITE3, your DB implementation may differ

SELECT   m.date, 
         s.symbol, 
         s.open, 
         s.high, 
         s.low, 
         s.close, 
         s.volume
FROM     masterdate AS m LEFT OUTER JOIN 
         stockdata AS s ON m.date = s.date
AND      s.symbol = 'Foo'
WHERE    m.date >= (SELECT MIN(date) FROM stockdata WHERE symbol = 'Foo')
AND      m.date <= (SELECT MAX(date) FROM stockdata WHERE symbol = 'Foo')

If this does not execute quicky enough then you could pronably improve performance by setting variables for minimum and maximum value in one query and then using those in the main query. This would save you at least one index hit.

So (In SQL Server Syntax)

SET @symbol = 'Foo'

SELECT @mindate = MIN(date),
       @maxdate = MAX(date)
FROM   stockdata
WHERE  stockdata.symbol = @symbol


SELECT   m.date, 
         s.symbol, 
         s.open, 
         s.high, 
         s.low, 
         s.close, 
         s.volume
FROM     masterdate AS m LEFT OUTER JOIN 
         stockdata AS s ON m.date = s.date
AND      s.symbol = @symbol
WHERE    m.date BETWEEN @mindate AND @maxdate

You will also need to be sure that you have an index on masterdate.date and a composite index on stockdata(date, symbol).

Upvotes: 7

Josef Pfleger
Josef Pfleger

Reputation: 74527

This is a classic left join:

SELECT * FROM masterdates
    LEFT JOIN stockdata ON masterdates.date = stockdata.date;

Obviously this should be refined to only return the required columns.

Upvotes: 3

Pablo Santa Cruz
Pablo Santa Cruz

Reputation: 181290

If you use SQLServer, you can use a TSQL stored procedure to get your results back. It will have basically two lines:

1) To get the first date for data available 2) A query with an outer join

If you use Oracle, you can use a PL/SQL to write a similar (although a little bit more complicated) stored procedure to compute the answer you want.

Upvotes: 1

Related Questions