Reputation: 7
I'm wrangling with a rather simple query. However, given my low level of Access and SQL competency, I've hit a roadblock with my current project. Details are below and thank you all in advance for your patience.
Basically, I am trying to use historical financial data to test a well known bankruptcy predictor model. The financial data is stored in an Annual format (table listing below). The database is structured such that the each company has one general information record in the company record table (IDX_FS
) and multiple records for each year of existence in the financial statement data tables (DATA_BS
etc.).
In each DATA table, there is one field that contains the specific year of the data record for each company [4DTYR]
and its respective financial data. The data in this field repeats and exists for each company and for every year it existed.
For example:
[CONAME] [4DTYR] [A_TOTAL]
Apple Inc. 2009 200
Apple Inc. 2010 220
Apple Inc. 2011 240
Google Inc. 2009 180
Google Inc. 2010 170
Google Inc. 2011 160
The problem I am running into is given that the data in the field [4DTYR]
exists and repeats in various tables from which data is being used to calculate arithmatic in a handful of expressions, I end up with a huge amount of repeated (and what looks a permutation) data in my query output.
I've detailed the tables, fields and expressions below in addition to the SQL script. Note that I've tried adding a condition under WHERE
that attempts to set all the [4DTYR]
dates in the different tables as the same. That portion is highlighted in magenta. This still doesn't seem to work as I only get output for 1 year only, when there are 20 years of data. Furthermore, when I run the query without the expressions, the existing paramaters gives me output with ~500 records.
Thanks for your responses. So, I've taken Gords advice and made the modification below. However, I receive a JOIN syntax error. Note that IDX_FS contains the CUSIP field, but not the 4DTYR field. So, I used AND to add to the original statement. Suggestions? many thanks.
FROM (((IDX_FS LEFT JOIN DATA_BS ON IDX_FS.CUSIP = DATA_BS.CUSIP) LEFT JOIN DATA_Footnotes ON IDX_FS.CUSIP = DATA_Footnotes.CUSIP) LEFT JOIN DATA_IS ON IDX_FS.CUSIP = DATA_IS.CUSIP) LEFT JOIN DATA_SP ON IDX_FS.CUSIP = DATA_SP.CUSIP AND (((DATA_BS LEFT JOIN DATA_IS ON DATA_BS.CUSIP = DATA_IS.CUSIP AND DATA_BS.4DTYR = DATA_IS.4DTYR) LEFT JOIN DATA_SP ON DATA_BS.CUSIP = DATA_SP.CUSIP AND DATA_BS.4DTYR = DATA_SP.4DTYR) LEFT JOIN DATA_Footnotes.4DTYR ON DATA_BS.CUSIP = DATA_Footnotes.CUSIP AND DATA_BS.4DTYR = DATA_Footnotes.4DTYR
Upvotes: 0
Views: 9989
Reputation: 123594
When you are joining the tables that have both the company ID [CUSIP] and the year [4DTYR] you are only joining on [CUSIP], so you are getting duplicate rows for the various permutations of [4DTYR] in the related tables that also have that field. You need to join on both [CUSIP] and [4DTYR] to avoid those duplicates.
In Access' query designer such joins will appear as two lines running between each table: one connecting [CUSIP] to [CUSIP] and the other connecting [4DTYR] to [4DTYR]. In SQL the joins will look something like
... TableX LEFT JOIN TableY ON TableX.CUSIP = TableY.CUSIP AND TableX.4DTYR = TableY.4DTYR
Upvotes: 2