brainac
brainac

Reputation: 410

Join two big tables on multiple columns in Access

I have two tables:

  1. Transactions with around 300,000 rows, and following fields:

    • TransactionID
    • ClientID
    • Q
    • Y
    • Type
  2. Prices with around 10,000 rows, and following fields:

    • ClientID
    • Q
    • Y
    • Type
    • Price

I try to join them using 4 fields: ClientID, Q, Y, Type, but I get 2GB error all the time, saying result table is too big. The thing is the result should be 300,000 rows, so my guess is Access does it in some stupid way, creating cartesian product first and throwing an error. Funny thing is, that if I use just ClientID to join it works fine and gives couple millions of rows.

Can somebody help me to find the right way out? Maybe I just need to learn how joins are executed behind the scenes, but I haven't found anything meaningful so far.

I consider adding hash column that would have 4 columns concatenated in both tables and use it to join, but it's so ugly solution I just cringe.

My current SQL is like

SELECT TransactionID, Price 
FROM Transactions 
INNER JOIN Prices
   ON Transactions.ClientID = Prices.ClientID 
  AND Transactions.Q = Prices.Q 
  AND Transactions.Y = Prices.Y 
  AND Transactions.Type = Prices.Type

Upvotes: 2

Views: 1400

Answers (2)

Parfait
Parfait

Reputation: 107652

According to the Access 2016 specifications, the size limit is defined as:

2 gigabytes, minus the space needed for system objects.

Virtual tables in processing this query may be the issue. To display a SELECT statement results in a query window, data needs to be stored temporarily in a system object. If original file is already large, it will be even larger with the join (i.e., copy) of same tables in a JOIN query.

As recommended by the same above specs, consider migrating Prices into a different .accdb/.mdb file and query using linked tables (Access to Access). And if you retain same table names, original query would not need to change! Alternatively, query externally such as below. This way you divorce processing temporary table size from storage size to fit 2GB quota.

SELECT t.TransactionID, Price 
FROM Transactions t
INNER JOIN [C:\Path\To\Other\Database.accdb].[Prices] p
   ON t.ClientID = p.ClientID 
  AND t.Q = p.Q 
  AND t.Y = p.Y 
  AND t.Type = p.Type

Even still, store both tables in different file(s) and use a different, data-less Access app to run the query. In fact, this is ideally the split architecture recommended with Access development where storage (BackEnd) with tables is separated from application (FrontEnd) with queries/forms/reports/macros/modules.

Split Architecture

Upvotes: 1

Greg Viers
Greg Viers

Reputation: 3523

Try using a derived table to force a different query execution:

SELECT TransactionID, Price FROM 
(SELECT TransactionID, Price, Transactions.Q as TransQ, Prices.Q as PricesQ,
Transactions.Y as TransY, Prices.Y as PricesY, Transactions.Type as
TransType, Prices.Type as PricesType
FROM Transactions INNER JOIN Prices
ON Transactions.ClientID = Prices.ClientID) AS Step1
WHERE TransQ = PricesQ AND TransY = PricesY
AND TransType = PricesType

Upvotes: 1

Related Questions