Lauren
Lauren

Reputation: 65

My Joins in query not pulling through correctly

Good evening. Could someone please help me with the following. I am trying to join two tables.The first id wbr_global.gl_ap_details. This stores historic GL information. The second table sandbox.utr_fixed_mapping is where account mapping is stored. For example, ana ccount number 60820 is mapped as Employee relation. The first table needs the mapping from the second table linked on the account number. The output I am getting is not right and way to bug. Any help would be appreciated!

Output

select sandbox.utr_fixed_mapping_na.new_mapping_1,sum(wbr_global.gl_ap_details.amount)
from wbr_global.gl_ap_details 
LEFT JOIN sandbox.utr_fixed_mapping_na ON wbr_global.gl_ap_details.account_number = sandbox.utr_fixed_mapping_na.account_number   
Where gl_ap_details.cost_center = '1172'
and gl_ap_details.period_name = 'JUL-21'
and gl_ap_details.ledger_name = 'Amazon.com, Inc.'
Group by 1;

enter image description here I tried adding the cast function but after 5000 seconds of the query running I canceled it. enter image description here

Upvotes: 0

Views: 67

Answers (2)

DRapp
DRapp

Reputation: 48139

The query itself appears ok, but minor changes. Learn to use table "aliases". This way you don't have to keep typing long database.table.column all over. Additionally, SQL is easier to read doing it that way anyhow. Notice the aliases "gl" and "fm" after the tables are declared, then these aliases are used to represent the columns.. Easier to read, would you agree.

Added GL Account number as described below the query.

select 
        gl.account_number,
        fm.new_mapping_1,
        sum(gl.amount)
    from 
        wbr_global.gl_ap_details gl
            LEFT JOIN sandbox.utr_fixed_mapping_na fm
                ON gl.account_number = fm.account_number   
    Where 
            gl.cost_center = '1172'
        and gl.period_name = 'JUL-21'
        and gl.ledger_name = 'Amazon.com, Inc.'
    Group by 
        gl.account_number,
        fm.new_mapping_1

Now, as for your query and getting null. This just means that there are records within the gl_ap_details table with an account number that is not found in the utr_fixed_mapping_na table. So, to see WHAT gl account number does NOT exist, I have added it to the query. Its possible there are MULTIPLE records in the gl_ap_details that are not found in the mapping table. So, you may get

GLAccount    Description    SumOfAmount
glaccount1   null           $someAmount
glaccount37  null           $someAmount
glaccount49  null           $someAmount
glaccount72  Depreciation   $someAmount
glaccount87  Real Estate    $someAmount
glaccount92  Building       $someAmount
glaccount99  Salaries       $someAmount

I obviously made-up glaccounts just to show the purpose. You may have multiple where the null's total amount is actually masking how many different gl account numbers were NOT found.

Once you find which are missing, you can check / confirm they SHOULD be in the mapping table.

FEEDBACK.

Since you do realize the missing numbers, lets consider a Cartesian result. If there are multiple entries in the mapping table for the same G/L account number, you will get a Cartesian result thus bloating your numbers. To clarify, lets say your mapping table has

Mapping file.
GL    Descr1     NewMapping
1     test       Salaries
1     testView   Buildings
1     Another    Depreciation

And your GL_AP_Details has
GL   Amount
1    $100

Your total for the query would result in $300 because the query is trying to join the AP Details GL #1 to EACH of the entries in the mapping file thus bloating the amount. You could also add a COUNT(*) as NumberOfEntries to the query to see how many transactions it THINKS it is processing. Is there some "unique ID" in the GL_AP_Details table? If so, then you could also do a count of DISTINCT ID values. If they are different (distinct is lower than # of entries), I think THAT is your culprit.

select 
        fm.new_mapping_1,
        sum(gl.amount),
        count(*) as NumberOfEntries,
        count( distinct gl.UniqueIdField ) as DistinctTransactions
    from 
        wbr_global.gl_ap_details gl
            LEFT JOIN sandbox.utr_fixed_mapping_na fm
                ON gl.account_number = fm.account_number   
    Where 
            gl.cost_center = '1172'
        and gl.period_name = 'JUL-21'
        and gl.ledger_name = 'Amazon.com, Inc.'
    Group by 
        fm.new_mapping_1

Might you also need to limit the mapping table for a specific prophecy or mec view?

Upvotes: 1

Chris Schaller
Chris Schaller

Reputation: 16554

If you "think" that the result of an aggregate is wrong, then the easiest way to verify this is to select the individual rows that correlate to 1 record in the aggregate output and inspect the records, looking for duplications.

For instance, pick 'Building Management':

SELECT fixed.new_mapping_1,details.amount,*
FROM wbr_global.gl_ap_details details
LEFT JOIN sandbox.utr_fixed_mapping_na fixed ON details.account_number = fixed.account_number   
WHERE details.cost_center = '1172'
  AND details.period_name = 'JUL-21'
  AND details.ledger_name = 'Amazon.com, Inc.'
  AND details.account_number = 'Building Management'

Notice that we tack on a ,* to the end of the projection, this will show you everything that the query has access to, you should look for repeating sections of data that you were not expecting, then depending on which table they originate from your might add additional criteria to the JOIN, or to the WHERE or you might need to group by additional columns.

This type of issue is really hard to comment on in a forum like this because it is highly specific to your schema, and the data contained within it, making solutions highly subjective to criteria you are not likely to publish online.

Generally if you think a calculation is wrong, you need to manually compute it to verify, this above advice helps you to inspect the data your query is using, you should either construct your own query or use other tools to build the data set that helps you to manually compute the correct values, then work them back into or replace your original query.

The speed issues are out of scope here, we can comment on the poor schema design but I suspect you don't have a choice. In the utr_fixed_mapping_na table you should make the account_number have the same column type as the source data, or add a new column that has the data in the original type, then you can setup indexes on the columns to improve the speed of the join.

Upvotes: 0

Related Questions