Naive_Natural2511
Naive_Natural2511

Reputation: 717

Group by on two tables and perform Left join on outcome VBA ADODB SQL Query

I want to perform Group BY on two csv files and then perform Left Join on the outcome of both tables through VBA ADO Query in Excel. My end motive is to print the recordset.

Here is what I have done so far.

     SELECT * FROM (
                   SELECT   f1.[c3],
                            f1.[c4],
                            f1.[c5],
                            f1.[c6],
                            Sum(f1.[c8]) AS SUMDATA
                   FROM     test1.csv F1 
                   GROUP BY f1.[c3],
                            f1.[c4],
                            f1.[c5],
                            f1.[c6]) AS f3
LEFT JOIN  SELECT * FROM   (
                SELECT   f2.[c3],
                         f2.[c4],
                         f2.[c5],
                         f2.[c6],
                         Sum(f2.[c8]) AS SUMDATA
                FROM     test2.csv f2
                GROUP BY f2.[c3],
                         f2.[c4],
                         f2.[c5],
                         f2.[c6]) AS f4
on  f3.[c3]+ f3.[c4]+ f3.[c5]+ f3.[c6] = f4.[c3]+ f4.[c4]+ f4.[c5]+ f4.[c6]
WHERE  f3.[SUMDATA] <> f4.[SUMDATA] 

This shows a syntax error. How to implement this? Any help is much appreciated. TIA.

An update -

I manage to implement 1 LEFT JOIN and 2 GROUP BYs between 2 tables. As per the request, here are few details regarding my dataset. It consists of fields - c1, c2 .... c8. c8 is my target field.

My expected output - I do not need c7, c1 and c2 in output sheet. The info of c7, c1 and c2 is irrelevant. I need to do 5 things with my data.

  1. Group Sum the c8 field based on c3, c4, c5 and c6 fields in CSV file 1 and store target field as SUMDATA

  2. Group Sum the c8 field based on c3, c4, c5 and c6 fields in CSV file 2 and store target field as SUMDATA

  3. Find the mismatched SUMDATA field entries between CSV1 and CSV2 (I used LEFT JOIN for this on concatenated c3, c4, c5, c6 fields)

  4. Find the entries which are present in CSV1 but not in CSV2

  5. Find the entries which are present in CSV2 but not in CSV1

Currently, I manage to write the code that works till step 3. I need to store the grouped tables temporarily which I got from Step 1 and 2, to perform the steps 4 and 5 which can be done through 2 more UNION, LEFT JOINs, and WHERE combination. This is where I am stuck right now.

Upvotes: 1

Views: 879

Answers (1)

Error_2646
Error_2646

Reputation: 3801

This isn't really an answer but the formatting is important for readability.

It looks like there's a lot wrong with your SQL.

The syntax should look like this (assuming querying a csv works like you are thinking):

SELECT SUB1.Field1,
       SUB1.AggField AS Agg1,
       SUB2.AggField AS Agg2
  FROM (SELECT Field1,
               MAX(Field2) Agg_Field
          FROM Table1 T1
         GROUP
            BY Field1
       ) SUB1
  LEFT
  JOIN (SELECT Field1,
               MAX(Field2) Agg_Field
          FROM Table1 T2
         GROUP
            BY Field1
       ) SUB2
    ON SUB1.Field1 = SUB2.Field1
 WHERE SUB1.AggField <> SUB2.AggField;

Also, you are missing a comma here: F1.[c5] F1.[c6] in the first chunk.

Try fixing the SQL syntax like this and see where that gets you.

Upvotes: 3

Related Questions