Reputation: 717
I want to perform Group BY
on two csv files and then perform Left Join on the outcome of both tables through VBA ADO Query in Excel. My end motive is to print the recordset.
Here is what I have done so far.
SELECT * FROM (
SELECT f1.[c3],
f1.[c4],
f1.[c5],
f1.[c6],
Sum(f1.[c8]) AS SUMDATA
FROM test1.csv F1
GROUP BY f1.[c3],
f1.[c4],
f1.[c5],
f1.[c6]) AS f3
LEFT JOIN SELECT * FROM (
SELECT f2.[c3],
f2.[c4],
f2.[c5],
f2.[c6],
Sum(f2.[c8]) AS SUMDATA
FROM test2.csv f2
GROUP BY f2.[c3],
f2.[c4],
f2.[c5],
f2.[c6]) AS f4
on f3.[c3]+ f3.[c4]+ f3.[c5]+ f3.[c6] = f4.[c3]+ f4.[c4]+ f4.[c5]+ f4.[c6]
WHERE f3.[SUMDATA] <> f4.[SUMDATA]
This shows a syntax error. How to implement this? Any help is much appreciated. TIA.
An update -
I manage to implement 1 LEFT JOIN
and 2 GROUP BYs
between 2 tables. As per the request, here are few details regarding my dataset.
It consists of fields - c1, c2 .... c8.
c8 is my target field.
My expected output - I do not need c7, c1 and c2 in output sheet. The info of c7, c1 and c2 is irrelevant. I need to do 5 things with my data.
Group Sum the c8 field based on c3, c4, c5 and c6 fields in CSV file 1 and store target field as SUMDATA
Group Sum the c8 field based on c3, c4, c5 and c6 fields in CSV file 2 and store target field as SUMDATA
Find the mismatched SUMDATA field entries between CSV1 and CSV2 (I used LEFT JOIN
for this on concatenated c3, c4, c5, c6 fields)
Find the entries which are present in CSV1 but not in CSV2
Find the entries which are present in CSV2 but not in CSV1
Currently, I manage to write the code that works till step 3. I need to store the grouped tables temporarily which I got from Step 1 and 2, to perform the steps 4 and 5 which can be done through 2 more UNION
, LEFT JOINs
, and WHERE
combination. This is where I am stuck right now.
Upvotes: 1
Views: 879
Reputation: 3801
This isn't really an answer but the formatting is important for readability.
It looks like there's a lot wrong with your SQL.
The syntax should look like this (assuming querying a csv works like you are thinking):
SELECT SUB1.Field1,
SUB1.AggField AS Agg1,
SUB2.AggField AS Agg2
FROM (SELECT Field1,
MAX(Field2) Agg_Field
FROM Table1 T1
GROUP
BY Field1
) SUB1
LEFT
JOIN (SELECT Field1,
MAX(Field2) Agg_Field
FROM Table1 T2
GROUP
BY Field1
) SUB2
ON SUB1.Field1 = SUB2.Field1
WHERE SUB1.AggField <> SUB2.AggField;
Also, you are missing a comma here: F1.[c5] F1.[c6] in the first chunk.
Try fixing the SQL syntax like this and see where that gets you.
Upvotes: 3