Reputation: 1
I want to convert the Azure Synapse script into GCP BigQuery. The Azure Datalake script is written in T-SQL, I want to convert it to BigQuery script. Please guide me - is there any procedure to convert a T-SQL query to BigQuery, like similar to SQL ? Thank you
Upvotes: 0
Views: 1138
Reputation: 35603
A tool is the BigQuery Migration Service . It includes tools that help with migration, including assessment and planning, SQL translation, data transfer, and data validation. The batch SQL translator and the interactive SQL translator can be used to prepare SQL queries and scripts to work in BigQuery. The batch and interactive SQL translators support translation from a wide range of SQL dialects. Suggestions:
Here's a phony TSQL query as an example for conversion, note the various difference in functions:
SELECT o.OrderID, c.CustomerName, CONVERT(varchar, o.OrderDate, 103) AS OrderDate, p.ProductName
FROM Orders o
OUTER APPLY (
SELECT TOP 1 c.CustomerName
FROM Customers c
WHERE c.CustomerID = o.CustomerID
) c
OUTER APPLY (
SELECT p.ProductName
FROM OrderDetails od
INNER JOIN Products p ON od.ProductID = p.ProductID
WHERE od.OrderID = o.OrderID
ORDER BY od.Quantity DESC
OFFSET 0 ROWS FETCH NEXT 1 ROWS ONLY
) p
WHERE JSON_VALUE(o.OrderDetails, '$.isShipped') = 'true'
This query retrieves data from the Orders table, joining it with the Customers and OrderDetails tables using OUTER APPLY. It also uses the JSON_VALUE function to extract a value from a JSON column in the Orders table, and CONVERT(varchar, date_column, 103) to format the OrderDate column as dd/mm/yyyy.
The first OUTER APPLY retrieves the CustomerName from the Customers table for each order. The second OUTER APPLY retrieves the ProductName with the highest quantity for each order from the OrderDetails and Products tables.
The WHERE clause filters the results to only include orders that have been shipped (isShipped is true in the JSON column).
Here's an equivalent query for BigQuery based on the previous T-SQL query:
SELECT o.OrderID, c.CustomerName, FORMAT_DATE('%d/%m/%Y', o.OrderDate) AS OrderDate, p.ProductName
FROM Orders o
LEFT JOIN (
SELECT CustomerID, CustomerName
FROM Customers
) c ON c.CustomerID = o.CustomerID
LEFT JOIN (
SELECT OrderID, ProductName
FROM (
SELECT od.OrderID, p.ProductName, ROW_NUMBER() OVER (PARTITION BY od.OrderID ORDER BY od.Quantity DESC) AS rn
FROM OrderDetails od
JOIN Products p ON od.ProductID = p.ProductID
) t
WHERE t.rn = 1
) p ON p.OrderID = o.OrderID
WHERE JSON_EXTRACT(o.OrderDetails, '$.isShipped') = 'true'
This query uses LEFT JOIN instead of OUTER APPLY to join the Customers and OrderDetails tables with the Orders table. It also uses FORMAT_DATE('%d/%m/%Y', date_column) instead of CONVERT(varchar, date_column, 103) to format the OrderDate column as dd/mm/yyyy.
The first LEFT JOIN retrieves the CustomerName from the Customers table for each order. The second LEFT JOIN retrieves the ProductName with the highest quantity for each order from the OrderDetails and Products tables using a subquery with ROW_NUMBER().
The WHERE clause filters the results to only include orders that have been shipped (isShipped is true in the JSON column).
Upvotes: 0