faizan
faizan

Reputation: 1

How to convert T-SQL into SQL BigQuery

I want to convert the Azure Synapse script into GCP BigQuery. The Azure Datalake script is written in T-SQL, I want to convert it to BigQuery script. Please guide me - is there any procedure to convert a T-SQL query to BigQuery, like similar to SQL ? Thank you

Upvotes: 0

Views: 1138

Answers (1)

Paul Maxwell
Paul Maxwell

Reputation: 35603

A tool is the BigQuery Migration Service . It includes tools that help with migration, including assessment and planning, SQL translation, data transfer, and data validation. The batch SQL translator and the interactive SQL translator can be used to prepare SQL queries and scripts to work in BigQuery. The batch and interactive SQL translators support translation from a wide range of SQL dialects. Suggestions:

  • Learn the differences between TSQL and BigQuery SQL: BigQuery uses a different SQL dialect than TSQL, so it's important to understand the differences between the two.
  • Utilize the BigQuery Migration Service.
  • Optimize queries for BigQuery: This can include using partitioning and clustering to reduce query costs and perhaps using denormalization to reduce the number of joins.
  • Use BigQuery's native data types: BigQuery has its own set of native data types, so use them - when relevant.
  • Test thoroughly: Before migrating TSQL code to BigQuery, it's important to test thoroughly to ensure that the code works as expected.

Here's a phony TSQL query as an example for conversion, note the various difference in functions:

SELECT o.OrderID, c.CustomerName, CONVERT(varchar, o.OrderDate, 103) AS OrderDate, p.ProductName
FROM Orders o
OUTER APPLY (
  SELECT TOP 1 c.CustomerName
  FROM Customers c
  WHERE c.CustomerID = o.CustomerID
) c
OUTER APPLY (
  SELECT p.ProductName
  FROM OrderDetails od
  INNER JOIN Products p ON od.ProductID = p.ProductID
  WHERE od.OrderID = o.OrderID
  ORDER BY od.Quantity DESC
  OFFSET 0 ROWS FETCH NEXT 1 ROWS ONLY
) p
WHERE JSON_VALUE(o.OrderDetails, '$.isShipped') = 'true'

This query retrieves data from the Orders table, joining it with the Customers and OrderDetails tables using OUTER APPLY. It also uses the JSON_VALUE function to extract a value from a JSON column in the Orders table, and CONVERT(varchar, date_column, 103) to format the OrderDate column as dd/mm/yyyy.

The first OUTER APPLY retrieves the CustomerName from the Customers table for each order. The second OUTER APPLY retrieves the ProductName with the highest quantity for each order from the OrderDetails and Products tables.

The WHERE clause filters the results to only include orders that have been shipped (isShipped is true in the JSON column).

Here's an equivalent query for BigQuery based on the previous T-SQL query:

SELECT o.OrderID, c.CustomerName, FORMAT_DATE('%d/%m/%Y', o.OrderDate) AS OrderDate, p.ProductName
FROM Orders o
LEFT JOIN (
  SELECT CustomerID, CustomerName
  FROM Customers
) c ON c.CustomerID = o.CustomerID
LEFT JOIN (
  SELECT OrderID, ProductName
  FROM (
    SELECT od.OrderID, p.ProductName, ROW_NUMBER() OVER (PARTITION BY od.OrderID ORDER BY od.Quantity DESC) AS rn
    FROM OrderDetails od
    JOIN Products p ON od.ProductID = p.ProductID
  ) t
  WHERE t.rn = 1
) p ON p.OrderID = o.OrderID
WHERE JSON_EXTRACT(o.OrderDetails, '$.isShipped') = 'true'

This query uses LEFT JOIN instead of OUTER APPLY to join the Customers and OrderDetails tables with the Orders table. It also uses FORMAT_DATE('%d/%m/%Y', date_column) instead of CONVERT(varchar, date_column, 103) to format the OrderDate column as dd/mm/yyyy.

The first LEFT JOIN retrieves the CustomerName from the Customers table for each order. The second LEFT JOIN retrieves the ProductName with the highest quantity for each order from the OrderDetails and Products tables using a subquery with ROW_NUMBER().

The WHERE clause filters the results to only include orders that have been shipped (isShipped is true in the JSON column).

Upvotes: 0

Related Questions