Celador
Celador

Reputation: 183

Show Disk Space Used for all Tables - Azure SQL Data Warehouse

We are transitioning across to Azure SQL Data Warehouse - and an issue that's been highlighted is the need to change some smaller tables from Round-Robin / Hash-distributed to Replicated to improve performance.

MS Design Guidance (See Here) suggests one criteria for this decision is Tables that take up less than 2Gb Disk Space. i.e. these tables could be made into Replicated tables. They suggest using DBCC PDW_SHOWSPACEUSED to determine this.

I can run this against the whole DB, or one specific table, but i'd really like to get a list of all tables and the space used (preferably in MB) - but it's beyond me.

A lot of google searching either gives me the two basic commands I already know (against the whole DB / against 1 table) or give me SQL Server queries that don't run against Azure DW - e.g. using sys.allocation_units - which is not supported in Azure DW.

Upvotes: 3

Views: 1807

Answers (1)

Collin Barrett
Collin Barrett

Reputation: 2569

I was just directed to this Microsoft article that provides a pretty solid solution to this problem.

In particular, create a view:

CREATE VIEW dbo.vTableSizes
AS
    WITH base
    AS (
           SELECT
                   GETDATE()                                                                                     AS [execution_time],
                   DB_NAME()                                                                                     AS [database_name],
                   s.name                                                                                        AS [schema_name],
                   t.name                                                                                        AS [table_name],
                   QUOTENAME(s.name) + '.' + QUOTENAME(t.name)                                                   AS [two_part_name],
                   nt.[name]                                                                                     AS [node_table_name],
                   ROW_NUMBER() OVER (PARTITION BY
                                          nt.[name]
                                      ORDER BY
                                          (
                                              SELECT
                                                  NULL
                                          )
                                     )                                                                           AS [node_table_name_seq],
                   tp.[distribution_policy_desc]                                                                 AS [distribution_policy_name],
                   c.[name]                                                                                      AS [distribution_column],
                   nt.[distribution_id]                                                                          AS [distribution_id],
                   i.[type]                                                                                      AS [index_type],
                   i.[type_desc]                                                                                 AS [index_type_desc],
                   nt.[pdw_node_id]                                                                              AS [pdw_node_id],
                   pn.[type]                                                                                     AS [pdw_node_type],
                   pn.[name]                                                                                     AS [pdw_node_name],
                   di.name                                                                                       AS [dist_name],
                   di.position                                                                                   AS [dist_position],
                   nps.[partition_number]                                                                        AS [partition_nmbr],
                   nps.[reserved_page_count]                                                                     AS [reserved_space_page_count],
                   nps.[reserved_page_count] - nps.[used_page_count]                                             AS [unused_space_page_count],
                   nps.[in_row_data_page_count] + nps.[row_overflow_used_page_count] + nps.[lob_used_page_count] AS [data_space_page_count],
                   nps.[reserved_page_count] - (nps.[reserved_page_count] - nps.[used_page_count])
                   - ([in_row_data_page_count] + [row_overflow_used_page_count] + [lob_used_page_count])         AS [index_space_page_count],
                   nps.[row_count]                                                                               AS [row_count]
           FROM
                   sys.schemas                           s
               INNER JOIN
                   sys.tables                            t
                       ON s.[schema_id] = t.[schema_id]
               INNER JOIN
                   sys.indexes                           i
                       ON t.[object_id] = i.[object_id]
                          AND i.[index_id] <= 1
               INNER JOIN
                   sys.pdw_table_distribution_properties tp
                       ON t.[object_id] = tp.[object_id]
               INNER JOIN
                   sys.pdw_table_mappings                tm
                       ON t.[object_id] = tm.[object_id]
               INNER JOIN
                   sys.pdw_nodes_tables                  nt
                       ON tm.[physical_name] = nt.[name]
               INNER JOIN
                   sys.dm_pdw_nodes                      pn
                       ON nt.[pdw_node_id] = pn.[pdw_node_id]
               INNER JOIN
                   sys.pdw_distributions                 di
                       ON nt.[distribution_id] = di.[distribution_id]
               INNER JOIN
                   sys.dm_pdw_nodes_db_partition_stats   nps
                       ON nt.[object_id] = nps.[object_id]
                          AND nt.[pdw_node_id] = nps.[pdw_node_id]
                          AND nt.[distribution_id] = nps.[distribution_id]
                          AND i.[index_id] = nps.[index_id]
               LEFT OUTER JOIN
                   (
                       SELECT
                           *
                       FROM
                           sys.pdw_column_distribution_properties
                       WHERE
                           distribution_ordinal = 1
                   )                                     cdp
                       ON t.[object_id] = cdp.[object_id]
               LEFT OUTER JOIN
                   sys.columns                           c
                       ON cdp.[object_id] = c.[object_id]
                          AND cdp.[column_id] = c.[column_id]
           WHERE
                   pn.[type] = 'COMPUTE'),
         size
    AS (   SELECT
               [execution_time],
               [database_name],
               [schema_name],
               [table_name],
               [two_part_name],
               [node_table_name],
               [node_table_name_seq],
               [distribution_policy_name],
               [distribution_column],
               [distribution_id],
               [index_type],
               [index_type_desc],
               [pdw_node_id],
               [pdw_node_type],
               [pdw_node_name],
               [dist_name],
               [dist_position],
               [partition_nmbr],
               [reserved_space_page_count],
               [unused_space_page_count],
               [data_space_page_count],
               [index_space_page_count],
               [row_count],
               ([reserved_space_page_count] * 8.0)              AS [reserved_space_KB],
               ([reserved_space_page_count] * 8.0) / 1000       AS [reserved_space_MB],
               ([reserved_space_page_count] * 8.0) / 1000000    AS [reserved_space_GB],
               ([reserved_space_page_count] * 8.0) / 1000000000 AS [reserved_space_TB],
               ([unused_space_page_count] * 8.0)                AS [unused_space_KB],
               ([unused_space_page_count] * 8.0) / 1000         AS [unused_space_MB],
               ([unused_space_page_count] * 8.0) / 1000000      AS [unused_space_GB],
               ([unused_space_page_count] * 8.0) / 1000000000   AS [unused_space_TB],
               ([data_space_page_count] * 8.0)                  AS [data_space_KB],
               ([data_space_page_count] * 8.0) / 1000           AS [data_space_MB],
               ([data_space_page_count] * 8.0) / 1000000        AS [data_space_GB],
               ([data_space_page_count] * 8.0) / 1000000000     AS [data_space_TB],
               ([index_space_page_count] * 8.0)                 AS [index_space_KB],
               ([index_space_page_count] * 8.0) / 1000          AS [index_space_MB],
               ([index_space_page_count] * 8.0) / 1000000       AS [index_space_GB],
               ([index_space_page_count] * 8.0) / 1000000000    AS [index_space_TB]
           FROM
               base)
    SELECT
        *
    FROM
        size;

Then, this "Table space summary" query provides a list of tables and how much space each is currently using (among other information):

SELECT
    database_name,
    schema_name,
    table_name,
    distribution_policy_name,
    distribution_column,
    index_type_desc,
    COUNT(DISTINCT partition_nmbr) AS nbr_partitions,
    SUM(row_count)                 AS table_row_count,
    SUM(reserved_space_GB)         AS table_reserved_space_GB,
    SUM(data_space_GB)             AS table_data_space_GB,
    SUM(index_space_GB)            AS table_index_space_GB,
    SUM(unused_space_GB)           AS table_unused_space_GB
FROM
    dbo.vTableSizes
GROUP BY
    database_name,
    schema_name,
    table_name,
    distribution_policy_name,
    distribution_column,
    index_type_desc
ORDER BY
    table_reserved_space_GB DESC;

Here's a sample execution against one of my databases showing the table names sorted by space used.

Sample query execution showing used space by table

Upvotes: 0

Related Questions