Reputation: 3093
Query -
sel TableName, DatabaseName, sum(CurrentPerm/(1024*1024*1024)) as Size_in_GB
from dbc.tablesize
group by 1,2
order by GB desc
result -
+-----------+--------+------------+
| TableName | DBName | Size_in_GB |
+-----------+--------+------------+
| WRP | A | 28,350.01 |
| CPC | B | 19,999.37 |
| SDF | C | 13,263.67 |
| DB1400 | D | 13,200.26 |
+-----------+--------+------------+
From above simple query I can see that table WRP of database A is near 28350 GB
Now I am trying to join another table dbc.indices
to use the column IndexType
for filtering but now the Size_in_GB changes for all tables.
sel a.TableName,a.DatabaseName, sum(CurrentPerm/(1024*1024*1024)) as Size_in_GB from dbc.tablesize a
join dbc.indices b on a.TableName = b.TableName and a.DatabaseName=b.DatabaseName
--where b.indexType='P'
group by 1,2
order by Size_in_GB desc
Result is this -
+-----------+--------+------------+
| TableName | DBName | Size_in_GB |
+-----------+--------+------------+
| WRP | A | 56,700.02 |
| CPC | B | 39,998.74 |
| DB1400 | D | 39,600.78 |
+-----------+--------+------------+
Now the same table is twice the size i.e. WRP is 56700 GB. (Similar for other tables)
I am not sure what's wrong with the logic I'm using for join.
P.S - My aim is to find all the tables which are greater than 100GB in Size and have indexType as 'P'
EDIT - Sharing relevant columns from DBC.INDICES
table
+--------------+------------+-------------+-----------+------------+---------------+------------+----------------+
| DatabaseName | TableName | IndexNumber | IndexType | UniqueFlag | IndexName | ColumnName | ColumnPosition |
+--------------+------------+-------------+-----------+------------+---------------+------------+----------------+
| Some DB | Some Table | 1 | P | N | IndexNamehere | ColumnA | 1 |
+--------------+------------+-------------+-----------+------------+---------------+------------+----------------+
Upvotes: 0
Views: 106
Reputation: 60482
dbc.IndidesV
(never use the old deprecated non-V-views) has one row per column per index.
You can simply add a condition to restrict it to a single row: where IndexType = 'P' and ColumnPosition = 1
And it's more efficient to do an early aggregation, i.e. aggregate before the join:
select t.*
from
(
select TableName, DatabaseName,
sum(CurrentPerm/(1024*1024*1024)) as Size_in_GB
from dbc.TableSizeV
group by 1,2
having Size_in_GB > 100
) as dt
join dbc.IndicesV b
on a.TableName = b.TableName
and a.DatabaseName=b.DatabaseName
where IndexType = 'P'
and ColumnPosition = 1
order by Size_in_GB desc;
But why do you want to filter for that IndexType=P
, don't you care about other objects > 100GB (NoPI/Columnar tables, Join Indices)? Btw, this doesn't return all tables with a PI, only IndexNumber=1
does.
Depending on your needs you might better join to dbc.TablesV
instead.
Upvotes: 0
Reputation: 95062
P.S - My aim is to find all the tables which are greater than 100GB in Size and have indexType as 'P'
If you only want to find certain tables for which an index exists, you shouln't join at all. Use EXISTS
instead. This puts your condition in the WHERE
or HAVING
clause where it belongs and you don't have issues with your condition duplicating your records (in your case: it doesn't matter anymore when a table has more than one matching index).
select tablename, databasename, sum(currentperm/(1024*1024*1024)) as size_in_gb
from dbc.tablesize ts
group by tablename, databasename
having sum(currentperm/(1024*1024*1024)) > 100
and exists
(
select *
from dbc.indices i
where i.tablename = ts.tablename and i.databasename = ts.databasename
and i.indexType = 'P'
)
order by Size_in_GB desc;
Upvotes: 0
Reputation: 93754
Probably your key is duplicated in dbc.indices
table. For a single TableName
,dbc.indices
table has more then one entry so when you join dbc.tablesize
table records got duplicated so the SUM
is applied on duplicate records so the mistake in calculation.
Try this way
SELECT a.TableName,
a.DatabaseName,
Sum(CurrentPerm / ( 1024 * 1024 * 1024 )) AS Size_in_GB
FROM dbc.tablesize a
JOIN (SELECT DISTINCT b.TableName,
b.DatabaseName
FROM dbc.indices b
--where b.indexType='P'
) b
ON a.TableName = b.TableName
AND a.DatabaseName = b.DatabaseName
GROUP BY a.TableName,
a.DatabaseName
ORDER BY Size_in_GB DESC
Upvotes: 1
Reputation: 1270573
What is confusing?
You clearly have tables that have multiple indexes. Each index is going to result in the table appearing more than one time for the aggregation.
For what you want:
My aim is to find all the tables which are greater than 100GB in Size and have indexType as 'P'
I would suggest moving the index comparison to the where
clause:
select t.TableName, t.DatabaseName,
sum(tCurrentPerm/(1024*1024*1024)) as Size_in_GB
from dbc.tablesize t
where exists (select 1
from dbc.indices i
where t.TableName = i.TableName and t.DatabaseName = i.DatabaseName and
i.indexType = 'P'
)
group by 1,2
order by Size_in_GB desc
You can add having Size_in_GB > 100
before the order by
if you want to add that filter as well.
Upvotes: 2