Stefan Gustafsson
Stefan Gustafsson

Reputation: 26

Snowflake: How can I find out which internal table stages consumes most storage?

In my snowflake account I can see that there is a lot of storage used by stages. I can see this for example using the following query:

select *
from table(information_schema.stage_storage_usage_history(dateadd('days',-10,current_date()),current_date()));

There are no named stages in the databases. All used storage must be in internal stages. How can I find out which internal stages consumes most storage?

If I know the name of the table I can list all files in the table stage using something like this:

list @SCHEMANAME.%TABLENAME;

My problem is that there are hundreds of tables in the databases and I have no idea which tables to query.

Upvotes: 0

Views: 1581

Answers (4)

Levin
Levin

Reputation: 2015

There is a post from Nov 2023 in the community site that I think answers your question from 2020 as far as named internal stages go. I don't think it will get table stages but might be an easy modification.

I added a single filter to that query. I added and stage_catalog = 'FOO_DB'. Without that it iterates thru every file in every internal stage in the snowflake account.

I don't think that Snowflake provides easier approach.

/*
from:
https://community.snowflake.com/s/article/How-to-list-how-much-storage-is-consumed-by-the-internal-stages
*/
declare
  res RESULTSET;
  rpt VARIANT;
  name VARCHAR;
  err varchar := '';
  total_size NUMBER;
  res_query VARCHAR DEFAULT 'select KEY STAGE_NAME,VALUE TOTAL_BYTES from table(flatten(parse_json(?))) order by 2 DESC';
  c1 cursor for select concat_ws( '.', stage_catalog, stage_schema, stage_name ) name 
                 from snowflake.account_usage.stages 
                 where stage_type = 'Internal Named' and deleted is NULL
                 and stage_catalog = 'FOO_DB';
begin
  rpt := object_construct();
  for record in c1 do
      begin
          name := record.name;
          res := (execute immediate 'ls @' || name);
          let c2 cursor for res;
          total_size := 0;
          for inner_record in c2 do
              begin
                  total_size := total_size + inner_record."size";
              EXCEPTION 
                  WHEN OTHER THEN
                      err := concat_ws('\n', err , SQLERRM );
              end;
           end for;
           rpt := object_insert( rpt, name, total_size );
      EXCEPTION 
          WHEN OTHER THEN
              err := concat_ws('\n', err , SQLERRM );
      end;
  end for;
  res := (execute immediate :res_query using (rpt));
  return table(res);
  --return err;
end;

Upvotes: 0

Palash Chatterjee
Palash Chatterjee

Reputation: 196

If I understood correctly,you want to do something with stages to reduce the overall billing or storage size

Snowflake Stages

  1. 'Internal Named' and 'External' stages : These are the only stages which can be altered or dropped and controlled by user
  2. User Stage and Table stages are one which can not be altered or dropped, managed by snowflake

So even if you could identify those Table and User specific stages , you can not drop those.

Snowflake Storage consumption includes below three components for billing

   1. Databases size
   2. Stages size
   3. Fail Safe size

The size of the storage occupied can be visible (Only when you have Accountadmin role or MONITOR privs) under below location webUI tab

Account Tab ---> Usage --> Average Storage Used  

Note : on the account tab, No DB object wise storage billing details available

So how you can see the storage consumption of the associated tables (including their fail safe and time travel bit) and stage details

 select * from <DB Name>."INFORMATION_SCHEMA"."TABLE_STORAGE_METRICS"
 select * from <DB Name>."INFORMATION_SCHEMA"."STAGES"

Hope clarification helped

Thanks

Palash Chatterjee

Upvotes: 0

Iqra Ijaz
Iqra Ijaz

Reputation: 281

You can use the view STAGES in Information schema or account usage to get the stages. Do note that Account usage has higher retention period than information schema and data retrieval is faster. You can read more here

Upvotes: 0

Mike Walton
Mike Walton

Reputation: 7369

There is an ACCOUNT_USAGE view called STAGE_STORAGE_USAGE_HISTORY in the Snowflake database/share that will give you everything, including internal stages. I would use this over the information_schema view, since that is limited to what your role currently has access to.

https://docs.snowflake.com/en/sql-reference/account-usage/stage_storage_usage_history.html

Upvotes: 0

Related Questions