mjoy
mjoy

Reputation: 690

Get count of rows from multiple tables Redshift SQL?

I have a redshift database that is being updated with new tables so I can't just manually list the tables I want. I want to get a count of the rows of all the tables from my query. So far I have:

select 'SELECT ''' || table_name || ''' as table_name, count(*) As con ' ||
       'FROM ' || table_name || 
        CASE WHEN lead(table_name) OVER (order by table_name ) IS NOT NULL 
        THEN ' UNION ALL ' END
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME LIKE '%results%'

but when I do this I get the error:

Specified types or functions (one per INFO message) not supported on Redshift tables.

I've searched a lot but I can't seem to find a solution for my problem. Any help would be greatly appreciated. Thanks!

EDIT: I've changed my approach to this and decided to use a for loop in R to get the row counts of each but I'm running into the issue that 'row_counts' is only saving one number, not the count of each row like I want. Here is the code:

schema <- "x"
table_prefix <- "results"
geos <- ad_districts %>% filter(geo != "geo")
row_count <- list()
i = 1

for (geo in geos){
  table_name <- paste0(schema, ".", table_prefix, geo)
  row_count[[i]] <- dbGetQuery(con, 
                             paste("SELECT COUNT(*) FROM", table_name))
  i = i + 1 
}

Upvotes: 0

Views: 7909

Answers (4)

Sathish Kumar
Sathish Kumar

Reputation: 1

The error occurs because Amazon Redshift does not support certain PostgreSQL functions or features, such as RANK(), in combination with certain query types when processing metadata. Additionally, Redshift does not support the direct use of certain features for dynamic SQL generation as written.

Upvotes: 0

Nitin Jain
Nitin Jain

Reputation: 1

SELECT ' Select count(*) , '''+ tablename + '''  from   '+'"' + tablename +'"' +'  Union ALL  '
FROM pg_table_def
GROUP BY tablename

Above query eliminates any table name with space. Remove UNION ALL at the end of the query and query will be ready to be executed.

Upvotes: 0

Victor Di Leo
Victor Di Leo

Reputation: 82

[EDIT] - I think this is the root cause - some sql functions are only supported on the leader node. Try connecting to that node and re-run your SQL. https://docs.aws.amazon.com/redshift/latest/dg/c_sql-functions-leader-node.html

Hope this helps.

select 'select count(*) as "' || table_schema || '.' || table_name || '" from ' || table_schema || '.' || table_name || ' ;' as sql_text
from information_schema.tables
;

[EDIT - refined this a bit to generate a series of statements that can be run at once]

select rownum, case when rownum > 1 then sql_text else replace(sql_text, 'union all', '') end as sql_text
from
(
select rank() over (order by sql_text DESC) as rownum,
        sql_text
from
(
select 'select ''' || table_schema || ' ' || table_name || ''' , count(*) as "' || table_schema || '.' || table_name || '" from ' || table_schema || '.' || table_name || ' union all ' as sql_text
from information_schema.tables
where table_schema = 'public'
order by table_schema, table_name
)X
)Y 
order by rownum desc ;

Upvotes: 0

demircioglu
demircioglu

Reputation: 3465

Your query is doing a select * for all tables, this will take a lot of time and resources. Instead use a system table to get the same info

select name, sum(rows) as rows
from stv_tbl_perm
where name like '%results%'
group by 1

Upvotes: 0

Related Questions