Reputation: 1357
I have a Postgresql 9.1 database with couple hundred schemas. All have same structure, just different data. I need to perform a select on a table and get data from each schema. Unfortunately I haven't found a decent way to do it.
I tried setting the search path to schema_1,schema_2, etc and then perform a select on the table but it only selects data from the first schema.
The only way I managed to do it so far is by generating a big query like:
select * from schema_1.table
union
select * from schema_2.table
union
(...another 100 lines....)
Is there any other way to do this in a more reasonable fashion? If this is not possible, can I at least find out which of the schemas has records in that table without performing this select?
Upvotes: 8
Views: 7764
Reputation: 1581
Different schemas mean different tables, so if you have to stick to this structure, it'll mean unions, one way or the other. That can be pretty expensive. If you're after partitioning through the convenience of search paths, it might make sense to reverse your schema:
Store a big table in the public schema, and then provision views in each of the individual schemas.
Check out this sqlfiddle that demonstrates my concept:
http://sqlfiddle.com/#!12/a326d/1
Also pasted inline for posterity, in case sqlfiddle is inaccessible:
Schema:
CREATE SCHEMA customer_1;
CREATE SCHEMA customer_2;
CREATE TABLE accounts(id serial, name text, value numeric, customer_id int);
CREATE INDEX ON accounts (customer_id);
CREATE VIEW customer_1.accounts AS SELECT id, name, value FROM public.accounts WHERE customer_id = 1;
CREATE VIEW customer_2.accounts AS SELECT id, name, value FROM public.accounts WHERE customer_id = 2;
INSERT INTO accounts(name, value, customer_id) VALUES('foo', 100, 1);
INSERT INTO accounts(name, value, customer_id) VALUES('bar', 100, 1);
INSERT INTO accounts(name, value, customer_id) VALUES('biz', 150, 2);
INSERT INTO accounts(name, value, customer_id) VALUES('baz', 75, 2);
Queries:
SELECT SUM(value) FROM public.accounts;
SET search_path TO 'customer_1';
SELECT * FROM accounts;
SET search_path TO 'customer_2';
SELECT * FROM accounts;
Results:
425
1 foo 100
2 bar 100
3 biz 150
4 baz 75
Upvotes: 2
Reputation: 30324
Here's one approach. You will need to pre-feed it all the schema names you are targeting. You could change this to just loop through all the schemas as Pavel shows if you know you want every schema. In my example I have three schemas that I care about each containing a table called bar. The logic will run a select on each schema's bar table and insert the value into a result table. At the end you have a table with all the data from all the tables. You could change this to update, delete, or do DDL. I chose to keep it simple and just collect the data from each table in each schema.
--START SETUP AKA Run This Section Once
create table schema3.bar(bar_id SERIAL PRIMARY KEY,
bar_name VARCHAR(50) NOT NULL);
insert into schema1.bar(bar_name) select 'One';
insert into schema2.bar(bar_name) select 'Two';
insert into schema3.bar(bar_name) select 'Three';
--END SETUP
DO $$
declare r record;
DECLARE l_id INTEGER = 1;
DECLARE l_schema_name TEXT;
begin
drop table if exists public.result;
create table public.result (bar_id INTEGER, bar_name TEXT);
drop table if exists public.schemas;
create table public.schemas (id serial PRIMARY KEY, schema_name text NOT NULL);
INSERT INTO public.schemas(schema_name)
VALUES ('schema1'),('schema2'),('schema3');
for r in select *
from public.schemas
loop
raise notice '%', r;
SELECT schema_name into l_schema_name
FROM public.schemas
WHERE id = l_id;
raise notice '%', l_schema_name;
EXECUTE 'set search_path TO ' || l_schema_name;
EXECUTE 'INSERT into public.result(bar_id, bar_name) select bar_id, bar_name from ' || l_schema_name || '.bar';
l_id = l_id + 1;
end loop;
end; $$;
--DEBUG
select * from schema1.bar;
select * from schema2.bar;
select * from schema3.bar;
select * from public.result;
select * from public.schemas;
--CLEANUP
--DROP TABLE public.result;
--DROP TABLE public.schemas;
Upvotes: 0
Reputation: 45795
If you have to know some about data in tables, you have to do SELECT. There is no any other way. Schema is just logical addressing - for your case is important, so you use lot of tables, and you have to do massive UNION.
search_path works as expected. It has no meaning - return data from mentioned schemes, but it specify a order for searching not fully qualified table. Searching ends on first table, that has requested name.
Attention: massive unions can require lot of memory.
you can use a dynamic SQL and stored procedures with temp table:
postgres=# DO $$
declare r record;
begin
drop table if exists result;
create temp table result as select * from x.a limit 0; -- first table;
for r in select table_schema, table_name
from information_schema.tables
where table_name = 'a'
loop
raise notice '%', r;
execute format('insert into result select * from %I.%I',
r.table_schema,
r.table_name);
end loop;
end; $$;
result:
NOTICE: (y,a) NOTICE: (x,a) DO postgres=# select * from result; a ---- 1 2 3 4 5 ..
Upvotes: 1