Reputation: 22661
We do have a small data warehouse in PostgreSQL database and I have to document all the tables.
I thought I can add a comment to every column and table and use pipe "|" separator to add more attributes. Then I can use information schema and array function to get documentation and use any reporting software to create desired output.
select
ordinal_position,
column_name,
data_type,
character_maximum_length,
numeric_precision,
numeric_scale,
is_nullable,
column_default,
(string_to_array(descr.description,'|'))[1] as cs_name,
(string_to_array(descr.description,'|'))[2] as cs_description,
(string_to_array(descr.description,'|'))[3] as en_name,
(string_to_array(descr.description,'|'))[4] as en_description,
(string_to_array(descr.description,'|'))[5] as other
from
information_schema.columns columns
join pg_catalog.pg_class klass on (columns.table_name = klass.relname and klass.relkind = 'r')
left join pg_catalog.pg_description descr on (descr.objoid = klass.oid and descr.objsubid = columns.ordinal_position)
where
columns.table_schema = 'data_warehouse'
order by
columns.ordinal_position;
It is a good idea or is there better approach?
Upvotes: 1
Views: 538
Reputation: 22661
If anyone interested, here is what I've used for initial load for my small documentation project. Documentation is in two tables, one for describing tables and one for describing columns and constraints. I appreciate any feedback.
/* -- Initial Load - Tables */
drop table dw_description_table cascade;
create table dw_description_table (
table_description_key serial primary key,
physical_full_name character varying,
physical_schema_name character varying,
physical_table_name character varying,
Table_Type character varying, -- Fact Dimension ETL Transformation
Logical_Name_CS character varying,
Description_CS character varying,
Logical_Name_EN character varying,
Description_EN character varying,
ToDo character varying,
Table_Load_Type character varying, --Manually TruncateLoad AddNewRows
Known_Exclusions character varying,
Table_Clover_Script character varying
);
insert into dw_description_table (physical_full_name, physical_schema_name, physical_table_name) (
select
table_schema || '.' || table_name as physical_full_name,
table_schema,
table_name
from
information_schema.tables
where
table_name like 'dw%' or table_name like 'etl%'
)
/* -- Initial Load - Columns */
CREATE TABLE dw_description_column (
column_description_key serial,
table_description_key bigint,
physical_full_name text,
physical_schema_name character varying,
physical_table_name character varying,
physical_column_name character varying,
ordinal_position character varying,
column_default character varying,
is_nullable character varying,
data_type character varying,
logical_name_cs character varying,
description_cs character varying,
logical_name_en character varying,
description_en character varying,
derived_rule character varying,
todo character varying,
pk_name character varying,
fk_name character varying,
foreign_table_name character varying,
foreign_column_name character varying,
is_primary_key boolean,
is_foreign_key boolean,
CONSTRAINT dw_description_column_pkey PRIMARY KEY (column_description_key ),
CONSTRAINT fk_dw_description_table_key FOREIGN KEY (table_description_key)
REFERENCES dw_description_table (table_description_key) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
);
insert into dw_description_column (
table_description_key ,
physical_full_name ,
physical_schema_name ,
physical_table_name ,
physical_column_name ,
ordinal_position ,
column_default ,
is_nullable ,
data_type ,
logical_name_cs ,
description_cs ,
logical_name_en ,
description_en ,
derived_rule ,
todo ,
pk_name ,
fk_name ,
foreign_table_name ,
foreign_column_name ,
is_primary_key ,
is_foreign_key )
(
with
dw_constraints as (
SELECT
tc.constraint_name,
tc.constraint_schema || '.' || tc.table_name || '.' || kcu.column_name as physical_full_name,
tc.constraint_schema,
tc.table_name,
kcu.column_name,
ccu.table_name AS foreign_table_name,
ccu.column_name AS foreign_column_name,
TC.constraint_type
FROM
information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS kcu ON (tc.constraint_name = kcu.constraint_name and tc.table_name = kcu.table_name)
JOIN information_schema.constraint_column_usage AS ccu ON ccu.constraint_name = tc.constraint_name
WHERE
constraint_type in ('PRIMARY KEY','FOREIGN KEY')
AND tc.constraint_schema = 'bizdata'
and (tc.table_name like 'dw%' or tc.table_name like 'etl%')
group by
tc.constraint_name,
tc.constraint_schema,
tc.table_name,
kcu.column_name,
ccu.table_name ,
ccu.column_name,
TC.constraint_type
)
select
dwdt.table_description_key,
col.table_schema || '.' || col.table_name || '.' || col.column_name as physical_full_name,
col.table_schema as physical_schema_name,
col.table_name as physical_table_name,
col.column_name as physical_column_name,
col.ordinal_position,
col.column_default,
col.is_nullable,
col.data_type,
null as Logical_Name_CS ,
null as Description_CS ,
null as Logical_Name_EN,
null as Description_EN ,
null as Derived_Rule ,
null as ToDo,
dwc1.constraint_name pk_name,
dwc2.constraint_name as fk_name,
dwc2.foreign_table_name,
dwc2.foreign_column_name,
case when dwc1.constraint_name is not null then true else false end as is_primary_key,
case when dwc2.constraint_name is not null then true else false end as foreign_key
from
information_schema.columns col
join dw_description_table dwdt on (col.table_schema || '.' || col.table_name = dwdt.physical_full_name )
left join dw_constraints dwc1 on ((col.table_schema || '.' || col.table_name || '.' || col.column_name) = dwc1.physical_full_name and dwc1.constraint_type = 'PRIMARY KEY')
left join dw_constraints dwc2 on ((col.table_schema || '.' || col.table_name || '.' || col.column_name) = dwc2.physical_full_name and dwc2.constraint_type = 'FOREIGN KEY')
where
col.table_name like 'dw%' or col.table_name like 'etl%'
)
Upvotes: 1
Reputation: 108199
Unless you must include descriptions of the system tables, I wouldn't try to shoehorn your descriptions into pg_catalog.pg_description. Make your own table. That way you get to keep the columns as columns, and not have to use clunky string functions.
Alternatively, consider adding specially formatted comments to your master schema file, along the lines of javadoc. Then write a tool to extract those comments and create a document. That way the comments stay close to the thing they're commenting, and you don't have to mess with the database at all to produce the report. For example:
--* Used for authentication.
create table users
(
--* standard Rails-friendly primary key. Also an example of
--* a long comment placed before the item, rather than on the
--* the same line.
id serial primary key,
name text not null, --* Real name (hopefully)
login text not null, --* Name used for authentication
...
);
Your documentation tool reads the file, looks for the --*
comments, figures out what comments go with what things, and produces some kind of report, e.g.:
table users: Used for authentication
id: standard Rails-friendly primary key. Also an example of a
long comment placed before the item, rather than on the same
line.
name: Real name
login: Name used for authentication
You might note that with appropriate comments, the master schema file itself is a pretty good report in its own right, and that perhaps nothing else is needed.
Upvotes: 3