Reputation: 967
I have a tab separated CSV file in D:\DataSet\business_names_202007/businessDataSet.csv
with 600k records (which may increase). I want to load entire data into below postgresql table.
PostgreSQL table:
CSV file structure:
As you can see above structure, column count differs between CSV file and db ("transform_business" column extra in DB). while loading data we need to add this as well, the value for this column is given below.
"transform_business" column value in db is same as "BN_NAME" in the DataSet with following changes: Convert to UPPERCASE and all spaces should be removed between words
eg:
BN_NAME: Melbourne Collision Repair
transform_business: MELBOURNECOLLISIONREPAIR
Tool: Dbeaver
DB Schema: testDev
tablename: testdevtable
Temporarily formed copt statement:
COPY testdevtable(register_name,bn_name,bn_status,transform_business)
FROM 'D:\DataSet\business_names_202007/businessDataSet.csv' DELIMITER E’\t’ CSV HEADER;
Upvotes: 0
Views: 2904
Reputation: 14861
Perhaps you have omitted a valuable piece of information: what Postgres version are you running? If you have or can update to version 12 you can redefine your table so that the column transform_business is a generated column. Then for example:
create table table_name(
id bigint generated always as identity
, register_name text
, bn_name text
, status text
, transform_name text generated always as ( upper(replace(bn_name,' ',''))) stored
) ;
If you need to maintain the current data then you could:
alter table table_name drop column transform_name;
alter table table_name add transform_name text generated always as ( upper(replace(bn_name,' ',''))) stored;
This option would be rather slow, especially if table is large, but it is a one-time process. Either option would then give you the "constraint" that transform_name could not be updated directly, but would bu automatically updated when bn_name is updated.
Then your copy command would just load register_name, bn_name, and status.
Upvotes: 0
Reputation: 19570
If it where me I would add a ON INSERT
trigger to business table that transforms bn_name
--> transform_business
during the INSERT
. Then leave transform_business
out of the COPY
. The second option is to do the transformation before you load it and add transform_business
and the new data to the CSV file.
An example SQL code snippet that will do the transformation:
SELECT upper(replace('Melbourne Collision Repair Centre Mentone', ' ', ''));
upper
---------------------------------------
MELBOURNECOLLISIONREPAIRCENTREMENTONE
Another option is to just COPY
(register_name,bn_name,bn_status) into the database(with no trigger on table) and then run:
UPDATE business SET transform_business = upper(replace(bn_name, ' ', ''));
Not sure what happens after that, whether the transform_business value is going to be entered with new data or not. If the user/application is not going to enter it then I think you are back to a trigger on business table that runs upper(replace(bn_name, ' ', ''))
.
Upvotes: 0