Reputation: 668
My query duplicates rows when I import CSV file in to table:
+----------+----------------+---------------------+
| fullname | formattedvalue | recordTime |
+----------+----------------+---------------------+
| text1 | 170.01346 | 09/02/2020 21:45:00 |
+----------+----------------+---------------------+
| text2 | 24.153432536 | 09/02/2020 21:45:00 |
+----------+----------------+---------------------+
| text3 | 3.583432424 | 09/02/2020 21:45:00 |
+----------+----------------+---------------------+
| text1 | 170.01346 | 08/02/2020 21:45:00 |
+----------+----------------+---------------------+
| text2 | 24.153432536 | 08/02/2020 21:45:00 |
+----------+----------------+---------------------+
| text3 | 3.583432424 | 08/02/2020 21:45:00 |
+----------+----------------+---------------------+
And the Query:
CREATE TEMP TABLE tmp_x
(
"fullname" varchar,
"formattedvalue" double precision,
"recordtime" timestamp
);
COPY tmp_x FROM PROGRAM 'more +1 "D:\MEAS_20200308x.csv"' (FORMAT csv, DELIMITER ',');
--UPDATE tmp_x
--SET formattedvalue = ROUND( CAST(formattedvalue as numeric), 3 );
insert into meas_kanal select * from (
select x.*
from tmp_x x
left outer join meas_kanal t on t.fullname = x. fullname AND t. recordtime = x. recordtime
where t. fullname is null AND t. recordtime is null
) as missing;
DROP TABLE tmp_x;
My logic is to check for duplicates on cloumn combination: fullname
+ recordtime
When I start the query again, it is inserting the same rows again.
Any idea where I am wrong?
EDIT 2:
I tried and this with the same problem:
INSERT INTO meas_kanal
SELECT x.*
FROM tmp_x x
LEFT OUTER JOIN meas_kanal t ON (t. fullname = x. fullname AND t. recordtime = x. recordtime)
WHERE t.fullname IS NULL AND t. recordtime IS NULL;
EDIT 3: One more fail.
INSERT INTO meas_kanal
SELECT *
FROM tmp_x
WHERE NOT EXISTS(SELECT *
FROM meas_kanal
WHERE (tmp_x.fullname=meas_kanal.fullname and
tmp_x.recordtime=meas_kanal.recordtime)
);
I think that the problem is somewhere else.
Edit 4: Possible solution
BTW I forgot to mention. I don't have a primary key. Now I make two:
CREATE TEMP TABLE tmp_x
(
"fullname" varchar,
"formattedvalue" double precision,
"recordtime" timestamp,
UNIQUE (fullname, recordtime)
);
And this insert:
insert into meas_kanal(fullname, formattedvalue,recordtime)
SELECT fullname, formattedvalue,recordtime FROM tmp_x x
ON CONFLICT DO NOTHING;
For now it's working like I expect. I will write this solution as an answer if no one gives a better solution.
Upvotes: 0
Views: 1214
Reputation: 668
I make two primary keys:
CREATE TEMP TABLE tmp_x
(
"fullname" varchar,
"formattedvalue" double precision,
"recordtime" timestamp,
UNIQUE (fullname, recordtime)
);
And this insert:
insert into meas_kanal(fullname, formattedvalue,recordtime)
SELECT fullname, formattedvalue,recordtime FROM tmp_x x
ON CONFLICT DO NOTHING;
For now it's working like I expect.
Upvotes: 2
Reputation: 4268
You could use GROUP BY
or DISTINCT
Assuming your current query results with the table you displayed
INSERT INTO meas_kanal
SELECT DISTINCT x.fullname, x.formattedvalue, x.recordtime
FROM tmp_x x
Update 1
Based on your recent comments, if you want the latest recordtime to be inserted, then use GROUP BY
INSERT INTO meas_kanal
SELECT x.fullname, x.formattedvalue, MAX(x.recordtime) recordtime
FROM tmp_x x
GROUP BY x.fullname, x.formattedvalue
Upvotes: 0