Reputation: 4272
I have an existing table that I was populating overnight via a script that reads and converts a binary format file into a sequence of replace
statements that are piped into mysql
. The script had a bug and looped over each binary file twice, so every replace
statement was issued two times.
Now the table has auto incremented primary key, and the replace statements do not specify it. I was expecting to get two entries (with different integer primary key value). I actually don't get the repetitions.
More detail:
From describe a.b
:
Field | Type | Null | Key | Default | Extra
------|---------|------|-----|---------|---------------
id | int(11) | NO | PRI | NULL | auto_increment
name | char(8) | YES | MUL | NULL |
date | date | YES | | NULL |
size | int(11) | YES | | NULL |
A sample replace
statement is
replace into a.b (name,date,size) values ('Joe',20180105,32100);
After the script runs I look at the entries in the table, and there is only one entry for Joe
on the given date with the given size. Result from select * from a.b where name='Joe' and date=20180105
is:
id | name | date | size
-------|------|------------|------
20423 | Joe | 2018-01-05 | 32100
(Whereas I expected to see another line with a different id
and the same values in the remaining columns.)
I seem to have the answer that I want here in that, had the script run properly, each replace
statement would have only appeared once anyway, but I don't understand why (and therefore don't completely trust it).
EDIT
In response to comments, here is the result from show index from a.b
(I rounded the cardinality numbers when I brought it over):
Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment
------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+--------
b | 0 | PRIMARY | 1 | id | A | 1000000 | Null | Null | YES | BTREE |
b | 0 | n_d | 1 | name | A | 100 | Null | Null | YES | BTREE |
b | 0 | n_d | 2 | date | A | 10000 | Null | Null | YES | BTREE |
So, I think maybe I misunderstood the meaning of MUL
in the result from describe? From the documentation:
If Key is MUL, the column is the first column of a nonunique index in which multiple occurrences of a given value are permitted within the column.
It was the "nonunique" there that made me think this wasn't causing lines to get killed on the second replace. Was it instead just marking that name
by itself is not uniquely indexed (even though, in total, the n_d
index is unique)?
Upvotes: 0
Views: 1041
Reputation: 17655
I am not able to reproduce your observation (in mariadb anyway).
MariaDB [sandbox]> drop table if exists t;
Query OK, 0 rows affected (0.07 sec)
MariaDB [sandbox]> create table t
-> (id int auto_increment primary key,
-> name char(8) null default null,
-> date date null default null,
-> size int null default null
-> );
Query OK, 0 rows affected (0.12 sec)
MariaDB [sandbox]> alter table t
-> add key tk1(name);
Query OK, 0 rows affected (0.04 sec)
Records: 0 Duplicates: 0 Warnings: 0
MariaDB [sandbox]>
MariaDB [sandbox]> describe t;
+-------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | char(8) | YES | MUL | NULL | |
| date | date | YES | | NULL | |
| size | int(11) | YES | | NULL | |
+-------+---------+------+-----+---------+----------------+
4 rows in set (0.02 sec)
MariaDB [sandbox]>
MariaDB [sandbox]> replace t (name,date,size) values ('aaa','2018-01-30',1);
Query OK, 1 row affected (0.00 sec)
MariaDB [sandbox]> replace t (name,date,size) values ('aaa','2018-01-31',2);
Query OK, 1 row affected (0.00 sec)
MariaDB [sandbox]>
MariaDB [sandbox]> select * from t;
+----+------+------------+------+
| id | name | date | size |
+----+------+------------+------+
| 1 | aaa | 2018-01-30 | 1 |
| 2 | aaa | 2018-01-31 | 2 |
+----+------+------------+------+
2 rows in set (0.00 sec)
Upvotes: 1
Reputation: 5040
With replace, if a duplicate row is found (same primary key and/or unique key), the existing row is deleted and a new row is inserted. So you shouldn't see another row with the same values. If you have any unique keys on the other columns (other than the primary key), then you would see the results you found. Since you didn't show the keys for the table, this is just a guess.
From the SQL Manual:
REPLACE works exactly like INSERT , except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.
Upvotes: 0