Brick
Brick

Reputation: 4272

MySQl replace with auto increment primary and MUL secondary keys

I have an existing table that I was populating overnight via a script that reads and converts a binary format file into a sequence of replace statements that are piped into mysql. The script had a bug and looped over each binary file twice, so every replace statement was issued two times.

Now the table has auto incremented primary key, and the replace statements do not specify it. I was expecting to get two entries (with different integer primary key value). I actually don't get the repetitions.

More detail:

From describe a.b:

 Field | Type    | Null | Key | Default | Extra
 ------|---------|------|-----|---------|---------------
 id    | int(11) | NO   | PRI | NULL    | auto_increment
 name  | char(8) | YES  | MUL | NULL    |
 date  | date    | YES  |     | NULL    |
 size  | int(11) | YES  |     | NULL    |

A sample replace statement is

replace into a.b (name,date,size) values ('Joe',20180105,32100);

After the script runs I look at the entries in the table, and there is only one entry for Joe on the given date with the given size. Result from select * from a.b where name='Joe' and date=20180105 is:

id     | name | date       | size
-------|------|------------|------
20423  | Joe  | 2018-01-05 | 32100

(Whereas I expected to see another line with a different id and the same values in the remaining columns.)

I seem to have the answer that I want here in that, had the script run properly, each replace statement would have only appeared once anyway, but I don't understand why (and therefore don't completely trust it).

EDIT

In response to comments, here is the result from show index from a.b (I rounded the cardinality numbers when I brought it over):

Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment
------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+--------
b     | 0          | PRIMARY  | 1            | id          | A         | 1000000     | Null     | Null   | YES  | BTREE      |
b     | 0          | n_d      | 1            | name        | A         | 100         | Null     | Null   | YES  | BTREE      |
b     | 0          | n_d      | 2            | date        | A         | 10000       | Null     | Null   | YES  | BTREE      |

So, I think maybe I misunderstood the meaning of MUL in the result from describe? From the documentation:

If Key is MUL, the column is the first column of a nonunique index in which multiple occurrences of a given value are permitted within the column.

It was the "nonunique" there that made me think this wasn't causing lines to get killed on the second replace. Was it instead just marking that name by itself is not uniquely indexed (even though, in total, the n_d index is unique)?

Upvotes: 0

Views: 1041

Answers (2)

P.Salmon
P.Salmon

Reputation: 17655

I am not able to reproduce your observation (in mariadb anyway).

MariaDB [sandbox]> drop table if exists t;
Query OK, 0 rows affected (0.07 sec)

MariaDB [sandbox]> create table t
    -> (id    int auto_increment primary key,
    -> name   char(8) null default null,
    -> date   date  null  default null,
    -> size   int  null default null
    -> );
Query OK, 0 rows affected (0.12 sec)

MariaDB [sandbox]> alter table t
    -> add key tk1(name);
Query OK, 0 rows affected (0.04 sec)
Records: 0  Duplicates: 0  Warnings: 0

MariaDB [sandbox]>
MariaDB [sandbox]> describe t;
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| id    | int(11) | NO   | PRI | NULL    | auto_increment |
| name  | char(8) | YES  | MUL | NULL    |                |
| date  | date    | YES  |     | NULL    |                |
| size  | int(11) | YES  |     | NULL    |                |
+-------+---------+------+-----+---------+----------------+
4 rows in set (0.02 sec)

MariaDB [sandbox]>
MariaDB [sandbox]> replace t (name,date,size) values ('aaa','2018-01-30',1);
Query OK, 1 row affected (0.00 sec)

MariaDB [sandbox]> replace t (name,date,size) values ('aaa','2018-01-31',2);
Query OK, 1 row affected (0.00 sec)

MariaDB [sandbox]>
MariaDB [sandbox]> select * from t;
+----+------+------------+------+
| id | name | date       | size |
+----+------+------------+------+
|  1 | aaa  | 2018-01-30 |    1 |
|  2 | aaa  | 2018-01-31 |    2 |
+----+------+------------+------+
2 rows in set (0.00 sec)

Upvotes: 1

Sloan Thrasher
Sloan Thrasher

Reputation: 5040

With replace, if a duplicate row is found (same primary key and/or unique key), the existing row is deleted and a new row is inserted. So you shouldn't see another row with the same values. If you have any unique keys on the other columns (other than the primary key), then you would see the results you found. Since you didn't show the keys for the table, this is just a guess.

From the SQL Manual:

REPLACE works exactly like INSERT , except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.

Upvotes: 0

Related Questions