kishan keswani
kishan keswani

Reputation: 65

Clustered Indexing in MySQL

I am learning indexing in Databases . According to GeeksforGeeks (https://www.geeksforgeeks.org/indexing-in-databases-set-1/) , Index file is created for clustered index . There is a diagram showing index file having semesters 1-8 .

But while reading https://use-the-index-luke.com/blog/2014-01/unreasonable-defaults-primary-key-clustering-key , it is mentioned

If a table has a clustered index, it basically means the index is the table

I want to know whether an index file is generated in clustered index and non clustered index or not ?

Also , can we see the index file on a table ,i.e, what it stores ?

Upvotes: 3

Views: 2438

Answers (3)

O. Jones
O. Jones

Reputation: 108841

This business of clustered indexes is, from the point of view of a MySQL user, an implementation detail.

The information in a table has to be stored somewhere on a storage device like an SSD or a hard disk drive. A lot depends on the version of the database server and the particular access method (InnoDB, MyISAM, ...) in use.

Some of the information is metadata: the description of the table. MySql often stashes that in a table.frm file. The information in the rows is often stored in a container file tablename.ibd. These files should only be interpreted by the MySQL server or other software built for the purpose; a typical standalone program can't make sense of them especially if the MySQL server is active. In other words: if you write a program to change any of those files you will corrupt your database and lose the contents of your tables.

Inside the container files are some data structures accessed by MySQL's access-method code. Often tables having primary keys are stored as if they were indexes with all the data for each row hanging off each the index entry. That data structure is a clustered index.

You can't necessarily look at the file system of the machine running the MySQL server and point to a particular file containing an index, or a clustered index, or whatever. With InnoDB you can instruct MySQL to put data in particular files in the file system using TABLESPACE commands. But as of MySQL 8, there's no way to put an index in its own tablespace.

(Database administrators for products like Oracle and MS SQL Server Enterprise Edition work with moving tablespaces around to different disk drives to optimize and parallelize access. That's not as big a deal in MySQL until your tables start taking significant fractions of your disk drive space.)

Upvotes: 1

The Impaler
The Impaler

Reputation: 48865

I want to know whether an index file is generated in clustered index and non clustered index or not ?

An InnoDB table (the default engine type) in MySQL is always clustered. That means the index stores all the table columns; there's no need for a separate "heap" table. If you used the old MyIsam engine, then the table would have had a heap, plus any extra indexes on top of it.

Also , can we see the index file on a table ,i.e, what it stores ?

The "primary" index stores all the columns of the table: the key columns, and the non-key columns. "Secondary" indexes can be more selective and only store a subset of the columns; but the primary one has everything.

Upvotes: 0

Rick James
Rick James

Reputation: 142528

MySQL has essentially one indexing method: BTree. (Yeah, there are also Spatial and Fulltext, but that is another discussion.)

Once you understand how a BTree works (see Wikipedia), we can discuss what goes into the leaf nodes in InnoDB.

Case 1: The "data" BTree contains all the columns and is sorted according to the PRIMARY KEY. In MySQL, the PK is by definition, "UNIQUE" and "Clustered". (Other vendors have other options.)

Case 2: A 'secondary' INDEX is stored in a separate BTree. In the leaf nodes are (1) the column(s) defined in the secondary index, plus copies of the PK column(s). To complete a SELECT using a secondary index, it must first fetch the PK using the index BTree, then fetch the data via the data BTree. (If the index is "covering", this second step is not needed.)

There is no "Rownum" in MySQL.

The BTrees are actually B+Trees, thereby making range scans more efficient.

InnoDB puts all the BTrees for a given table (one for data+PK, one for each secondary index) into some tablespace. The tablespace is either the generic one (ibdata1 file), a table-specific one (the file tablename.ibd), or (in newer versions) a "tablespace" file that can contain several tables.

Caveat: What I have described applies to MySQL's InnoDB, and is probably not correct for any other engine.

I don't know of a good tool for inspecting InnoDB's BTrees without getting into gory details. For a Percona version:

SELECT  i.INDEX_NAME as Index_Name,
                IF(ROWS_READ IS NULL, 'Unused',
                    IF(ROWS_READ > 2e9, 'Overflow', ROWS_READ)) as Rows_Read
            FROM (
                SELECT DISTINCT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
                    FROM information_schema.STATISTICS
                 ) i
            LEFT JOIN information_schema.INDEX_STATISTICS s
                     ON i.TABLE_SCHEMA = s.TABLE_SCHEMA
                    AND i.TABLE_NAME = s.TABLE_NAME
                    AND i.INDEX_NAME = s.INDEX_NAME
            WHERE i.TABLE_SCHEMA = ?
              AND i.TABLE_NAME = ?
            ORDER BY IF(i.INDEX_NAME = 'PRIMARY', 0, 1)

For MySQL (Oracle):

SELECT  last_update,
                n_rows,
                'Data & PK' AS 'Type',
                clustered_index_size * 16384 AS Bytes,
                ROUND(clustered_index_size * 16384 / n_rows) AS 'Bytes/row',
                clustered_index_size AS Pages,
                ROUND(n_rows / clustered_index_size) AS 'Rows/page'
        FROM mysql.innodb_table_stats
        WHERE ( ( database_name = ? AND table_name = ? )
          OR    ( database_name = LOWER(?) AND table_name = LOWER(?) 
    UNION
        SELECT  last_update,
                n_rows,
                'Secondary Indexes' AS 'BTrees',
                sum_of_other_index_sizes * 16384 AS Bytes,
                ROUND(sum_of_other_index_sizes * 16384 / n_rows) AS 'Bytes/row',
                sum_of_other_index_sizes AS Pages,
                ROUND(n_rows / sum_of_other_index_sizes) AS 'Rows/page'
        FROM mysql.innodb_table_stats
        WHERE ( ( database_name = ? AND table_name = ? )
          OR    ( database_name = LOWER(?) AND table_name = LOWER(?) 
          AND sum_of_other_index_sizes > 0

Upvotes: 4

Related Questions