How to properly index a linking table for many-to-many connection in MySQL?

serg picture serg · Feb 20, 2009 · Viewed 14.5k times · Source

Lets say I have a simple many-to-many table between tables "table1" and "table2" that consists from two int fields: "table1-id" and "table2-id". How should I index this linking table?

I used to just make a composite primary index (table1-id,table2-id), but I read that this index might not work if you change order of the fields in the query. So what's the optimal solution then - make independent indexes for each field without a primary index?

Thanks.

Answer

Quassnoi picture Quassnoi · Feb 20, 2009

It depends on how you search.

If you search like this:

/* Given a value from table1, find all related values from table2 */
SELECT *
FROM table1 t1
JOIN table_table tt ON (tt.table_1 = t1.id)
JOIN table2 t2 ON (t2.id = tt.table_2)
WHERE t1.id = @id

then you need:

ALTER TABLE table_table ADD CONSTRAINT pk_table1_table2 (table_1, table_2)

In this case, table1 will be leading in NESTED LOOPS and your index will be usable only when table1 is indexed first.

If you search like this:

/* Given a value from table2, find all related values from table1 */
SELECT *
FROM table2 t2
JOIN table_table tt ON (tt.table_2 = t2.id)
JOIN table1 t1 ON (t1.id = tt.table_1)
WHERE t2.id = @id

then you need:

ALTER TABLE table_table ADD CONSTRAINT pk_table1_table2 (table_2, table_1)

for the reasons above.

You don't need independent indices here. A composite index can be used everywhere where a plain index on the first column can be used. If you use independent indices, you won't be able to search efficiently for both values:

/* Check if relationship exists between two given values */
SELECT 1
FROM table_table
WHERE table_1 = @id1
  AND table_2 = @id2

For a query like this, you'll need at least one index on both columns.

It's never bad to have an additional index for the second field:

ALTER TABLE table_table ADD CONSTRAINT pk_table1_table2 PRIMARY KEY (table_1, table_2)
CREATE INDEX ix_table2 ON table_table (table_2)

Primary key will be used for searches on both values and for searches based on value of table_1, additional index will be used for searches based on value of table_2.