I have two innodb tables:
articles
id | title | sum_votes
------------------------------
1 | art 1 | 5
2 | art 2 | 8
3 | art 3 | 35
votes
id | article_id | vote
------------------------------
1 | 1 | 1
2 | 1 | 2
3 | 1 | 2
4 | 2 | 10
5 | 2 | -2
6 | 3 | 10
7 | 3 | 15
8 | 3 | 12
9 | 3 | -2
When a new record is inserted into the votes
table, I want to update the sum_votes
field in articles
table by calculating the sum of all votes.
Which way is more efficient, if the SUM() calculation itself is a very heavy one (votes
table has 700K records).
1. Creating a trigger
CREATE TRIGGER `views_on_insert`
AFTER INSERT
ON `votes`
FOR EACH ROW
BEGIN
UPDATE `articles` SET
sum_votes = (
SELECT SUM(`vote`)
FROM `votes`
WHERE `id` = NEW.article_id
)
WHERE `id` = NEW.article_id;
END;
2. Using two queries in my application
SELECT SUM(`vote`) FROM `votes` WHERE `article_id` = 1;
UPDATE `articles`
SET sum_votes = <1st_query_result>
WHERE `id` = 1;
1st way seems cleaner, but will the table be locked the whole time the SELECT query runs?
About the concurrency problems, you have an 'easy' way to prevent any concurrency problems in the 2nd method, inside your transaction perform a select on the articles line (the For update
is now implicit). Any concurrent insert on the same article will not be able to obtain this same lock and will wait for you.
With the new default isolation levels, without even using serialization level in the transaction you wouldn't see any concurrent insert on the vote table until the end of your transaction. So your SUM should stay coherent or looks like coherent. But if a concurrent transaction insert a vote on same article and commit before you (and this 2nd one does not see your insert), the last transaction to commit will overwrite the counter and you'll loose 1 vote. So perform a row lock on article by using a select before (and do your work in a transaction, of course). It's easy to test, open 2 interactive sessions on MySQL and start transactions with BEGIN.
If you use the trigger you are in a transaction by default. But I think you should perform as well the select on the article table to make an implicit row lock for concurrent triggers running (harder to test).
Last point: make harder transactions, before starting the transaction use:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
This way you do not need row locks on articles, MySQL will detect that a potential write on the same row occurs and will block the others transaction until you finish. But do not use something you have computed from a previous request. The update query will be waiting for a lock release on articles, when the lock is released by the 1st transaction COMMIT
the computing of SUM
should be done again to count. So the update query should contain the SUM
or make an addition.
update articles set nb_votes=(SELECT count(*) from vote) where id=2;
And here you'll see that MySQL is smart, a deadlock is detected if 2 transactions are trying to do this while insert has been done in a concurrent time. In serialization levels I haven't found a way to obtain a wrong value with :
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
insert into vote (...
update articles set nb_votes=(
SELECT count(*) from vote where article_id=xx
) where id=XX;
COMMIT;
But be ready to handle breaking transaction that you must redo.