I have a database table with ~50K rows in it; each row represents a job that needs to be done. I have a program that extracts a job from the DB, does the job, and puts the result back in the DB. (This system is running right now.)
Now I want to allow more than one processing task to do jobs, but be sure that no job is done twice (this is a performance concern; doing a job twice would not cause other problems). Because the access is by way of a stored procedure, my current thought is to replace said stored procedure with something that looks like this:
update tbl
set owner = connection_id()
where available and owner is null limit 1;
select stuff
from tbl
where owner = connection_id();
BTW, worker tasks might drop their connection between getting a job and submitting the results. Also, I don't expect the DB to even come close to being the bottleneck unless I mess that part up (~5 jobs per minute).
Are there any issues with this? Is there a better way to do this?
Note: the "Database as an IPC anti-pattern" is only slightly apropos here because
The best way to implement a job queue in a relational database system is to use SKIP LOCKED.
SKIP LOCKED is a lock acquisition option that applies to both read/share (FOR SHARE) and write/exclusive (FOR UPDATE) locks and is widely supported nowadays.
Now, consider a post table with id, title, body, and status columns.
The status column is used as an Enum, having the values PENDING (0), APPROVED (1), and SPAM (2).
If we have multiple concurrent users trying to moderate the post records, we need a way to coordinate their efforts to avoid having two moderators review the same post row.
So, SKIP LOCKED is exactly what we need. Suppose two concurrent users, Alice and Bob, execute the following SELECT queries, which lock the post records exclusively while also adding the SKIP LOCKED option:
[Alice]:
SELECT
p.id AS id1_0_,
p.body AS body2_0_,
p.status AS status3_0_,
p.title AS title4_0_
FROM
post p
WHERE
p.status = 0
ORDER BY
p.id
LIMIT 2
FOR UPDATE OF p SKIP LOCKED
[Bob]:
SELECT
p.id AS id1_0_,
p.body AS body2_0_,
p.status AS status3_0_,
p.title AS title4_0_
FROM
post p
WHERE
p.status = 0
ORDER BY
p.id
LIMIT 2
FOR UPDATE OF p SKIP LOCKED
We can see that Alice can select the first two entries while Bob selects the next two records. Without SKIP LOCKED, Bob's lock acquisition request would block until Alice releases the lock on the first two records.
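Applied to the job table from the question, the same idea might look like the sketch below. It assumes MySQL 8.0+ or PostgreSQL 9.5+ (both accept this syntax), a transactional storage engine, that available is a boolean column, and two hypothetical columns the question does not show: an id primary key and a result column. The literal values are placeholders.

```sql
-- Sketch only: id and result are assumed column names, not from the question.
START TRANSACTION;

-- Claim one available job. Rows already locked by other workers are skipped
-- rather than waited on, so two workers never receive the same row.
SELECT id, stuff
FROM tbl
WHERE available
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- ... the worker does the job here, keeping the transaction open ...

-- Store the result and mark the job as done.
-- 42 stands in for the id returned by the SELECT above.
UPDATE tbl
SET result = 'placeholder result',
    available = FALSE
WHERE id = 42;

COMMIT;
```

Because the row lock lives only as long as the transaction, a worker that drops its connection mid-job has its transaction rolled back and the job becomes claimable again, so no owner column or cleanup pass is needed. The trade-off is that the transaction stays open for the duration of the job, which seems reasonable at ~5 jobs per minute but is worth reconsidering for long-running jobs.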