Join: three tables and a or condition

Narretz picture Narretz · Aug 23, 2011 · Viewed 8.3k times · Source

I think I should know this somehow, especially after reading a lot of questions and answers regarding "The condition must go into the ON clause, not in the WHERE clause". However, I am still lost.

I have three tables, and I join them normally with LEFT (OUTER) joins. The joined tables looks like this (retty standard):


task_id task_questions_taskId   taskQuestions_questionId    question_id
1         1                     5                           5
1         1                     8                           8
2         2                     8                           8

SELECT `t`.`id` AS `task_id` , 
       `task_questions`.`taskId` AS `task_questions_taskId` ,
       `task_questions`.`questionId` AS `task_questions_questionId` , 
       questions.id AS question_id
FROM `task` `t`
LEFT OUTER JOIN `task_questions` `task_questions` 
    ON ( `task_questions`.`taskId` = `t`.`id` )
LEFT OUTER JOIN `question` `questions` 
    ON ( `task_questions`.`questionId` = `questions`.`id` )

This is the standard query to get all the records. (It's taken from Yii; I actually want to to this with Active Record, but can't even get plain SQL right).

And now I want to get ONLY those tasks that have the question_id 2 AND 8 (e.g) So if a task has not both of those question.ids, I don't want it in the result set. In this case, the task could have other question_ids, too. Although it would be interesting to see how the query would look if it should return only those that have exactly those 2 (or any other set). It's easy to get all the tasks that have one question, with WHERE question.id = 2, but an AND in the WHERE clause leads to an empty result.

Answer

Bill Karwin picture Bill Karwin · Aug 24, 2011

The WHERE clause can only apply conditions to one row at a time. But your questions of different id occur on different rows. How to solve this? Join both rows onto one row using a self-join.

Here's an example:

SELECT t.`id` AS `task_id`, ...
FROM `task` AS t
INNER JOIN `task_questions` AS tq2 ON ( tq2.`taskId` = t.`id` )
INNER JOIN `questions` AS q2 ON ( tq2.`questionId` = q2.`id` )
INNER JOIN `task_questions` AS tq8 ON ( tq8.`taskId` = t.`id` )
INNER JOIN `questions` AS q8 ON ( tq8.`questionId` = q8.`id` )
WHERE q2.`id` = 2 AND q8.`id` = 8

Another solution is to find the tasks that have questions 2 OR 8, and then use GROUP BY and HAVING to filter by groups that have exactly two of those.

SELECT t.`id` AS `task_id`, ...
FROM `task` AS t
INNER JOIN `task_questions` AS tq ON ( tq.`taskId` = t.`id` )
INNER JOIN `questions` AS q ON ( tq.`questionId` = q.`id` )
WHERE tq.`questionId` IN (2, 8)
GROUP BY t.`id`
HAVING COUNT(DISTINCT q.`id`) = 2