I think I should know this somehow, especially after reading a lot of questions and answers regarding "The condition must go into the ON clause, not in the WHERE clause". However, I am still lost.
I have three tables, and I join them normally with LEFT (OUTER) joins. The joined tables looks like this (retty standard):
task_id task_questions_taskId taskQuestions_questionId question_id 1 1 5 5 1 1 8 8 2 2 8 8
SELECT `t`.`id` AS `task_id` ,
`task_questions`.`taskId` AS `task_questions_taskId` ,
`task_questions`.`questionId` AS `task_questions_questionId` ,
questions.id AS question_id
FROM `task` `t`
LEFT OUTER JOIN `task_questions` `task_questions`
ON ( `task_questions`.`taskId` = `t`.`id` )
LEFT OUTER JOIN `question` `questions`
ON ( `task_questions`.`questionId` = `questions`.`id` )
This is the standard query to get all the records. (It's taken from Yii; I actually want to to this with Active Record, but can't even get plain SQL right).
And now I want to get ONLY those tasks that have the question_id 2 AND 8 (e.g) So if a task has not both of those question.ids, I don't want it in the result set. In this case, the task could have other question_ids, too. Although it would be interesting to see how the query would look if it should return only those that have exactly those 2 (or any other set). It's easy to get all the tasks that have one question, with WHERE question.id = 2, but an AND in the WHERE clause leads to an empty result.
The WHERE clause can only apply conditions to one row at a time. But your questions of different id occur on different rows. How to solve this? Join both rows onto one row using a self-join.
Here's an example:
SELECT t.`id` AS `task_id`, ...
FROM `task` AS t
INNER JOIN `task_questions` AS tq2 ON ( tq2.`taskId` = t.`id` )
INNER JOIN `questions` AS q2 ON ( tq2.`questionId` = q2.`id` )
INNER JOIN `task_questions` AS tq8 ON ( tq8.`taskId` = t.`id` )
INNER JOIN `questions` AS q8 ON ( tq8.`questionId` = q8.`id` )
WHERE q2.`id` = 2 AND q8.`id` = 8
Another solution is to find the tasks that have questions 2 OR 8, and then use GROUP BY and HAVING to filter by groups that have exactly two of those.
SELECT t.`id` AS `task_id`, ...
FROM `task` AS t
INNER JOIN `task_questions` AS tq ON ( tq.`taskId` = t.`id` )
INNER JOIN `questions` AS q ON ( tq.`questionId` = q.`id` )
WHERE tq.`questionId` IN (2, 8)
GROUP BY t.`id`
HAVING COUNT(DISTINCT q.`id`) = 2