MySQL NOT IN with subquery not working as expected

BVBAccelerate picture BVBAccelerate · Aug 2, 2012 · Viewed 8.3k times · Source

I'm creating a application that will generate lists for email marketing campaigns. I have tables for contacts, emails, and campaigns. A campaign has many emails and a contact has many emails. The email is related to a contact and a campaign. Basically a table for a MANY to MANY relationship except I have other fields in the table for the result of the email (clicked, opened, unsubscribed, etc). There are also other tables but this is where I'm having the trouble.

I'm trying to use NOT IN with a subquery to get a list of contacts who have not received an email since a certain date with other conditions. An example query is this:

SELECT * 
FROM `contact` `t` 
WHERE (unsubscribed='1')
  AND t.id NOT IN 
   (SELECT distinct contact_id 
    FROM email, campaign 
    WHERE email.campaign_id = campaign.id 
      AND campaign.date_sent >= '2012-07-12') 
ORDER BY rand() 
LIMIT 10000

This returns 0 result. However, if I run the first condition:

select id 
from contact 
where unsubscribed=1

I have 9075 rows. Then, if I separately run the subquery:

SELECT distinct contact_id 
FROM email, campaign 
WHERE email.campaign_id = campaign.id 
  AND campaign.date_sent >= '2012-07-12'

I have 116612 rows. Out of each of those results, I end up with 826 values that are duplicates. From what I can understand, this means that 9075-826=8249 records ARE unsubscribed=1 AND NOT IN the second query. So, my first query should be returning 8249 results but it is returning 0. I must be structuring the query wrong or using the wrong operators but I can not for the life of me figure out how to get this right.

Can anyone help? So many thanks in advance as this has had me stumped for like 3 days! :)

Answer

Sebas picture Sebas · Aug 2, 2012

This is because

SELECT 1 FROM DUAL WHERE 1 NOT IN (NULL, 2) 

won't return anything, whereas

SELECT 1 FROM DUAL WHERE 1 NOT IN (2)

will.

Please review the behaviour of NOT IN and NULL in MYSQL.

For your concern you should get away with it using NOT EXISTS instead of NOT IN:

SELECT * FROM `contact` `t` 
WHERE (unsubscribed='1')
AND NOT EXISTS (
    SELECT * FROM email, campaign 
    WHERE 
        email.campaign_id = campaign.id 
    AND campaign.date_sent >= '2012-07-12'
    AND t.id = contact_id
) 
ORDER BY rand() 
LIMIT 10000