Self join with inner and outer join query

Ban Atman picture Ban Atman · Mar 7, 2013 · Viewed 8.2k times · Source

I have a table that is set up so that one column (attribute) contains information like first name, last name, account number and any other information related to a thing in the database. Another column (attributeType) contains a number indicating what the attribute is e.g. 1 might be first name, 2 last name and 3 account number etc. There is another column (enddate) indicating if the record is current by having a date there. Usually it will be set to the year 9999 when current and some date in the past otherwise. All data describing the same thing has a unique value too in another column (entity) so that each record with the same number in the entity column will be describing the one person. E.g.

entity  attribute  attributetype  enddate
------  ---------  -------------  --------
1       ben        1              9999-1-1
1       alt        2              9999-1-1
1       12345      3              9999-1-1
2       sam        1              9999-1-1
2       smith      2              9999-1-1
2       98765      3              1981-1-1

I want to select a person from the above table with a specific 1st and last name where the name will be current but not output the account number if it is not. Assuming the table is called tblAccount I do the following for the name part:

select ta1.attribute '1st Name', ta2.attribute 'last name'
from tblAccount ta1
inner join tblAccount ta2 on ta1.entity = ta2.entity
where ta1.attribute = 'sam' and ta2.attribute = 'smith'
      and ta1.attributetype = 1 and ta2. attributetype = 2
      and ta1.enddate > getdate() and ta2.enddate > getdate()

and it outputs the first and last names as expected, but when I want to include the account number column I get nothing output:

select ta1.attribute '1st Name', ta2.attribute 'last name', ta3.attribute 'account#'
from tblAccount ta1
inner join tblAccount ta2 on ta1.entity = ta2.entity
left join tblAccount ta3 on ta1.entity = ta3.entity
where ta1.attribute = 'sam' and ta2.attribute = 'smith'
      and ta1.attributetype = 1 and ta2. attributetype = 2
      and ta1.enddate > getdate() and ta2.enddate > getdate()
      and ta3.attributetype = 3 and ta3.enddate > getdate()

What I would like to see is the first and last names output with nothing in the account# column in the above case where it is not current. What am I doing wrong and how can I correct this query?

Answer

JAQFrost picture JAQFrost · Mar 7, 2013

You have to move the date comparison to the join condition:

select ta1.attribute '1st Name'
    , ta2.attribute 'last name'
    , ta3.attribute 'account#'
from tblAccount ta1
inner join tblAccount ta2 
    on ta1.entity = ta2.entity
     and ta1.attributetype = 1 and ta2. attributetype = 2
     and ta1.enddate > getdate() and ta2.enddate > getdate()
left join tblAccount ta3 on ta1.entity = ta3.entity
      and ta3.attributetype = 3 and ta3.enddate > getdate()
where ta1.attribute = 'sam' and ta2.attribute = 'smith'

When it's in the where clause it's comparing getdate() to NULL if there is no account, which returns NULL. So no record.

EDIT:

In response to the valid concern about multiple valid records, and to make the code a little more maintenance freindly:

DECLARE @FNAME VARCHAR(50) = 'sam'
    , @LNAME VARCHAR(50) = 'smith'
    , @now DATETIME2(7) = GETDATE();

SELECT 
    name.[1st Name]
    , name.[last name]
    , name.entity
    , 
        (
            select 
                top 1 
                ta3.attribute
            FROM tblAccount ta3 
            WHERE 
                ta3.entity = name.entity
                and 
                ta3.attributetype = 3 
                and 
                ta3.enddate > @now
            ORDER BY 
                ta3.enddate 
        )
FROM 
    (        
        select 
            ta1.attribute '1st Name'
            , ta2.attribute 'last name'
            , ta.entity
            , ROW_NUMBER()
                OVER(
                    PARTITION BY 
                        ta1.entity
                    ORDER BY 
                        ta1.enddate
                    ) r
        from 
            tblAccount ta1
        inner join tblAccount ta2 
            on 
            ta1.entity = ta2.entity
            and 
            ta2. attributetype = 2
            and 
            ta2.enddate > @now
            and 
            ta2.attribute = @LNAME
        where 
            ta1.attributetype = 1 
            and 
            ta1.attribute = @fname 
            and 
            ta1.enddate > @now
    ) name
WHERE    
    NAME.r = 1

;

This code works around the implied assumptions of one entity per first/last name and exactly one enddate after the execution time. The variables are a little more stored proc friendly, and allow you to change the "as of" date. And if you're stuck with EAV, you will likely want stored procs. I'm taking the first record ending after the date in question, on the assumption that any later record(s) should only be valid after that one expires. Maybe it's overkill, since it's beyond the scope of the OP's question but it's a valid point.

I say "stuck with EAV". While EAV isn't always bad; neither is shooting someone in the back. In either case, you'd better have solid justification if you expect to get it past a jury. In a NoSQL storage pattern it's fine, but EAV is usually a poor implementation pattern for the RDBMS paradigm.

Though from the OP's later comment, it looks like he's hit on one of the better reasons.