I've inherited a database that wasn't designed exactly optimally, and I need to manipulate some data. Let me give a more common analogy of the kind of thing I have to do:
Let's say we have a Student
table, a StudentClass
table keeping record of all the classes he attended, and a StudentTeacher
table that stores all the teachers who taught this student. Yes, I know it's a dumb design and it would make more sense to store the teacher on the Class table - but that's what we're working with.
I now want to clean up the data, and I want to find all the places where a student has a teacher but no classes, or a class but no teachers. SQL thus:
select *
from StudentClass sc
full outer join StudentTeacher st on st.StudentID = sc.StudentID
where st.id is null or sc.id is null
How do you do that in Linq?
I think I have the answer here, which is not as elegant as I'd hoped, but it should do the trick:
var studentIDs = StudentClasses.Select(sc => sc.StudentID)
.Union(StudentTeachers.Select(st => st.StudentID);
//.Distinct(); -- Distinct not necessary after Union
var q =
from id in studentIDs
join sc in StudentClasses on id equals sc.StudentID into jsc
from sc in jsc.DefaultIfEmpty()
join st in StudentTeachers on id equals st.StudentID into jst
from st in jst.DefaultIfEmpty()
where st == null ^ sc == null
select new { sc, st };
You could probably squeeze these two statements into one, but I think you'd sacrifice code clarity.