Group by Multiple Columns and Count

Karthikeyan K picture Karthikeyan K · Jul 9, 2010 · Viewed 16.4k times · Source

I have Table1 and Table2 in the form of IEnumerable<DataRow>. Both the tables have columns Column1 and Column2.

I would like to do a left outer join on Column1 and would like to get a count of the rows present in Table2 and load the records into a DataTable.

I tried the following query

var query = from p in Table1
            join q in Table2 on p.Field<string>("Column1") equals q.Field<string>("Column1") into pq
            from xyz in pq.DefaultIfEmpty()
            group xyz by new { Col1 = p.Field<string>("Column1"), Col2 = p.Field<string>("Column2") } into g
            select dtFinalData.LoadDataRow(new object[]
            {
                g.Key.Col1,
                g.Key.Col2,                               
                g.Count
            }, false);

Since the 'g' represents the grouped data the g.count returns 1 for rows which does not have entries in Table 2. I would like to return '0' for those rows.

Input :

Table 1

Col1Val1       Col2Val1

Col1Val2       Col2Val2

Table 2

Col1Val1       Col2Val1

Col1Val1       Col2Val1

Current Output :

Col1Val1        Col2Val1    2

Col2Val2        Col2Val2    1

Expected Results :

Col1Val1        Col2Val1    2

Col2Val2        Col2Val2    0

I have looked at LINQ - Left Join, Group By, and Count but I could not apply the same into my query ...

Can you help me fix this query ?

Answer

Amy B picture Amy B · Jul 9, 2010

let it be so:

from p in Table1
let p1 = p.Field<string>("Column1")
let p2 = p.Field<string>("Column2") 
let qs = 
  from q in Table2
  where p1 == q.Field<string>("Column1")
  select q
let qCount = qs.Count()
select dtFinalData.LoadDataRow(new object[]
{
  p1,
  p2,
  qCount
}, false);

Since I didn't join, I don't have to group. Each result row corresponds to a row in Table1.


Here's a GroupJoin solution:

from p in Table1 
let pkey = new { c1 = p.Field<string>("Column1"), c2 = p.Field<string>("Column2") }
join q in Table2 on pkey equals
  new { c1 = q.Field<string>("Column1"), c2 = q.Field<string>("Column2") }
  into qs
select dtFinalData.LoadDataRow(new object[] 
{ 
  pkey.c1, 
  pkey.c2, 
  qs.Count() 
}, false); 

And here's a Join and Group solution.

from p in Table1 
let pkey = new { c1 = p.Field<string>("Column1"), c2 = p.Field<string>("Column2") }
join q in Table2 on pkey equals
  new { c1 = q.Field<string>("Column1"), c2 = q.Field<string>("Column2") }
  into right
from q in right.DefaultIfEmpty()
group q by pkey into g
select dtFinalData.LoadDataRow(new object[] 
{ 
  g.Key.c1, 
  g.Key.c2, 
  g.Count(q => q != null) 
}, false);