How to JOIN summarized data from two queries into new table in DAX Power BI

Serdia picture Serdia · Jul 20, 2018 · Viewed 12.2k times · Source

I have 2 queries:

Premium:

enter image description here

and Losses:

enter image description here

How can I simply summarize data from Premium query and LEFT JOIN it to summarized data in Losses query using DAX?

In SQL it would be like that:

declare @PremiumTable table (PolicyNumber varchar(50), Premium money)
insert into @PremiumTable values 
                                ('Pol1', 100),
                                ('Pol1', 50),
                                ('Pol2', 300),
                                ('Pol3', 500),
                                ('Pol3', 200),
                                ('Pol4',400)

declare @LossesTable table (PolicyNumber varchar(50), Losses money)
insert into @LossesTable values ('Pol1',115),
                                ('Pol1',25),
                                ('Pol2',0),
                                ('Pol3',110),
                                ('Pol3',75)


select  p.PolicyNumber, 
        sum(p.Premium) as Premium,
        sum(l.Losses)as Losses  
from @PremiumTable p 
        LEFT JOIN @LossesTable l on p.PolicyNumber = l.PolicyNumber
group by p.PolicyNumber

Result:

enter image description here

I tried using NATURALLEFTOUTERJOIN but it gives me an error:

*An incompatible join column, (''[PolicyNumber]) was detected. 'NATURALLEFTOUTERJOIN' doesn't support joins by using columns with different data types or lineage.*

MyTable = 
    VAR Premium = 
            SELECTCOLUMNS(
                fact_Premium,
                "PolicyNumber",fact_Premium[PolicyNumber],
                "Premium", fact_Premium[Premium]
                )
    VAR Losses = 
                SELECTCOLUMNS(
                    fact_Losses,
                    "PolicyNumber", fact_Losses[PolicyNumber],
                    "Losses", fact_Losses[PaymentAmount]
                             )
    VAR Result = NATURALLEFTOUTERJOIN(Premium,Losses)
    RETURN Result

Answer

Foxan Ng picture Foxan Ng · Jul 20, 2018

There are a few interdependent "bugs" or limitations around the use of variables (VAR) and NATURALLEFTOUTERJOIN which makes this a weird case to debug.

Some notable limitations are:

VAR:

Columns in table variables cannot be referenced via TableName[ColumnName] syntax.

NATURALLEFTOUTERJOIN:

Either:

The relationship between both tables has to be defined before the join is applied AND the names of the columns that define the relationship need to be different.

Or:

In order to join two columns with the same name and no relationships, it is necessary that these columns to have a data lineage.

(I'm a bit confused because the link mentioned do not have a data lineage; while official documentation said only columns from the same source table (have the same lineage) are joined on.)


Come back to this case.

  1. SUMMARIZE should be used instead of SELECTCOLUMNS to obtain summary tables for Premium and Losses, i.e.:

    Premium = 
    SUMMARIZE(
        fact_Premium,
        fact_Premium[PolicyNumber],
        "Premium", SUM(fact_Premium[Premium])
    )
    
    Losses = 
    SUMMARIZE(
        fact_Losses,
        fact_Losses[PolicyNumber],
        "Losses", SUM(fact_Losses[Losses])
    )
    
  2. When we apply NATURALLEFTOUTERJOIN to the above two tables, it'll return error No common join columns detected because of they have no relationship established.

enter image description here

  1. To resolve this, we can make use of TREATAS as suggested in this blog post. But to use TREATAS, we have to reference the column names in Premium and Losses table, so we can't use VAR to declare them, but have to actually instantiate them.

To conclude, the solution would be:

  1. Create calculate tables for Premium and Losses as mentioned above.

premium

losses

  1. Use TREATAS to mimic a data lineage and join Premium table with Losses_TreatAs instead.

    MyTable = 
    VAR Losses_TreatAs = TREATAS(Losses, Premium[PolicyNumber], Losses[Losses])
    RETURN NATURALLEFTOUTERJOIN(Premium, Losses_TreatAs)
    

Results:

results