I have 2 queries:
Premium:
and Losses:
How can I simply summarize data from Premium query and LEFT JOIN it to summarized data in Losses query using DAX?
In SQL it would be like that:
declare @PremiumTable table (PolicyNumber varchar(50), Premium money)
insert into @PremiumTable values
('Pol1', 100),
('Pol1', 50),
('Pol2', 300),
('Pol3', 500),
('Pol3', 200),
('Pol4',400)
declare @LossesTable table (PolicyNumber varchar(50), Losses money)
insert into @LossesTable values ('Pol1',115),
('Pol1',25),
('Pol2',0),
('Pol3',110),
('Pol3',75)
select p.PolicyNumber,
sum(p.Premium) as Premium,
sum(l.Losses)as Losses
from @PremiumTable p
LEFT JOIN @LossesTable l on p.PolicyNumber = l.PolicyNumber
group by p.PolicyNumber
Result:
I tried using NATURALLEFTOUTERJOIN
but it gives me an error:
*An incompatible join column, (''[PolicyNumber]) was detected. 'NATURALLEFTOUTERJOIN' doesn't support joins by using columns with different data types or lineage.*
MyTable =
VAR Premium =
SELECTCOLUMNS(
fact_Premium,
"PolicyNumber",fact_Premium[PolicyNumber],
"Premium", fact_Premium[Premium]
)
VAR Losses =
SELECTCOLUMNS(
fact_Losses,
"PolicyNumber", fact_Losses[PolicyNumber],
"Losses", fact_Losses[PaymentAmount]
)
VAR Result = NATURALLEFTOUTERJOIN(Premium,Losses)
RETURN Result
There are a few interdependent "bugs" or limitations around the use of variables (VAR
) and NATURALLEFTOUTERJOIN
which makes this a weird case to debug.
Some notable limitations are:
VAR
:
Columns in table variables cannot be referenced via TableName[ColumnName] syntax.
NATURALLEFTOUTERJOIN
:
Either:
The relationship between both tables has to be defined before the join is applied AND the names of the columns that define the relationship need to be different.
Or:
In order to join two columns with the same name and no relationships, it is necessary that these columns to have a data lineage.
(I'm a bit confused because the link mentioned do not have a data lineage
; while official documentation said only columns from the same source table (have the same lineage) are joined on.)
Come back to this case.
SUMMARIZE
should be used instead of SELECTCOLUMNS
to obtain summary tables for Premium
and Losses
, i.e.:
Premium =
SUMMARIZE(
fact_Premium,
fact_Premium[PolicyNumber],
"Premium", SUM(fact_Premium[Premium])
)
Losses =
SUMMARIZE(
fact_Losses,
fact_Losses[PolicyNumber],
"Losses", SUM(fact_Losses[Losses])
)
When we apply NATURALLEFTOUTERJOIN
to the above two tables, it'll return error No common join columns detected
because of they have no relationship established.
TREATAS
as suggested in this blog post. But to use TREATAS
, we have to reference the column names in Premium
and Losses
table, so we can't use VAR
to declare them, but have to actually instantiate them.To conclude, the solution would be:
Premium
and Losses
as mentioned above.Use TREATAS
to mimic a data lineage and join Premium
table with Losses_TreatAs
instead.
MyTable =
VAR Losses_TreatAs = TREATAS(Losses, Premium[PolicyNumber], Losses[Losses])
RETURN NATURALLEFTOUTERJOIN(Premium, Losses_TreatAs)
Results: