Should FILTER be used inside or outside of SUMMARIZE?

dax
whytheq picture whytheq · Jan 21, 2018 · Viewed 19.6k times · Source

I have these two queries:

EVALUATE
FILTER (
    SUMMARIZE (
        'Sales',
        Products[ProductName],
        'Calendar'[CalendarYear],
        "Total Sales Amount", SUM ( Sales[SalesAmount] ),
        "Total Cost", SUM ( 'Sales'[TotalProductCost] )
    ),
    Products[ProductName] = "AWC Logo Cap"
)
ORDER BY
    Products[ProductName],
    'Calendar'[CalendarYear] ASC

and this:

EVALUATE
SUMMARIZE (
    FILTER ( 'Sales', RELATED ( Products[ProductName] ) = "AWC Logo Cap" ),
    Products[ProductName],
    'Calendar'[CalendarYear],
    "Total Sales Amount", SUM ( Sales[SalesAmount] ),
    "Total Cost", SUM ( 'Sales'[TotalProductCost] )
)
ORDER BY
    Products[ProductName],
    'Calendar'[CalendarYear] ASC

Both return the following:

enter image description here

The only difference between the two queries is the positioning of the FILTER function - which is better practice and why?


note

So looking at the two sqlbi articles referenced by Alex we can do either of the following to potentially make things more performant but I'm still unsure if the FILTER function should happen inside or outside the other syntax:

EVALUATE
FILTER (
    ADDCOLUMNS (
        SUMMARIZE ( 'Sales', Products[ProductName], 'Calendar'[CalendarYear] ),
        "Total Sales Amount", CALCULATE ( SUM ( Sales[SalesAmount] ) ),
        "Total Cost", CALCULATE ( SUM ( 'Sales'[TotalProductCost] ) )
    ),
    Products[ProductName] = "AWC Logo Cap"
)
ORDER BY
    Products[ProductName],
    'Calendar'[CalendarYear] ASC

And using the 'SUMMARIZECOLUMNS' function:

EVALUATE
FILTER (
    SUMMARIZECOLUMNS (
        Products[ProductName],
        'Calendar'[CalendarYear],
        "Total Sales Amount", SUM ( Sales[SalesAmount] ),
        "Total Cost", SUM ( 'Sales'[TotalProductCost] )
    ),
    Products[ProductName] = "AWC Logo Cap"
)
ORDER BY
    Products[ProductName],
    'Calendar'[CalendarYear] ASC

note2

Looks like SUMMARIZECOLUMNS has a built in FILTER parameter so I'd guess that this is the best way to go to guard against performance issues:

EVALUATE
SUMMARIZECOLUMNS (
    Products[ProductName],
    'Calendar'[CalendarYear],
    FILTER ( 'Products', Products[ProductName] = "AWC Logo Cap" ),
    "Total Sales Amount", SUM ( Sales[SalesAmount] ),
    "Total Cost", SUM ( 'Sales'[TotalProductCost] )
)
ORDER BY
    Products[ProductName],
    'Calendar'[CalendarYear] ASC

Answer

Alexis Olson picture Alexis Olson · Jan 22, 2018

Of the two options you gave, I suspect the latter may be more efficient computationally. However, neither is likely "best practice".

According to Best Practices Using SUMMARIZE and ADDCOLUMNS on sqlbi.com,

you should always favor the ADDCOLUMNS version. The rule of thumb is that you should never add extended columns by using SUMMARIZE, unless it is required due to at least one of the following conditions:

  • You want to use ROLLUP over one or more grouping columns in order to obtain subtotals

  • You are using non-trivial table expressions in the extended column, as you will see in the “Filter Context in SUMMARIZE and ADDCOLUMNS” section later in this article

Please also check their article on SUMMARIZECOLUMNS, which recommends the newer function in most use cases.