A Box of Chocolate: Flavors (part one)

MDX Add comments

Oct 252009

“Life is like a box of chocolate: You never know what you’re gonna get.”

It is not the exact memorable quote from the Forrest Gump film, rather simply a form of it which I’ve memorized. Moreover, being excited about the package with three blue letters on it I’ve just received, it’s also the one I find the most appropriate for the series of articles I intend to write.

What’s the topic this time? Randomness in MDX.

Having thought about it for a while, I’ve recognized three variants of it: random values, random members and random objects. Each a story by itself. Therefore, I’ve decided to cover them separately, starting with chocolate flavors representing, well, you’ve guessed it by now – random values.

INTRODUCTION

Imagine a box full of chocolates. Each chocolate being filled with some delicious ingredient. How many different ingredients are there in total? What do they taste like? Which shape would you try first? You feel tempted. The hand moves. A mixture of tastes overwhelms your senses as you reach for them one by one, experiencing the a short glimpse of sensation as their sweetness melts in your mouth. You repeat the process and the next thing you know, they’re gone. All of them.

Now, if that made you take a walk around a place you’re at, be sure to come back soon and see how to simulate the same in MDX query.

RANDOM VALUES

There will be times when you’ll need unpredictable results. Possibly inside a range of values, yet random in nature. The solution is to create a calculated measure that uses VBA Rnd() function. Yet it’s not that easy as it looks like. Why? Because that function will provide repeating values in each row, that is, unless prepared properly. And here’s how.

WITH
-- initial settings
MEMBER [Measures].[Min Value] AS 10
MEMBER [Measures].[Max Value] AS 99
MEMBER [Measures].[Value Range] AS
       [Measures].[Max Value] - [Measures].[Min Value]

-- A) non-working approach, same number for all rows MEMBER [Measures].[Fixed Random Value] AS
Int( [Measures].[Min Value] +
            [Measures].[Value Range] * Rnd( ) ),
       NON_EMPTY_BEHAVIOR = '[Measures].[Sales Amount]'
-- B) working approach
MEMBER [Measures].[Random Value 1] AS
Int( [Measures].[Min Value] +
            [Measures].[Value Range] *
            Rnd( Rank( [Product].[Product Model Lines]                       .CurrentMember,
                       [Product].[Product Model Lines]                       .CurrentMember.Level.MEMBERS ) )
          ),
       NON_EMPTY_BEHAVIOR = '[Measures].[Sales Amount]'
-- C) working approach
MEMBER [Random number per row] AS
       Rnd( Rank( [Product].[Product Model Lines]                  .CurrentMember,
                  [Product].[Product Model Lines]                  .CurrentMember.Level.MEMBERS ) )
MEMBER [Measures].[Random Value 2] AS
Int( [Measures].[Min Value] +
            [Measures].[Value Range] * [Random number per row] ),
       NON_EMPTY_BEHAVIOR = '[Measures].[Sales Amount]'
SELECT
    {
        [Measures].[Sales Amount],
        [Measures].[Fixed Random Value],
        [Measures].[Random Value 1],
        [Measures].[Random Value 2]
    }
    ON 0,
    NON EMPTY
    {
        [Product].[Product Model Lines].[Model].MEMBERS
    }
    ON 1
FROM
    [Adventure Works]

The first calculated measure (Fixed Random Value) will not generate random values, while the second two measures will (Random Value 1 and 2). The reason is that SSAS engine tends to determine what expressions can be considered deterministic and recognizes Fixed Random Value measure’s definition as one of them. In order to prevent that, we need to insert non-deterministic component as part of the expression, either by inserting it directly (Random Value 1) or by referring to a helper measure that contains it (Random Value 2).

Non-determinism is usually obtained by referring to current member or similar. Generally, that (unnecessary non-determinism) is something one should avoid if possible, because it slows down the query in most cases (causing additional loops and preventing block mode). However, here it was a requirement in order to achieve dynamic results. The Rank() function we used here is probably not the only way to achieve this.

Screenshot – Random values

Min and Max values are used in order to show how to make offset – the range doesn’t have to start from zero, as in this example.

Non_Empty_Behavior is used to eliminate rows that have no sales (notice NON EMPTY on rows). Without it, we would have rows with null value in sales regardless of NON EMPTY, because the random values wouldn’t be null. Hence we had to take care of it. Alternatively, that can be done using iif() inside calculations, or when you need to provide more complex condition (i.e. expression instead of a measure).

CONCLUSION

It is possible to have random values inside a calculated measure. The definition of that measure should include VBA Rnd() function and should also be non-deterministic in nature. In short, that means we inject a part of calculation that generates different values per each member/row on axis. In the above example that was achieved using Rank() function, because rank is different for each member.

The other thing we should take care of is providing a null value when it is required. For example, if some other measure in query is null, so shall our calculation return, to prevent additional rows from appearing when NON EMPTY is used on axes.

NEXT STEPS

That brings us to the end of part one. In the next part, we’ll cover samples of hierarchy, something which Mosha already covered here, and which is included in ASSP project on CodePlex. Nevertheless, I’ll try to provide additional examples of usability.

Finally, here’s a link for the same topic, yet in T-SQL, I managed to discover while performing search for related articles.

Tomislav Piasevoli – BI Blog

A Box of Chocolate: Flavors (part one)

Leave a Reply Cancel reply