Understanding OLAP – Tomislav Piasevoli

The reason I’m discussing this often mentioned topic is the following. After many hours spent on the analysis how do dynamic sets behave in SSAS front-ends, I’ve been asked a perfectly reasonable question: Why do we have dynamic set and when should it be used? I believe this article explains the story behind dynamic sets and their purpose, as well as when and how they should be used. It also warns the reader that the problem is not about the implementation of dynamic sets in SSAS, but about something else. What? You’ll see when you read it through.

INTRODUCING DYNAMIC SETS

SSAS 2005 and 2008 feature MDX script as a place to provide calculations. The reason is obvious – concentration of calculations on a server results in better query performance (precomputations, cache, etc.) and easier maintenance.

SSAS 2005 MDX script featured cube-based calculated members/sets support. While cube-based calculated members were equivalent in behavior to their query-based counterparts (except for some newly available internal optimization possibilities), cube-bases calculated sets were not. They were useful, but many times that was not enough. They didn’t behave exactly the same way query-based sets did and, for many, this was a big issue, because they couldn’t provide the same logic in cube as in query. In order to comply to requests, SSAS team introduced new kind of sets in its next version of SSAS – dynamic sets.

Dynamic sets, unlike static sets, are fully equivalent to calculated sets defined in a query. Which means BI developers were finally able to achieve the same functionality in MDX script as once in query.

It’s important to notice that sets in SSAS 2005 have not been converted to dynamic, once implemented in SSAS 2008. Today, both versions of sets exist, both are useful. One variant was named STATIC, the other DYNAMIC. And it’s solely up to a BI developer to prefix them appropriately in their definition. Default are static sets (for historic reasons, 2005->2008 compatibility), which means prefix can be ommited.

DESIGN AND USAGE

The rule for usage was simple. If a BI developer needs a hardcoded collection of members, which doesn’t change no matter what in query/pivot, he/she should use static sets. In all other scenarios, a dynamic set can be used. So, how come there are problems here and there using them?

We said that query-based sets should have their cube-based equivalent. And they do, at least in SSAS 2008. When we implement cube-based sets, we enjoy the comfort of being able to refer to them in any query. Equally so, end-users are able to use them in a pivot table (SSAS front-ends) which is an extra plus. That is unless we hide them, which rarely is the case. Part of our intention is to provide those sets directly to end-users, not just to be able to refer to them in our queries.

Dynamic sets are context-sensitive. Not row, not column, just slicer (and subselect) sensitive. Which means that end-users can experiment by slicing them in a pivot table and see how they change. A typical example is “Top 10 something” set. When defined as dynamic in a cube, it should cover a whole range of scenarios for end-users. Whether sliced by additional attributes or different dimensions, that set should adjust accordingly. A dream come through for any BI developer – a cube-based solution covering many reports. Because the best solutions are those covered by design. Just like a pivot table on top of an OLAP cube, a solution-by-design, covering endless T-SQL queries in a relational world. However, not everything wen’t smoothly, at least not on the other side …

SUPPORTING DYNAMIC SETS

Queries are written by BI developers, not end-users. Which means they and their sets are under tight control, hidden and preferably tested enough. Pivot-table in front-ends are not. In contrast, they are fully exposed to end-users and therefore often limited, not just regarding cube-based sets, but other features as well. Why? It’s simple, the more you restrict your front-end, the less worries you have later, knowing that potential end-user actions and their combinations are beyond your imagination. One of the common example is hiding MDX – you play it safe with your pivot and features you support in it.

Each front-end tries to keep things under control by balancing between end-users requests and features it would rather skip than support. In a multidimensional world possibilities are endless and therefore – once you had decided to support a SSAS feature, you’ve opened a Pandora’s box whether you know it or not, because everything’s related and all features should fit together just fine.

Every front-end has these dilemmas – do we support this feature, if so in what scope, in what situations, what do we support as additional actions to it, what combinations of existing features are supported with it, etc. (our CubePlayer is no exception there). It’s hard to implement everything properly, to foreseen things. I’m not saying impossible, but, let say experience plays a big role here.

Things went wrong when little by little it became obvious that the favorite front-end of many (also a relatively young one, so we shouldn’t be too harsh on it) had a problem with dynamic sets. Cause of the problem? Intense use of subselect.

When an end-user placed a set on rows (or columns), a peculiar syntax was formed and that same set also ended in subselect.

By now you’ve must have guessed – we’re talking about Excel 2007.

HOUSTON, WE’VE HAD A PROBLEM

Each front-end generates one or more MDX statements in the backgroud for every end-user action in its pivot (almost every action). Regardless of areas and their names in the front-end’s pivot, which is up to a particular software vendor, there are rows, columns, slicer and subselect areas as 4 basic parts of every MDX query (some of them can be skipped, though).

MDX query is the thing that is being sent to SSAS server, so it up to the front-end to properly convert what the end user provided in pivot into a correct MDX. The problem is, there can be several solutions to it. It’s all about how to pick the best solution, the one that will work not just in this case, but also with everything we support in our front-end (which brings us back to front-end dilemma). Using the subselect too often as an approach was not a good decision, because dynamic sets, if nothing else, were obviously negleckted. Or not tested properly. The artificial Adventure Works database, a reference for many things, may be misleading (see this SSAS forum thread for more info).

Two of those areas, subselect and slicer, are of importance here. They are similar in nature but also quite different. In order not to loose focus here, we will only state what’s relevant for dynamic sets – the subselect is evaluated before the slicer!

SSAS server can be simplified like this. Internally, first it ask what do we have in subselect and then evaluates it. It is important to state here that this evaluation is performed in the context of cube dimensions default members (or another subselect inside it, if they are cascaded, but let’s keep things simple). Then the slicer gets evaluated. Then the axes. Then cells, either by block-mode, or cell-by-cell. Finally, NON EMPTY and HAVING operations are performed on axes.

Our set is supposed to be on an axis, where the end-user has placed it. That way it would be evaluated after the slicer, adjusting accordingly. Excel 2007 on the other hand placed it in the subselect. It can’t be evaluated against the slicer because the slicer doesn’t exist yet (it’s like knowing the future). But what does exist?

Default members! Dynamic set placed in subselect is evaluated against cube hierarchies default members. And what is static set evaluated against, in MDX script? Also default members. Which makes dynamic and static sets the same in Excel 2007, both static in nature.

CONCLUSION

Dynamic sets are implemented just fine in SSAS 2008. That is, they are equal to query-based sets, adjusting to slicer promptly.

It is up to the front-end you’re going to use on them and its support towards them. That’s where the problems might occur, either because of lack of support, or worse, because of improper support.

Finally, it’s up to us, consultants, to educate ourselves properly and then communicate with potential users, advising them what is supported and what is not in solutions we intend to implement. The final decision, of course, is client’s.