BIDA0058 – Multi Level Data Model Navigation With Power BI


Please watch the video in 1080p because it is recorded in High Resolution.
Please note the script of the presentation is below so you can read along with the demonstration if you like.


Hello and welcome to our latest blog post.

I am Sofie, your BIDA AI Assistant.

I will be reading this blog post for you today on behalf of Mihai.

I am really pleased to be able to bring you some news that I think will really interest you.

If you are a Power B I user, you will be interested in this blog post.

Please allow me to start with the problem definition that this blog post addresses.

When using Power B I you want your dashboard to be as responsive as possible.

You also want to be able to drill down to the detailed records in the data warehouse.

You want to be able to investigate all the way down to individual transactions.

To get this type of performance it is best to have highly summarized data, lightly summarized data, and detailed data.

In this way, when you are looking at the highly summarized data, Power B I is accessing the minimum number of rows to present that highly summarized data.

As you drill down you want to read the minimum number of rows for each level of data you want to query.

You want all this data accessible from the Power B I report without having to do anything to navigate the multi level data yourself.

You want it all to be automatic for you as the user of Power B I.

That is the problem statement for this blog post.

Present a Power B I model that provides the best possible performance for the user of the dashboard.

You can click on the diagram below to enlarge it to full screen while I describe what you are seeing on the diagram.

I will also include zoomed in portions of this overall picture in the blog post.

Demonstration

At the top left of the diagram you can see there is a hierarchy for time.

You can see there is a time total table.

You can see there is a year table.

You can see there is a quarter table.

You can see there is a month table.

And you can see there is a sale date table.

You can see the sale date table is joined to the sale transaction table.

You can see from the solid blue line that the sale transaction table is only available in the database; that is, it is accessed in DirectQuery mode.

Across the top of the picture you can see the hierarchy for products.

You can see there is a product total table.

You can see there is a product category table.

You can see there is a product group table.

And you can see there is a product table which is linked to the sale transaction table.

You can see from the broken blue lines that all of these dimension tables are available in dual storage mode.

You can also see that in this demonstration some dimension tables have data for the higher levels in them.

This is because this test was early in our experiments.

We have subsequently found that each dimension attribute should be presented from only one table.

And that table should be the level that the attribute is valid for.
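
As a hedged illustration of that rule, here is a minimal SQL sketch of level specific dimension tables for the product hierarchy. The table and column names are hypothetical, not the exact names used in the BIDA model; the sale date hierarchy would follow the same pattern.

```sql
-- Hypothetical level specific dimension tables for the product hierarchy.
-- Each attribute appears only in the table for the level it is valid for.
CREATE TABLE dim_product_category (
    product_category_key  INT          NOT NULL PRIMARY KEY,
    product_category_name VARCHAR(100) NOT NULL
);

CREATE TABLE dim_product_group (
    product_group_key     INT          NOT NULL PRIMARY KEY,
    product_category_key  INT          NOT NULL
        REFERENCES dim_product_category (product_category_key),
    product_group_name    VARCHAR(100) NOT NULL
);

CREATE TABLE dim_product (
    product_key           INT          NOT NULL PRIMARY KEY,
    product_group_key     INT          NOT NULL
        REFERENCES dim_product_group (product_group_key),
    product_name          VARCHAR(100) NOT NULL
);
```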

Next, I would like to draw your attention to the sale transaction table.

You can see that it has all the normal fields on it like sale extended amount.

In the sales transaction table, you can see there is a detailed key for the sale date, and another detailed key for the product.

Most other keys have been dropped just to make the diagram easier to read.

So, you can clearly understand that if you ask a question to list sales at the day and product level, you will cause Power B I to read the sales transaction table.
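
To make that grain explicit, here is a hedged sketch of what the detail fact table could look like in SQL Server; again, the names are hypothetical.

```sql
-- Hypothetical detail level fact table.  It stays in the database and is read
-- in DirectQuery mode only when a query needs day and product level rows.
CREATE TABLE sale_transaction (
    sale_transaction_key BIGINT        NOT NULL PRIMARY KEY,
    sale_date_key        INT           NOT NULL,  -- detail key for the sale date
    product_key          INT           NOT NULL,  -- detail key for the product
    sale_extended_amount DECIMAL(19,4) NOT NULL
);
```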

Now I would like to bring your attention to the sales transaction summary one table.

You can see that it has detailed keys for month and product group.

You can also see that it is joined to the month dimension and the product group dimension as you would expect.

So, you can clearly understand that if you ask a question at the month and product group level, you will cause Power B I to read this table.

This table is in the Power B I service, so reading it will be much faster than a round trip to the database to read the sales transaction table.
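
As a hedged sketch of how the E T L might build this summary, with hypothetical table and column names (the first day of month and audit time stamp one columns are explained later in this post):

```sql
-- Hypothetical build of summary one at the month and product group grain.
-- first_day_of_month and audit_time_stamp_one support the incremental
-- refresh described later in the post.
INSERT INTO sale_transaction_summary_one
      (month_key, product_group_key, first_day_of_month,
       sale_extended_amount, audit_time_stamp_one)
SELECT d.month_key,
       p.product_group_key,
       d.first_day_of_month,
       SUM(t.sale_extended_amount),
       SYSDATETIME()
FROM   sale_transaction AS t
JOIN   dim_sale_date    AS d ON d.sale_date_key = t.sale_date_key
JOIN   dim_product      AS p ON p.product_key   = t.product_key
GROUP  BY d.month_key, p.product_group_key, d.first_day_of_month;
```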

However, you might want to open your report at the quarter level and product category level.

And then you might want to drill down to the months where the quarter sales are down.

You can summarize the monthly product group to the quarter product category level.

And if you have relatively small volumes, that summarization might be only slightly slower than reading data at that summary level.

But, if you have relatively large volumes, there may be a significant difference in response time if this summarization has to be done at report open time.

So, your report developer might create a summary for you, at the quarter product category level.

You can see that the sales transaction summary six table is at the quarter and product category level.

You can see from the lines on the picture, that it is joined to the quarter table, and the product category table.

Clearly, in this diagram you can see that if you open the report at quarter and product category level, Power B I will only need to read the sales transaction summary six table for the facts.

Then, when you drill down to monthly product group level, Power BI will drill down to the sales transaction summary one table.

And finally, when you drill down to daily sales at product level, Power B I will go to the large fact table in the database, and get the sales transactions for you.
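
To picture that navigation, here is a hedged sketch of the work done at each level, written as SQL for clarity even though the two summary tables are actually read from memory in the Power B I service. The table and column names are hypothetical.

```sql
-- Report opens at quarter and product category level:
-- answered entirely from summary six.
SELECT quarter_key, product_category_key,
       SUM(sale_extended_amount) AS sale_extended_amount
FROM   sale_transaction_summary_six
GROUP  BY quarter_key, product_category_key;

-- Drill down to month and product group level:
-- answered entirely from summary one.
SELECT month_key, product_group_key,
       SUM(sale_extended_amount) AS sale_extended_amount
FROM   sale_transaction_summary_one
GROUP  BY month_key, product_group_key;

-- Drill down to day and product level:
-- only now does Power B I go to the large fact table in the database.
SELECT sale_date_key, product_key,
       SUM(sale_extended_amount) AS sale_extended_amount
FROM   sale_transaction
GROUP  BY sale_date_key, product_key;
```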

This is called multi level aggregate navigation.

In relational databases, this technique was popularized by Ralph Kimball.

Now I would like to draw your attention to the Manage aggregations window.

You have to tell Power B I how the fields on the summary level tables relate to the fields on the detailed transactions table.

For simplicity we have kept the same names across all levels of summaries.

You can see that the aggregation table being managed is the sales transaction summary six table.

You can see that the sales extended amount on this table is related to the sale extended amount on the sales transaction table.

This is how drill down is implemented.

If you have sales extended amount on your report, and you drill down to the detailed sales transactions, Power B I knows to use the same column name.

It is probably a good idea to name the summary level metrics fields the same names as the detailed transaction metrics fields, but it is not mandatory.
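
One way to picture what the Sum mapping asserts is the consistency check below. It is only a hedged sketch with hypothetical names, but an E T L team could use something like it to confirm a summary agrees with the detail.

```sql
-- Rows in summary one that do not match the sum of their detail rows.
-- An empty result means the summary is consistent with the detail table,
-- which is exactly what the Sum mapping in Manage aggregations relies on.
SELECT s.month_key, s.product_group_key, s.sale_extended_amount
FROM   sale_transaction_summary_one AS s
EXCEPT
SELECT d.month_key, p.product_group_key, SUM(t.sale_extended_amount)
FROM   sale_transaction AS t
JOIN   dim_sale_date    AS d ON d.sale_date_key = t.sale_date_key
JOIN   dim_product      AS p ON p.product_key   = t.product_key
GROUP  BY d.month_key, p.product_group_key;
```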

You can also see the field precedence.

The higher this number, the higher the level of summary of the data in the table, and Power B I will use the higher precedence table when it can.

Of course, you can easily understand that there could be a yearly sales table at product total level.

You can easily understand that you can have many levels of summary available to you in such a model, so that every query, at every level, reads the minimum number of rows to perform the query, and to present the data.

The report designer then has to choose how many levels he or she will put into the report, depending on the usage that is expected of the report.

Now I would like to explain how the incremental updates work.

Obviously, you do not want to reload two to five years of summary data every time you refresh your report.

What you want to do is load only those months or quarters where some change has happened for some limited period of history.

For example, your I T people might define a rule that transactions more than six months old may not be changed.

If this rule is implemented, then you only need to check the summary data for the last six months.

But if no change is found in a quarter or a month, then you don’t want that data refreshed to save refresh time, and processing power, for that report.

In the sales transaction summary table, I would like to bring your attention to the field, first day of month.

This field is used to determine which monthly partition this data will be loaded into, in the Power B I service.

The second field I would like to draw your attention to is the audit time stamp one.

If a transaction from some months ago is updated with a new sales extended amount, then the E T L system will update the required summary records in both summary one and summary six.

As part of that update, the E T L must also update the audit time stamp one field for the summary rows affected.
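
As a hedged sketch of that E T L step, again with hypothetical names, the update below restates the affected summary one rows from the corrected transactions and stamps only the rows whose values actually changed; summary six would be handled the same way.

```sql
-- Restate summary one for the months that are still allowed to change,
-- and move audit_time_stamp_one forward only where a value really changed.
UPDATE s
SET    s.sale_extended_amount = x.sale_extended_amount,
       s.audit_time_stamp_one = SYSDATETIME()
FROM   sale_transaction_summary_one AS s
JOIN  (SELECT d.month_key, p.product_group_key,
              SUM(t.sale_extended_amount) AS sale_extended_amount
       FROM   sale_transaction AS t
       JOIN   dim_sale_date AS d ON d.sale_date_key = t.sale_date_key
       JOIN   dim_product   AS p ON p.product_key   = t.product_key
       WHERE  d.first_day_of_month >= DATEADD(MONTH, -6, CAST(GETDATE() AS DATE))
       GROUP  BY d.month_key, p.product_group_key) AS x
       ON  x.month_key         = s.month_key
       AND x.product_group_key = s.product_group_key
WHERE  s.sale_extended_amount <> x.sale_extended_amount;
```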

When the full history for these tables is loaded for the first time, Power B I will look for the maximum audit time stamp one value in each partition, and it will save that value in the Power B I service.

When you then ask for a refresh, Power B I will scan the partitions for the maximum audit time stamp, and it will compare the new maximum time stamp to the prior maximum time stamp.

If Power BI finds a later maximum time stamp, it will reload the whole partition for that summary table.

This is automatic, but you do have to tell Power B I how many periods it should go back to check for changes, from the current system date.

This selection of data is done via the Power B I reserved parameters, Range Start and Range End.
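
When these parameters fold back to SQL Server, each partition is selected with a date range filter over the first day of month column, roughly like the hedged sketch below; the actual SQL is generated by Power B I and the names here are hypothetical.

```sql
-- Approximate shape of the per partition query once the Range Start and
-- Range End parameters have been folded back to the database.
DECLARE @RangeStart DATE = '2024-01-01';  -- example values only
DECLARE @RangeEnd   DATE = '2024-02-01';

SELECT month_key, product_group_key, first_day_of_month,
       sale_extended_amount, audit_time_stamp_one
FROM   sale_transaction_summary_one
WHERE  first_day_of_month >= @RangeStart
AND    first_day_of_month <  @RangeEnd;
```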

By implementing these two fields in the E T L you get the best of all worlds.

You get the ability to have multi level summary tables.

You get the ability to navigate up and down the table hierarchy.

You get the ability to update only those partitions of data in the Power B I service, where at least one row has changed.

In Summary

Just for completeness, so that you understand exactly what we are doing:

For BIDA data models, all of these tables are actually views in our data warehouses.

Because it is very handy to have small amounts of test data in real tables when creating Power B I reports, we create such test databases.

These databases contain all the referential integrity constraints as well.

Then, when the report is ready to be deployed to production, we change the server name and database name to point to the production database, and we deploy the report to the Power B I service.
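
A hedged sketch of that pattern, with hypothetical names: in the warehouse the model object is a view over the real warehouse table, while in the small test database the same name is a real table carrying referential integrity.

```sql
-- In the production warehouse the model object is a view.
CREATE VIEW dbo.sale_transaction_summary_one
AS
SELECT month_key, product_group_key, first_day_of_month,
       sale_extended_amount, audit_time_stamp_one
FROM   dw.sale_transaction_summary_one_base;
GO

-- In the small test database the same name is a real table with
-- referential integrity and a little hand built test data.
CREATE TABLE dbo.sale_transaction_summary_one (
    month_key            INT           NOT NULL
        REFERENCES dbo.dim_month (month_key),
    product_group_key    INT           NOT NULL
        REFERENCES dbo.dim_product_group (product_group_key),
    first_day_of_month   DATE          NOT NULL,
    sale_extended_amount DECIMAL(19,4) NOT NULL,
    audit_time_stamp_one DATETIME2(3)  NOT NULL
);
```

Because the object names are identical in both databases, only the server name and database name in the report's data source need to change at deployment time.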

We have provided a SQL Server Profiler trace file of the initial load of this data model.

You can download it by clicking on the button below.

SQL Profiler Trace

Anyone who is familiar with SQL and Power B I can read the trace file.

They will easily understand exactly what is happening in this model.

We would like to make one last comment on our research in this area.

We are specialists in MicroStrategy.

We have been using MicroStrategy since the late nineteen nineties.

Everyone who knows MicroStrategy knows they have a slightly different way to build their data models.

MicroStrategy puts every level of every dimension in its own lookup table.

MicroStrategy puts every level of aggregation of every fact into its own fact table.

Our multi level data models provide the same functionality as MicroStrategy, just with a lot fewer tables.

It turns out that in doing this research, we have proven that Power B I is able to read MicroStrategy data models directly.

The only two changes needed to a standard MicroStrategy model are the two fields we have had to add, which are described above.

Because of this, we presume Microsoft will now tell all the MicroStrategy accounts that Power B I can read their data warehouses.

And with that,

I would like to say thank you very much for watching our video today.

I really appreciate your time and attention.

We hope you found this blog post interesting and informative.

We would love to hear your comments so please feel free to use the comments section below.

I wish you a great day.

Thank you.

Sofie.

Your BIDA AI Assistant.

We will be emailing our subscribers useful and valuable information regarding business intelligence.
Please subscribe here to be on our general emailing list.