Free Gyaan: Chapter 1 : Dimensional Modeling Primer

Basic Elements of a Data Warehouse

Operational Data Store (ODS)
In either scenario, the ODS can be either a third physical system sitting between
the operational systems and the data warehouse or a specially administered hot
partition of the data warehouse itself. Every organization obviously needs
operational systems. Likewise, every organization would benefit from a data
warehouse. The same cannot be said about a physically distinct ODS unless the
other two systems cannot answer your immediate operational questions.
Clearly, you shouldn’t allocate resources to construct a third physical system
unless your business needs cannot be supported by either the operational data-collection
system or the data warehouse. For these reasons, we believe that the
trend in data warehouse design is to deliver the ODS as a specially administered
portion of the conventional data warehouse.

Cube
If the presentation area is based on a relational database, then these dimensionally
modeled tables are referred to as star schemas. If the presentation area
is based on multidimensional database or online analytic processing (OLAP)
technology, then the data is stored in cubes. While the technology originally
wasn’t referred to as OLAP, many of the early decision support system vendors
built their systems around the cube concept, so today’s OLAP vendors
naturally are aligned with the dimensional approach to data warehousing.
Dimensional modeling is applicable to both relational and multidimensional
databases. Both have a common logical design with recognizable dimensions;
however, the physical implementation differs. Fortunately, most of the recommendations
in this book pertain, regardless of the database platform. While
the capabilities of OLAP technology are improving continuously, at the time of
this writing, most large data marts are still implemented on relational databases.
In addition, most OLAP cubes are sourced from or drill into relational
dimensional star schemas using a variation of aggregate navigation.

Facts

Additivity is crucial because data warehouse applications almost never
retrieve a single fact table row. Rather, they bring back hundreds, thousands,
or even millions of fact rows at a time, and the most useful thing to do with so
many rows is to add them up. In Figure 1.2, no matter what slice of the database
the user chooses, we can add up the quantities and dollars to a valid total.

We will see later in this book that there are facts that are semiadditive and still
others that are nonadditive. Semiadditive facts can be added only along some
of the dimensions, and nonadditive facts simply can’t be added at all. With
nonadditive facts we are forced to use counts or averages if we wish to summarize
the rows or are reduced to printing out the fact rows one at a time. This
would be a dull exercise in a fact table with a billion rows.
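
The three kinds of facts can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the sample rows, the column order, and the `total_by` helper are hypothetical, with account balance standing in as the semiadditive fact and unit price as the nonadditive one.

```python
from collections import defaultdict

# Hypothetical sales fact rows: (date, store, product, quantity, dollars, unit_price)
sales_facts = [
    ("2014-12-01", "S1", "P1", 2, 10.0, 5.0),
    ("2014-12-01", "S1", "P2", 1, 3.0, 3.0),
    ("2014-12-01", "S2", "P1", 4, 20.0, 5.0),
    ("2014-12-02", "S1", "P1", 3, 15.0, 5.0),
]

def total_by(rows, key_index, measure_index):
    """Sum one measure column, grouped by one dimension column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[measure_index]
    return dict(totals)

# Additive: dollars add up to the same grand total along any dimension.
dollars_by_store = total_by(sales_facts, 1, 4)
dollars_by_date = total_by(sales_facts, 0, 4)

# Nonadditive: summing unit prices is meaningless, so we summarize with an
# average derived from the additive facts (total dollars / total quantity).
average_price = sum(r[4] for r in sales_facts) / sum(r[3] for r in sales_facts)

# Hypothetical semiadditive fact: (date, account, balance)
balance_facts = [
    ("2014-12-01", "A1", 100.0),
    ("2014-12-01", "A2", 50.0),
    ("2014-12-02", "A1", 120.0),
    ("2014-12-02", "A2", 50.0),
]

# Valid: add balances across accounts within a single date.
balance_by_date = total_by(balance_facts, 0, 2)
# Not valid: adding balances across dates. Average over the dates instead.
average_daily_balance = sum(balance_by_date.values()) / len(balance_by_date)
```

Whichever dimension the dollars are grouped by, the group totals sum to the same figure. The balances only make sense summed across accounts for a single date, and across dates they have to be averaged.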

A row in a fact table corresponds to a measurement. A measurement is a row in a
fact table. All the measurements in a fact table must be at the same grain.

As we develop the examples in this book, we will see that all fact table grains
fall into one of three categories: transaction, periodic snapshot, and accumulating
snapshot. Transaction grain fact tables are among the most common.

Snowflake Schema

Dimension tables often represent hierarchical relationships in the business. In
our sample product dimension table, products roll up into brands and then
into categories. For each row in the product dimension, we store the brand and
category description associated with each product. We realize that the hierarchical
descriptive information is stored redundantly, but we do so in the spirit
of ease of use and query performance. We resist our natural urge to store only
the brand code in the product dimension and create a separate brand lookup
table. This would be called a snowflake. Dimension tables typically are highly
denormalized.
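
The difference between the two layouts can be sketched with Python's built-in `sqlite3` module. The table names, column names, and sample rows below are hypothetical, chosen only to mirror the product, brand, and category hierarchy described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Denormalized dimension: brand and category are repeated on every product row.
    CREATE TABLE product_dim (
        product_key INTEGER PRIMARY KEY,
        product_description TEXT,
        brand_description TEXT,
        category_description TEXT
    );
    INSERT INTO product_dim VALUES
        (1, 'Cola 12oz', 'FizzCo', 'Beverages'),
        (2, 'Cola 2L', 'FizzCo', 'Beverages'),
        (3, 'Corn Chips', 'Crunchy', 'Snacks');

    -- Snowflaked alternative: the product row keeps only a brand code,
    -- and the descriptions move to a separate brand lookup table.
    CREATE TABLE brand_lookup (
        brand_code INTEGER PRIMARY KEY,
        brand_description TEXT,
        category_description TEXT
    );
    CREATE TABLE product_snowflaked (
        product_key INTEGER PRIMARY KEY,
        product_description TEXT,
        brand_code INTEGER REFERENCES brand_lookup(brand_code)
    );
    INSERT INTO brand_lookup VALUES
        (10, 'FizzCo', 'Beverages'),
        (20, 'Crunchy', 'Snacks');
    INSERT INTO product_snowflaked VALUES
        (1, 'Cola 12oz', 10),
        (2, 'Cola 2L', 10),
        (3, 'Corn Chips', 20);
""")

# Denormalized: one table answers the question directly.
flat = conn.execute(
    "SELECT product_description, brand_description "
    "FROM product_dim ORDER BY product_key"
).fetchall()

# Snowflaked: the same answer needs an extra join to the lookup table.
snowflaked = conn.execute(
    "SELECT p.product_description, b.brand_description "
    "FROM product_snowflaked p "
    "JOIN brand_lookup b ON b.brand_code = p.brand_code "
    "ORDER BY p.product_key"
).fetchall()
```

Both queries return the same rows. The snowflaked version stores 'FizzCo' once instead of twice, at the cost of an extra join in every query that needs the brand.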

Star Schema

Now that we understand fact and dimension tables, let’s bring the two building
blocks together in a dimensional model. As illustrated in Figure 1.4, the
fact table consisting of numeric measurements is joined to a set of dimension
tables filled with descriptive attributes. This characteristic starlike structure is
often called a star join schema. This term dates back to the earliest days of relational
databases.
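
A star join can be sketched as follows, again using Python's built-in `sqlite3` module. This is not the schema from Figure 1.4: the tables, columns, and rows are hypothetical and kept as small as possible, with one fact table holding foreign keys to three dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes.
    CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY, month_name TEXT);
    CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, brand_description TEXT);
    CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, city TEXT);

    -- Fact table: numeric measurements plus one foreign key per dimension.
    CREATE TABLE sales_fact (
        date_key INTEGER REFERENCES date_dim(date_key),
        product_key INTEGER REFERENCES product_dim(product_key),
        store_key INTEGER REFERENCES store_dim(store_key),
        quantity_sold INTEGER,
        dollar_sales REAL
    );

    INSERT INTO date_dim VALUES (1, 'November'), (2, 'December');
    INSERT INTO product_dim VALUES (1, 'FizzCo'), (2, 'Crunchy');
    INSERT INTO store_dim VALUES (1, 'Austin'), (2, 'Boston');
    INSERT INTO sales_fact VALUES
        (1, 1, 1, 5, 10.0),
        (2, 1, 1, 2, 4.0),
        (2, 1, 2, 3, 6.0),
        (2, 2, 1, 1, 2.5),
        (2, 1, 1, 1, 2.0);
""")

# Constrain and group by dimension attributes, then sum the additive fact.
december_sales = conn.execute("""
    SELECT p.brand_description, s.city, SUM(f.dollar_sales)
    FROM sales_fact f
    JOIN date_dim d ON d.date_key = f.date_key
    JOIN product_dim p ON p.product_key = f.product_key
    JOIN store_dim s ON s.store_key = f.store_key
    WHERE d.month_name = 'December'
    GROUP BY p.brand_description, s.city
    ORDER BY p.brand_description, s.city
""").fetchall()
```

The query shape is typical of a star join: the filters and row labels come from the dimension tables, while the numbers being summed come from the fact table in the middle.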

Monday, 29 December 2014
