文档库

最新最全的文档下载
当前位置:文档库 > DT133数据仓库设计 Factless Fact Tables Simplification

DT133数据仓库设计 Factless Fact Tables Simplification

DT133数据仓库设计 Factless Fact Tables Simplification

http://www.wendangku.net/doc/06876d47be1e650e52ea9989.html Number 133, April 5, 2011 Design Tip #133 Factless Fact Tables for Simplification

By Bob Becker

It’s been awhile since we talked about factless fact tables in Design Tip #50. You may recall that a factless fact table is “a fact table that has no facts but captures the many-to-many relationship between dimension keys.”

We’ve previously discussed factless fact tables to represent events or coverage information. An event-based factless fact table is student attendance information; the grain of the fact table is one row per student each day. A typical coverage factless fact table in retail includes one row for every item being promoted during the time it is promoted; it is used to help answer the “what didn’t happen?” question identifying items being promoted but not sold.

Factless fact tables can simplify the overall design. Think of a property and casualty insurance company providing automobile coverages. It’s reasonable to create a transaction fact table that captures the written premiums resulting from a new sale or change to an existing policy. Likewise, it makes perfect sense to implement a monthly snapshot fact table to capture the earned premium associated with every policy by customer, by named insured, by household, by vehicle, by driver and so on.

But real life complications jump up and the design seems to get horribly complex in an effort to capture the relationships between dimensions and the changes in these relationships over time. For example, a single driver can relate to multiple vehicles, policies and households. Of course, in a similar manner there can be multiple drivers on a single vehicle, single policy or single household.

This complexity can cause the design team significant headaches. The next thing they know the design is littered with bridge tables. Even then, many of the design challenges are not well suited for a bridge table solution as they require three, four or more dimensions. In the end, the designs get too complex, too hard to understand, and provide poor query performance.

The key to avoiding this situation is recognizing there are multiple business processes at play and designing a solution to include additional fact tables. The problem is many design teams fail to understand there are multiple business processes involved as they cannot visualize the “facts” that would result, forgetting about the factless fact table. Tracking drivers associated to vehicles is not the same business process as recognizing the earned premium related to a policy each month. Recognizing multiple business processes will result in a simplified design for the core transaction and snapshot fact tables surrounded by several factless fact tables that help track the relationship between other dimension tables.

To finish our example, a property and casualty design might include factless fact tables to support: ?Household involved parties – One row per household and insured party with begin and end effective dates.

?Policy involved parties – One row per policy, household, insured item, and insured party with begin and end effective dates.

?Vehicle involved parties – One row per policy, driver and vehicle with begin and end effective dates.