What are fact tables? | Why work with fact tables? | Different types of fact tables
In this article, Notitia’s Adelaide-based Analytics Consultant, Guilherme Matte, runs us through the ins and outs of fact tables. He explains what fact tables are, the different types of fact tables and why we work with them.
First let’s start with some definitions.
A data warehouse is a centralised repository that collects, integrates, and stores large volumes of structured data from various sources within an organisation.
An integral component of a data warehouse are fact tables.
Fact tables are very important to businesses (and the people in their business) who use, analyse and report on key performance indicators (KPIs), sales metrics, customer behaviour, and other quantifiable aspects that help to improve operations. By centralising and organising this data, fact tables provide a foundation for extracting actionable insights.
Why fact tables?
A fact table, in a data warehouse, is a central table that stores quantitative data, typically numeric measures. It forms the core of analytical processes by linking with dimension tables to provide context for comprehensive business intelligence.
Working with fact tables is crucial for anyone who uses data analysis or business intelligence to make informed decisions, optimise business processes and plan strategically in an organisation.
Fact tables play a central role in organising and storing quantitative data and providing a foundation for analytical processes.
Let’s explore how data warehousing and fact tables work together to collect, integrate and store your data - enabling the organisation and analysis of complex datasets.
Fact tables and dimensional models
In data warehousing, a fact table is one of the key concepts in a dimensional model.
It is used to store quantitative data for analysis and often serves as the focal point of a data warehouse.
Fact tables typically contain numeric measures and are used in conjunction with dimension tables, which contain descriptive attributes to provide context to the data. Whenever developing a new solution for a client, it is essential to understand the data and correctly model the fact tables.
This is crucial for enabling further analytics within BI (Business Intelligence) tools.
1. Transactional fact tables: These tables record data related to specific events or transactions. Each row in a transactional fact table represents an individual event at a point in time.
For instance, in a retail context, each row could represent a sale, capturing data such as sale amount, date, and the IDs of related dimensions such as product, customer, and store.
Aggregating data in a transactional fact table is relatively straightforward. For instance, to determine the total sales amount, one can easily sum up the values in the Sales Amount column. This approach directly yields the correct total, which can then be seamlessly integrated into data visualisation tools for analysis.
2. Periodic snapshot fact tables: These tables capture data at regular intervals, summarising activities over a specified period.
They are useful for tracking data that accumulates over time, such as monthly sales totals or quarterly inventory levels. Each row represents a summary of the data for a particular period.
Filtering "Store 01" note how each row repeats itself for each quarter per "Product ID". In the periodic snapshot, we can estimate the table growth based on its granularity and periodicity, in this case, each "Product ID" will generate a new row per quarter for each "Store".
Aggregating data in a periodic snapshot fact table requires a different approach compared to transactional fact tables. In periodic snapshots, each row typically represents aggregated data for a set time frame, such as a week, month, and quarter.
To analyse trends or perform comparisons over time, one can examine the data within these individual periods. However, caution is needed when summing values across multiple rows, as this could lead to double-counting or misinterpretation, especially with non-cumulative metrics.
The focus with periodic snapshots is often on understanding patterns and changes within each discrete period, rather than aggregating across them. In the above example would make no sense to sum inventory quantities across periods.
3. Accumulating snapshot fact tables: These tables are used to show the activity of a process that has a clear beginning and end, such as an order fulfilment or manufacturing process.
Rows in an accumulating snapshot fact table are updated as the process moves through different stages. For example, an order might move from placed, to shipped, to delivered, and the corresponding row in the table would be updated at each stage.
Each fact table variant serves distinct analysis and reporting purposes
Transactional tables excel in providing granular, event-specific insights. Periodic snapshot tables are ideal for analysing trends over defined intervals while accumulating snapshot tables are adept at monitoring the stages of a process from start to finish. Often, in practice, these types of fact tables are employed together for the same business process. This allows them to complement each other, yielding more comprehensive and effective outcomes and enabling more in-depth analytics.
Guilherme Matte, Notitia's Adelaide-Based Data Analytics Consultant
Gui says it's the thrill of a challenge, new technologies and working with clients that makes what he does so rewarding.
With a degree in Mechanical Engineering, along with a Master of Business Administration and Master of Business Information Systems, he brings a wealth of experience to the team and passion to every project.
Read more about Gui here, or book in a chat with him directly via Notitia's team page.