sql scd type2 dailybitalks.com

How to Write SQL for Slowly Changing Dimensions (SCD Type 2)

When building data warehouses, you’ll often face a classic challenge: how to track changes in dimension data over time. For example, what if a customer moves to a new city? Should you overwrite their old address or keep the historical record?

That’s where Slowly Changing Dimensions (SCD) come in. In particular, SCD Type 2 is one of the most commonly used methods because it allows analysts to maintain historical accuracy while still tracking current data.

In this article, we’ll explain what SCD Type 2 is, why it’s important, and provide step-by-step SQL examples you can adapt to your own BI projects.


What is SCD Type 2?

A Slowly Changing Dimension (SCD) is a technique used in data warehousing to handle changes in dimension attributes over time.

  • Type 1: Overwrites old data (no history kept).
  • Type 2: Keeps history by creating new records when changes occur.
  • Type 3: Keeps limited history using extra columns (e.g., “previous city”).

👉 SCD Type 2 is the go-to method when historical tracking is required, such as for compliance, customer analytics, or trend analysis.


Example Scenario

Imagine you have a Customer Dimension table:

CustomerIDCustomerNameCityStartDateEndDateIsCurrent
101Alice BrownNew York2021-01-012022-05-150
101Alice BrownChicago2022-05-16NULL1
  • Alice moved from New York to Chicago in May 2022.
  • Instead of overwriting the city, SCD Type 2 keeps both versions, with dates showing when each record was valid.

Key Columns for SCD Type 2

To implement SCD Type 2 in SQL, you typically add these fields to your dimension table:

  • StartDate – when the record became effective.
  • EndDate – when the record stopped being valid (NULL if still current).
  • IsCurrent (flag) – 1 for current record, 0 for historical.

SQL for SCD Type 2

Step 1: Identify New and Changed Records

You’ll usually compare the staging table (new data load) with the dimension table (existing data).

SELECT s.CustomerID, s.CustomerName, s.City
FROM Staging_Customer s
LEFT JOIN Dim_Customer d 
    ON s.CustomerID = d.CustomerID
   AND d.IsCurrent = 1
WHERE d.CustomerID IS NULL 
   OR s.City <> d.City;

This query finds new customers or existing customers whose city has changed.


Step 2: Expire Old Records

When a change is detected, mark the old record as expired.

UPDATE Dim_Customer
SET EndDate = GETDATE(),
    IsCurrent = 0
FROM Dim_Customer d
JOIN Staging_Customer s
  ON d.CustomerID = s.CustomerID
WHERE d.IsCurrent = 1
  AND s.City <> d.City;

Step 3: Insert New Records

Insert the updated record with a new StartDate and set it as current.

INSERT INTO Dim_Customer (CustomerID, CustomerName, City, StartDate, EndDate, IsCurrent)
SELECT 
    s.CustomerID,
    s.CustomerName,
    s.City,
    GETDATE() AS StartDate,
    NULL AS EndDate,
    1 AS IsCurrent
FROM Staging_Customer s
LEFT JOIN Dim_Customer d
  ON s.CustomerID = d.CustomerID
WHERE d.CustomerID IS NULL
   OR s.City <> d.City;

Best Practices for Analysts & BI Teams

  • Use surrogate keys (e.g., DimCustomerKey) instead of relying only on business keys.
  • Index on CustomerID + IsCurrent for faster lookups.
  • Document business rules (e.g., which attributes trigger history tracking).
  • Validate in BI dashboards by checking record counts and effective dates.

Why SCD Type 2 Matters for BI

For analysts, SCD Type 2 ensures:

  • Historical accuracy – you can ask, “How many customers lived in New York in 2021?”
  • Reliable trend analysis – track changes in attributes over time.
  • Compliance support – keep audit trails for regulated industries.

Without SCD Type 2, reports could show misleading results because past data would reflect current attributes, not historical ones.


Conclusion

SCD Type 2 is the backbone of historical reporting in BI. By adding effective dates and handling changes through SQL, you preserve data history while still providing accurate current-state reporting.

With the SQL examples above, you can confidently implement SCD Type 2 logic in your own data warehouse, ensuring your analytics remain both accurate and trustworthy.


Discover more from Daily BI Talks

Subscribe to get the latest posts sent to your email.