Back to List

Flattening the 7-layer Recursive Hierarchy Salad

Bob Charapata Bob Charapata  |  
Apr 02, 2019
 

Sometimes organizations must model a hierarchy with data, but they don’t know how deep it will be. Developers often create recursive hierarchy tables for transaction processing systems to solve this problem. Those tables have one column on the table that refers to the table's identity column to create a parent-child relationship. (Typical situations for this may be organization reporting structures or sales locations.) A recursive hierarchy table will look something like this (using table declaration for ease of use in this article):

DECLARE @RecursiveHierarchy TABLE
(
    [ID]            int,
    [HierarchyName] varchar(50),
    [SortID]        int,
    [ParentID]      int
);

In addition to the ID and the column that refers to the Parent ID, there’s a SortID column that allows ordering of the hierarchy. There's also a HierarchyName which will serve as the pivot column later.

Having a recursive hierarchy with an unknown depth works well for transaction processing. Yet it does not lend itself well to browsing the hierarchy in analytical models. Many analytical tools allow only the selection of named columns. This means that hierarchy levels are better displayed as columns for analytical purposes.

The data transformation should perform well, so use two stacked recursive Common Table Expressions (CTE) and pivot on the resulting data to flatten the hierarchy.

Using the table structure declared earlier, let’s assume the following sample data. The top level parent of the hierarchy has a ParentID of \ to identify that row as the root of the hierarchy.

-- R = Root
-- L = Layer
-- I = Item

INSERT @RecursiveHierarchy
(   [ID],
    [HierarchyName],
    [SortID],
    [ParentID]  )
VALUES
    (1, 'R1', 1, NULL),
    (2, 'R2', 2, NULL),
    (3, 'R3', 3, NULL),
    (4, 'R1 L1 I2', 2, 1),
    (5, 'R1 L1 I1', 1, 1),
    (6, 'R2 L1 I2', 2, 2),
    (7, 'R2 L1 I1', 1, 2),
    (8, 'R1 L2 I2', 2, 5),
    (9, 'R1 L2 I1', 1, 5),
    (10, 'R2 L2 I1', 1, 7),
    (11, 'R1 L3 I2', 2, 9),
    (12, 'R1 L3 I1', 1, 9),
    (13, 'R2 L3 I1', 1, 10),
    (14, 'R1 L4 I2', 2, 12),
    (15, 'R1 L4 I1', 1, 12),
    (16, 'R1 L5 I2', 2, 15),
    (17, 'R1 L5 I1', 1, 15),
    (18, 'R1 L6 I1', 1, 17),
    (19, 'R1 L7 I2', 2, 18),
    (20, 'R1 L7 I1', 1, 18);

First, determine the greatest depth of the sample data's hierarchy using the first recursive CTE. This CTE adds the depth of each member of the hierarchy. It sets the root member(s) of the hierarchy to depth 0 and adding 1 to every level on recursion.

WITH [HierarchyDepth]([ID],
                      [ParentID],
                      [Depth])
     AS (SELECT [rh].[ID],
                [rh].[ParentID],
                0
         FROM @RecursiveHierarchy AS [rh]
         WHERE [rh].[ParentID] IS NULL
         UNION ALL
         SELECT [rh].[ID],
                [rh].[ParentID],
                [dh].[Depth] + 1
         FROM @RecursiveHierarchy AS [rh]
         JOIN [HierarchyDepth] AS [dh]
         ON [dh].[ID] = [rh].[ParentID])
     SELECT MAX([hd].[Depth])
     FROM [HierarchyDepth] AS [hd];

For the sample data, the greatest depth is 7. This depth will be used later for the number of columns pivoted.

Once current depth of the hierarchy is determined, another recursive CTE is added below the original CTE to create a mapping for every record to all ancestors of that record. For example, record ID 20 will have 8 records because it has parent ID 18, which has parent ID 17, which has parent ID 15, and so on up to single ancestor ID 1. (One record for itself, and 7 records to each ancestor.) Also, this second CTE subtracts 1 to assign the correct depth for each record-ancestor mapping.

WITH [HierarchyDepth]([ID],
                      [ParentID],
                      [Depth])
     AS (SELECT [rh].[ID],
                [rh].[ParentID],
                0
         FROM @RecursiveHierarchy AS [rh]
         WHERE [rh].[ParentID] IS NULL
         UNION ALL
         SELECT [rh].[ID],
                [rh].[ParentID],
                [dh].[Depth] + 1
         FROM @RecursiveHierarchy AS [rh]
         JOIN [HierarchyDepth] AS [dh]
         ON [dh].[ID] = [rh].[ParentID]),
     [Ancestor]([ID],
                   [ParentID],
                   [Depth])
     AS (SELECT [dh].[ID],
                [dh].[ParentID],
                [dh].[Depth]
         FROM [HierarchyDepth] AS [dh]
         UNION ALL
         SELECT [a].[ID],
                [rh].[ParentID],
                [a].[Depth] - 1
         FROM [Ancestor] AS [a]
         JOIN @RecursiveHierarchy AS [rh]
         ON [rh].[ID] = [a].[ParentID])
     SELECT *
     FROM [Ancestor];

Finally, the HierarchyName column is added back to the two CTEs so there is a column to pivot on. The final select with a pivot to flatten the 7 layers of the hierarchy is joined to the original table to return the ID of each record, the hierarchy name, and the flattened hierarchy.

WITH [HierarchyDepth]([ID],
                      [ParentID],
                      [HierarchyName],
                      [Depth])
     AS (SELECT [rh].[ID],
                [rh].[ParentID],
                [rh].[HierarchyName],
                0
         FROM @RecursiveHierarchy AS [rh]
         WHERE [rh].[ParentID] IS NULL
         UNION ALL
         SELECT [rh].[ID],
                [rh].[ParentID],
                [rh].[HierarchyName],
                [dh].[Depth] + 1
         FROM @RecursiveHierarchy AS [rh]
         JOIN [HierarchyDepth] AS [dh]
         ON [dh].[ID] = [rh].[ParentID]),
     [Ancestor]([ID],
                   [HierarchyName],
                   [ParentID],
                   [Depth])
     AS (SELECT [dh].[ID],
                [dh].[HierarchyName],
                [dh].[ParentID],
                [dh].[Depth]
         FROM [HierarchyDepth] AS [dh]
         UNION ALL
         SELECT [a].[ID],
                [rh].[HierarchyName],
                [rh].[ParentID],
                [a].[Depth] - 1
         FROM [Ancestor] AS [a]
         JOIN @RecursiveHierarchy AS [rh]
         ON [rh].[ID] = [a].[ParentID])
     SELECT [pn].[ID],
            REPLICATE('- ', [hd].[Depth]) + [hd].[HierarchyName] AS [HierarchyName],
            [hd].[Depth],
            [pn].[0] AS [Root],
            [pn].[1] AS [Level1],
            [pn].[2] AS [Level2],
            [pn].[3] AS [Level3],
            [pn].[4] AS [Level4],
            [pn].[5] AS [Level5],
            [pn].[6] AS [Level6],
            [pn].[7] AS [Level7]
     FROM
     (
         SELECT [a].[ID],
                [a].[HierarchyName],
                [a].[Depth]
         FROM [Ancestor] AS [a]
     ) AS [hn] PIVOT(MAX([hn].[HierarchyName]) FOR [hn].[Depth] IN([0],
                                                                   [1],
                                                                   [2],
                                                                   [3],
                                                                   [4],
                                                                   [5],
                                                                   [6],
                                                                   [7])) AS [pn]
     JOIN [HierarchyDepth] AS [hd]
     ON [pn].[ID] = [hd].[ID]

Here is the final result set:

ID    HierarchyName             Depth  Root  Level1       Level2       Level3       Level4       Level5       Level6       Level7
----- ------------------------- ------ ----- ------------ ------------ ------------ ------------ ------------ ------------ ------------
1     R1                        0      R1    NULL         NULL         NULL         NULL         NULL         NULL         NULL
2     R2                        0      R2    NULL         NULL         NULL         NULL         NULL         NULL         NULL
3     R3                        0      R3    NULL         NULL         NULL         NULL         NULL         NULL         NULL
6     - R2 L1 I2                1      R2    R2 L1 I2     NULL         NULL         NULL         NULL         NULL         NULL
7     - R2 L1 I1                1      R2    R2 L1 I1     NULL         NULL         NULL         NULL         NULL         NULL
10    - - R2 L2 I1              2      R2    R2 L1 I1     R2 L2 I1     NULL         NULL         NULL         NULL         NULL
13    - - - R2 L3 I1            3      R2    R2 L1 I1     R2 L2 I1     R2 L3 I1     NULL         NULL         NULL         NULL
4     - R1 L1 I2                1      R1    R1 L1 I2     NULL         NULL         NULL         NULL         NULL         NULL
5     - R1 L1 I1                1      R1    R1 L1 I1     NULL         NULL         NULL         NULL         NULL         NULL
8     - - R1 L2 I2              2      R1    R1 L1 I1     R1 L2 I2     NULL         NULL         NULL         NULL         NULL
9     - - R1 L2 I1              2      R1    R1 L1 I1     R1 L2 I1     NULL         NULL         NULL         NULL         NULL
11    - - - R1 L3 I2            3      R1    R1 L1 I1     R1 L2 I1     R1 L3 I2     NULL         NULL         NULL         NULL
12    - - - R1 L3 I1            3      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     NULL         NULL         NULL         NULL
14    - - - - R1 L4 I2          4      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I2     NULL         NULL         NULL
15    - - - - R1 L4 I1          4      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     NULL         NULL         NULL
16    - - - - - R1 L5 I2        5      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I2     NULL         NULL
17    - - - - - R1 L5 I1        5      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     NULL         NULL
18    - - - - - - R1 L6 I1      6      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     R1 L6 I1     NULL
19    - - - - - - - R1 L7 I2    7      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     R1 L6 I1     R1 L7 I2
20    - - - - - - - R1 L7 I1    7      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     R1 L6 I1     R1 L7 I1

The dashes are added in the final statement to show depth of each level of the hierarchy. Typically, these wouldn't be included when extracting data for consumption by an analytical system. Notice in the above SQL statement that the data is pivoted on all 7 levels of the hierarchy. Also, the 7 newly pivoted columns are returned in the final select statement from the CTE. This can be adjusted based off the analyzed depth of the recursive hierarchy.

Finally, notice that the results are not displayed in proper hierarchical order. This is not a concern for extracted data that will be consumed by an analytical system. However, this may be useful for other purposes.

I have shown how to use recursive CTE's and a pivot structure to transform a recursive hierarchy into one that displays the hierarchy across columns. The query structure performs well because it does not loop through the data to transform the structure. The next time you find yourself extracting recursive hierarchy to analysis models, consider using this method.

Stay tuned for Part 2 of this blog that shows how we can display the results in the correct hierarchical order.

Data Analytics

 

Love our Blogs?

Sign up to get notified of new Skyline posts.

 


Related Content


Blog Article
Data Lake vs Data Warehouse: Pros and Cons
Scott HietpasScott Hietpas  |  
Jun 18, 2019
In this blog series, Scott Hietpas, a principal consultant with Skyline Technologies’ data team, responds to some common questions on data warehouses and data lakes. For a full overview on this topic, check out the original Data Lake vs Data Warehouse webinar.   There's a lot of...
Blog Article
Machine Monitoring IoT Solution with Azure Services and Power BI
Eric SaltzmannEric Saltzmann  |  
Jun 11, 2019
We often hear organizations ask how they can drive more insights out of their connected devices. Though the Internet of Things (IoT) has been a buzzword for the last few years, many organizations are still struggling through the headache of implementing an IoT pilot or solution. Most of the...
Blog Article
10 KPIs Manufacturers Should Track for Operational Excellence
Paul FullerPaul Fuller  |  
Apr 18, 2019
How do you know if you’re truly improving quality and efficiency in your manufacturing operations? Do you know if your equipment is as effective as you think it is? Are your operating lines a bottleneck in getting orders delivered to your customers? How would you demonstrate that?  ...
Blog Article
Sorting Results in the Flattened 7-layer Recursive Hierarchy Salad
Bob CharapataBob Charapata  |  
Apr 16, 2019
In my previous article about Flattening a Recursive Hierarchy, I wrote about an approach that transforms existing recursive hierarchies into usable data constructs for analytics. This post builds on that article to show how to display the results in correct hierarchical order.After...
Blog Article
How to Use Power BI’s New AI Visual: Key Influencers
Marcus RadueMarcus Radue  |  
Mar 28, 2019
Microsoft has recently released a new Key Influencers visual in their February 2019 release of Power BI. This visual is part of Microsoft’s roadmap to continue to advance the Artificial Intelligence (AI) integration and features within Power BI. Microsoft has already introduced other AI...