Back to List

Flattening the 7-layer Recursive Hierarchy Salad

Bob Charapata Bob Charapata  |  
Apr 02, 2019
 

Sometimes organizations must model a hierarchy with data, but they don’t know how deep it will be. Developers often create recursive hierarchy tables for transaction processing systems to solve this problem. Those tables have one column on the table that refers to the table's identity column to create a parent-child relationship. (Typical situations for this may be organization reporting structures or sales locations.) A recursive hierarchy table will look something like this (using table declaration for ease of use in this article):

DECLARE @RecursiveHierarchy TABLE
(
    [ID]            int,
    [HierarchyName] varchar(50),
    [SortID]        int,
    [ParentID]      int
);

In addition to the ID and the column that refers to the Parent ID, there’s a SortID column that allows ordering of the hierarchy. There's also a HierarchyName which will serve as the pivot column later.

Having a recursive hierarchy with an unknown depth works well for transaction processing. Yet it does not lend itself well to browsing the hierarchy in analytical models. Many analytical tools allow only the selection of named columns. This means that hierarchy levels are better displayed as columns for analytical purposes.

The data transformation should perform well, so use two stacked recursive Common Table Expressions (CTE) and pivot on the resulting data to flatten the hierarchy.

Using the table structure declared earlier, let’s assume the following sample data. The top level parent of the hierarchy has a ParentID of \ to identify that row as the root of the hierarchy.

-- R = Root
-- L = Layer
-- I = Item

INSERT @RecursiveHierarchy
(   [ID],
    [HierarchyName],
    [SortID],
    [ParentID]  )
VALUES
    (1, 'R1', 1, NULL),
    (2, 'R2', 2, NULL),
    (3, 'R3', 3, NULL),
    (4, 'R1 L1 I2', 2, 1),
    (5, 'R1 L1 I1', 1, 1),
    (6, 'R2 L1 I2', 2, 2),
    (7, 'R2 L1 I1', 1, 2),
    (8, 'R1 L2 I2', 2, 5),
    (9, 'R1 L2 I1', 1, 5),
    (10, 'R2 L2 I1', 1, 7),
    (11, 'R1 L3 I2', 2, 9),
    (12, 'R1 L3 I1', 1, 9),
    (13, 'R2 L3 I1', 1, 10),
    (14, 'R1 L4 I2', 2, 12),
    (15, 'R1 L4 I1', 1, 12),
    (16, 'R1 L5 I2', 2, 15),
    (17, 'R1 L5 I1', 1, 15),
    (18, 'R1 L6 I1', 1, 17),
    (19, 'R1 L7 I2', 2, 18),
    (20, 'R1 L7 I1', 1, 18);

First, determine the greatest depth of the sample data's hierarchy using the first recursive CTE. This CTE adds the depth of each member of the hierarchy. It sets the root member(s) of the hierarchy to depth 0 and adding 1 to every level on recursion.

WITH [HierarchyDepth]([ID],
                      [ParentID],
                      [Depth])
     AS (SELECT [rh].[ID],
                [rh].[ParentID],
                0
         FROM @RecursiveHierarchy AS [rh]
         WHERE [rh].[ParentID] IS NULL
         UNION ALL
         SELECT [rh].[ID],
                [rh].[ParentID],
                [dh].[Depth] + 1
         FROM @RecursiveHierarchy AS [rh]
         JOIN [HierarchyDepth] AS [dh]
         ON [dh].[ID] = [rh].[ParentID])
     SELECT MAX([hd].[Depth])
     FROM [HierarchyDepth] AS [hd];

For the sample data, the greatest depth is 7. This depth will be used later for the number of columns pivoted.

Once current depth of the hierarchy is determined, another recursive CTE is added below the original CTE to create a mapping for every record to all ancestors of that record. For example, record ID 20 will have 8 records because it has parent ID 18, which has parent ID 17, which has parent ID 15, and so on up to single ancestor ID 1. (One record for itself, and 7 records to each ancestor.) Also, this second CTE subtracts 1 to assign the correct depth for each record-ancestor mapping.

WITH [HierarchyDepth]([ID],
                      [ParentID],
                      [Depth])
     AS (SELECT [rh].[ID],
                [rh].[ParentID],
                0
         FROM @RecursiveHierarchy AS [rh]
         WHERE [rh].[ParentID] IS NULL
         UNION ALL
         SELECT [rh].[ID],
                [rh].[ParentID],
                [dh].[Depth] + 1
         FROM @RecursiveHierarchy AS [rh]
         JOIN [HierarchyDepth] AS [dh]
         ON [dh].[ID] = [rh].[ParentID]),
     [Ancestor]([ID],
                   [ParentID],
                   [Depth])
     AS (SELECT [dh].[ID],
                [dh].[ParentID],
                [dh].[Depth]
         FROM [HierarchyDepth] AS [dh]
         UNION ALL
         SELECT [a].[ID],
                [rh].[ParentID],
                [a].[Depth] - 1
         FROM [Ancestor] AS [a]
         JOIN @RecursiveHierarchy AS [rh]
         ON [rh].[ID] = [a].[ParentID])
     SELECT *
     FROM [Ancestor];

Finally, the HierarchyName column is added back to the two CTEs so there is a column to pivot on. The final select with a pivot to flatten the 7 layers of the hierarchy is joined to the original table to return the ID of each record, the hierarchy name, and the flattened hierarchy.

WITH [HierarchyDepth]([ID],
                      [ParentID],
                      [HierarchyName],
                      [Depth])
     AS (SELECT [rh].[ID],
                [rh].[ParentID],
                [rh].[HierarchyName],
                0
         FROM @RecursiveHierarchy AS [rh]
         WHERE [rh].[ParentID] IS NULL
         UNION ALL
         SELECT [rh].[ID],
                [rh].[ParentID],
                [rh].[HierarchyName],
                [dh].[Depth] + 1
         FROM @RecursiveHierarchy AS [rh]
         JOIN [HierarchyDepth] AS [dh]
         ON [dh].[ID] = [rh].[ParentID]),
     [Ancestor]([ID],
                   [HierarchyName],
                   [ParentID],
                   [Depth])
     AS (SELECT [dh].[ID],
                [dh].[HierarchyName],
                [dh].[ParentID],
                [dh].[Depth]
         FROM [HierarchyDepth] AS [dh]
         UNION ALL
         SELECT [a].[ID],
                [rh].[HierarchyName],
                [rh].[ParentID],
                [a].[Depth] - 1
         FROM [Ancestor] AS [a]
         JOIN @RecursiveHierarchy AS [rh]
         ON [rh].[ID] = [a].[ParentID])
     SELECT [pn].[ID],
            REPLICATE('- ', [hd].[Depth]) + [hd].[HierarchyName] AS [HierarchyName],
            [hd].[Depth],
            [pn].[0] AS [Root],
            [pn].[1] AS [Level1],
            [pn].[2] AS [Level2],
            [pn].[3] AS [Level3],
            [pn].[4] AS [Level4],
            [pn].[5] AS [Level5],
            [pn].[6] AS [Level6],
            [pn].[7] AS [Level7]
     FROM
     (
         SELECT [a].[ID],
                [a].[HierarchyName],
                [a].[Depth]
         FROM [Ancestor] AS [a]
     ) AS [hn] PIVOT(MAX([hn].[HierarchyName]) FOR [hn].[Depth] IN([0],
                                                                   [1],
                                                                   [2],
                                                                   [3],
                                                                   [4],
                                                                   [5],
                                                                   [6],
                                                                   [7])) AS [pn]
     JOIN [HierarchyDepth] AS [hd]
     ON [pn].[ID] = [hd].[ID]

Here is the final result set:

ID    HierarchyName             Depth  Root  Level1       Level2       Level3       Level4       Level5       Level6       Level7
----- ------------------------- ------ ----- ------------ ------------ ------------ ------------ ------------ ------------ ------------
1     R1                        0      R1    NULL         NULL         NULL         NULL         NULL         NULL         NULL
2     R2                        0      R2    NULL         NULL         NULL         NULL         NULL         NULL         NULL
3     R3                        0      R3    NULL         NULL         NULL         NULL         NULL         NULL         NULL
6     - R2 L1 I2                1      R2    R2 L1 I2     NULL         NULL         NULL         NULL         NULL         NULL
7     - R2 L1 I1                1      R2    R2 L1 I1     NULL         NULL         NULL         NULL         NULL         NULL
10    - - R2 L2 I1              2      R2    R2 L1 I1     R2 L2 I1     NULL         NULL         NULL         NULL         NULL
13    - - - R2 L3 I1            3      R2    R2 L1 I1     R2 L2 I1     R2 L3 I1     NULL         NULL         NULL         NULL
4     - R1 L1 I2                1      R1    R1 L1 I2     NULL         NULL         NULL         NULL         NULL         NULL
5     - R1 L1 I1                1      R1    R1 L1 I1     NULL         NULL         NULL         NULL         NULL         NULL
8     - - R1 L2 I2              2      R1    R1 L1 I1     R1 L2 I2     NULL         NULL         NULL         NULL         NULL
9     - - R1 L2 I1              2      R1    R1 L1 I1     R1 L2 I1     NULL         NULL         NULL         NULL         NULL
11    - - - R1 L3 I2            3      R1    R1 L1 I1     R1 L2 I1     R1 L3 I2     NULL         NULL         NULL         NULL
12    - - - R1 L3 I1            3      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     NULL         NULL         NULL         NULL
14    - - - - R1 L4 I2          4      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I2     NULL         NULL         NULL
15    - - - - R1 L4 I1          4      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     NULL         NULL         NULL
16    - - - - - R1 L5 I2        5      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I2     NULL         NULL
17    - - - - - R1 L5 I1        5      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     NULL         NULL
18    - - - - - - R1 L6 I1      6      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     R1 L6 I1     NULL
19    - - - - - - - R1 L7 I2    7      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     R1 L6 I1     R1 L7 I2
20    - - - - - - - R1 L7 I1    7      R1    R1 L1 I1     R1 L2 I1     R1 L3 I1     R1 L4 I1     R1 L5 I1     R1 L6 I1     R1 L7 I1

The dashes are added in the final statement to show depth of each level of the hierarchy. Typically, these wouldn't be included when extracting data for consumption by an analytical system. Notice in the above SQL statement that the data is pivoted on all 7 levels of the hierarchy. Also, the 7 newly pivoted columns are returned in the final select statement from the CTE. This can be adjusted based off the analyzed depth of the recursive hierarchy.

Finally, notice that the results are not displayed in proper hierarchical order. This is not a concern for extracted data that will be consumed by an analytical system. However, this may be useful for other purposes.

I have shown how to use recursive CTE's and a pivot structure to transform a recursive hierarchy into one that displays the hierarchy across columns. The query structure performs well because it does not loop through the data to transform the structure. The next time you find yourself extracting recursive hierarchy to analysis models, consider using this method.

Stay tuned for Part 2 of this blog that shows how we can display the results in the correct hierarchical order.

Data Analytics

 

Love our Blogs?

Sign up to get notified of new Skyline posts.

 


Related Content


Blog Article
10 KPIs Manufacturers Should Track for Operational Excellence
Paul FullerPaul Fuller  |  
Apr 18, 2019
How do you know if you’re truly improving quality and efficiency in your manufacturing operations? Do you know if your equipment is as effective as you think it is? Are your operating lines a bottleneck in getting orders delivered to your customers? How would you demonstrate that?  ...
Blog Article
Sorting Results in the Flattened 7-layer Recursive Hierarchy Salad
Bob CharapataBob Charapata  |  
Apr 16, 2019
In my previous article about Flattening a Recursive Hierarchy, I wrote about an approach that transforms existing recursive hierarchies into usable data constructs for analytics. This post builds on that article to show how to display the results in correct hierarchical order.After...
Blog Article
How to Use Power BI’s New AI Visual: Key Influencers
Marcus RadueMarcus Radue  |  
Mar 28, 2019
Microsoft has recently released a new Key Influencers visual in their February 2019 release of Power BI. This visual is part of Microsoft’s roadmap to continue to advance the Artificial Intelligence (AI) integration and features within Power BI. Microsoft has already introduced other AI...
Blog Article
Useful Tips for Power BI Desktop
Scott HietpasScott Hietpas  |  
Mar 19, 2019
There are several things within the Power BI desktop resources that I want to point out. One of the best things that happened in a few releases back is the help menu in the Desktop app. Within the help menu is access to Microsoft documentation and videos. Power BI’s blog speaks about this...
Blog Article
How to Retrieve Data from Excel Files and Other Data Sources in Power BI
Scott HietpasScott Hietpas  |  
Mar 05, 2019
Working with OneDrive for business is great for keeping copies of Power BI desktop reports. It can also be a good place for you to store any one-off data or backup versions of data sources in case the files break at some point.   Another reason to save Excel files in OneDrive is so Power...