# Saturday, January 15, 2005
« Page Rank | Main | Nerd Score »

In the role I currently do for Exony, it’s a rare day when I get to do some good old fashioned problem solving in code. Fortunately, today was one of those days. The problem I’ve been working at is how to efficiently model user and groups in SQL where a user can be a member of multiple groups and groups can contain both users and other groups. I needed to get a list of all groups a user is either directly or indirectly a member of. In procedural languages this problem is easy with recursion. In SQL it’s a real pain because you need to use a set based solution for optimum performance – cursors are out.

I started with a semi-normalised schema modelling the users, groups and group memberships as show below:

UserGroupMember1

It’s de-normalised because as far as security goes I want to treat users and groups the same so I’m storing both types in the Users table, differentiated by UserType. Unfortunately, the only way to navigate this structure is with procedural code – not good, time for a different approach.

If you draw a sample set of users and groups on a piece of paper with lines representing membership it looks like a forest of trees so I started wondering if I should model the members table in a similar manor. There are a number of well known idioms for storing trees in SQL. The most widely known is Joe Celko’s nested sets model but I prefer a path model because inserts and sub-tree moves are easier (If you fancy finding out how little you know about SQL and why the adjacency list solution is completely wrong then just ask Joe the question in news://microsoft.public.sqlserver.programming). I added a new table for this because the normalised members table is great for storing the essential data and I believed the tree should be a pre-calculated optimisation.

Unfortunately, as soon as I coded this up I ran into a problem – lack of brain power. One of the goals I set was I must be able to recreate the tree from the members table. I’m sure it’s possible but I spent far too much time staring at the ceiling trying to come up with an answer. One thing did stick though – why not maintain a  pre-calculated list of all parent-child relationships in a format similar to the existing members table. As the data is the same as the existing members table, I just added a flag to indicate the row is calculated as shown below:

UserGroupMember2

So the trick now is to fill this table with every possible combination of user and group, all the way up each hierarchy. The SQL actually took a couple of goes until the light went on and I realised I needed to be selecting not parent-child relationships (I already have them in the table as non-calculated rows) but grandparent-child relationships. A single pass will not get all the rows required – you need to do it until no more rows are added. Time for some code:

declare @rowcount int
set @rowcount = 1

while @rowcount > 0
begin
    insert into dbo.Members (ParentID, ChildID, Calculated)
    select distinct p.UserID, c.UserID, 1
    from dbo.Members pm
    join dbo.Users p ON p.UserID = pm.ParentID
    join dbo.Members cm ON cm.ParentID = pm.ChildID
    join dbo.Users c ON c.UserID = cm.ChildID
    where not exists (
        select *
        from dbo.Members
        where ParentID = p.UserID
        and ChildID = c.UserID
    )

    select @rowcount = @@rowcount
end

From here, selecting the complete set of groups any user is a member of is easy:

declare @userid uniqueidentifier

select @userid = UserID
from dbo.Users
where Name = 'Administrator'

select ParentID as UserID
from dbo.Members
where ChildID = @userid
union
select @userid

Just add an index to Members.ChildID and everything is ready to go. All I need to do now is implement the rest of the security system…

[Edit: Because I wrote this late last night after a couple of glasses of wine I had a panic this morning thinking there was a flaw in my thinking so I coded up a test script. It turns out that everything works, but you have to watch for loops in the Members table.]

by This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Saturday, January 15, 2005 12:44:14 AM (GMT Standard Time, UTC+00:00)  #    Comments [1] Trackback
Related posts:
Do you have great business intelligence skills?
Integration Services Design Principals
Physical Data Warehouse Design
Analysis Services Essential Reading
When should you do an incremental extract?
Post TechReady and Popfly Invites
Friday, July 15, 2005 12:34:01 AM (GMT Daylight Time, UTC+01:00)
Oops... forgot to make the UserID the right field...

select distinct pm.ParentID, cm.ChildID, 1
from dbo.Members pm
join dbo.Members cm ON cm.ParentID = pm.ChildID
where not exists (
select *
from dbo.Members em
where em.ParentID = pm.ParentID
and em.ChildID = cm.ChildID
)

Sorry.
Comments are closed.