It's been a while since I posted last due to our current work schedule. It's hard work preparing a new product version when that release is due to go to a customer before it's GA'd.
Anyway, in this post about "Uniqueidentifiers as primary key columns" I talked about some of the benefits to ditching the traditional IDENTITY columns. I did point out that their use in data warehouse applications is questionable as key sizes become important when dealing with huge amounts of data. Herein lies the problem - we have an application that uses uniqueidentifiers wherever possible but dimension tables still use integer keys. No problem so far but we have recently tried to implement parent-child dimensions within Analysis Services and our folder structure is using uniqueidentifiers as keys.
As you may know, Analysis Server uses a simple arrangement for parent-child dimensions where each member has a key column and a parent key column as per the adjacency list model. To implement this, lets assume two tables Office and Folder where Office is a dimension and Folder is a hierarchy into which we place each office. The two tables look like:
create table Folder (
folder_id int not null primary key,
parent_id int not null,
name nvarchar(50) not null
)
create table Office (
office_id int not null primary key,
folder_id int not null,
name nvarchar(50) not null
)
There is a foreign key between folder_id in the two tables to tie them together. Note, I've used integer primary keys here so creating a parent-child dimension is easy as follows:
create view OfficeTree
as
select folder_id + (select max(office_id) from Office) as item_id,
parent_id + (select max(office_id) from Office) as parent_id,
name
from Folder
union all
select office_id,
folder_id + (select max(office_id) from Office),
name
from Office
Note, that the Folder keys must be offset to ensure they don't clash with the Office keys. Now if we try and change the folder_id and parent_id fields in Folder to uniqueidentifiers we end up with a problem - the data types are not compatible when creating the view OfficeTree and you get the message "Operand type clash: uniqueidentifier is incompatible with int".
The solution to this problem is of course to either go back to using ints somehow. I was about ready to remove the uniqueidentifiers when Koan Bremner came up with the following solution. The trick is to realize that the references between the two tables are only required when processing the dimension and Analysis Server doesn't care if the IDs change between cycles just as long as the keys joined to fact tables remain constant. So we still need to add an integer key column to the Folder table as below:
create table Folder (
folder_id uniqueidentifier not null primary key,
parent_id uniqueidentifier not null,
alt_id int null,
name nvarchar(50) not null
)
We allow the column to be null because the keys are only required before we process the dimension and allowing NULLs ensures easy inserts in a replicated environment. To update the column before processing requires a little magic (again from Koan).
declare @counter int
select @counter = isnull(max(alt_id), 0)
from Folder
update Folder
set @counter = alt_id = @counter + 1 -- Ed: Typo fixed (see comments).
where alt_id is null
option (maxdop 1)
The maxdop option ensures a sequential update to the column which forces an incrementing value. Now we can rewrite the view to use the new field. Note that we have to join back to Folder in order to resolve the parent_id field.
create view OfficeTree
as
select f1.alt_id + (select max(office_id) from Office) as item_id,
f2.alt_id + (select max(office_id) from Office) as parent_id,
f1.name
from Folder f1
left join Folder f2 on f2.folder_id = f1.parent_id
union all
select o1.office_id,
f2.alt_id + (select max(office_id) from Office),
o1.name
from Office o1
left join Folder f2 on f2.folder_id = o1.folder_id