# Tuesday, June 30, 2009

ssisdataflowsample Whilst doing some design work today for a customer project I realised there are a set of principals I try and adhere to when creating SQL Server Integration Services packages. The list is no doubt incomplete but this is what I have so far.

Minimise IO

This is a general data processing principal. Usually disk and, to a lesser extent, network performance determine the overall processing speed. Reducing the amount of IO in a solution will therefore increase performance.

Solutions that consist of multiple read-process-write steps should be redesigned into a single read-process-process-process-write step.

Prefer Sequential IO to Random IO

Disks perform at their best when sequentially reading or writing large chunks of data. Random IO (and poor performance) manifests when procedural style programming occurs - signs to look out for are SQL statements modifying/returning only few rows but being executed repeatedly.

Watch out for hidden random IO - for example, if you are reading from one table and writing to another in a sequential manor then disk access will still be random if both tables are stored on the same spindles.

Avoid data flow components that pool data

Data flow components work on batches of data called buffers. In most instances buffers are modified in place and passed down stream. Some components, such as "Sort" cannot process data like this and effectively hang on to buffers until the entire data stream is in memory (or spooled to disk in low memory situations). This increased memory pressure will affect performance.

Sometimes SQL is the better solution

Whilst the SSIS data flow has lots of useful and flexible components, it is sometimes more efficient to perform the equivalent processing in a SQL batch. SQL Server is extremely good at sorting, grouping and data manipulation (insert, update, delete) so it is unlikely you will match it for raw performance on a single read-process-write step.

SSIS does not handle hierarchical data well

Integration Services is a tabular data processing system. Buffers are tabular and the components and associated APIs are tabular. Consequently it is difficult to process hierarchical data such as the contents of an XML document. There is an XML source component but it's output is a collection of tabular data streams that need to joined to make sense.

Execute SSIS close to where you wish to write your data

Reading data is relatively easy and possible from a wide variety of locations. Writing data, on the other hand, can involve complex locking and other issues which are difficult to optimise on a network protocol. In particular when writing data to a local SQL Server instance, SSIS automatically used the Shared Memory transport for direct inter-process transfer.

Don't mess with the data flow metadata at runtime

It's very difficult to do this anyway but worth mentioning that SSIS gets it's stellar performance from being able to setup a data flow at runtime safe in the knowledge that buffers are of a fixed format and component dependencies will not change.

The only time this is acceptable is when you need to build a custom data flow programmatically. You should use the SSIS API's and not attempt to write the package XML directly.

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Tuesday, June 30, 2009 7:23:10 PM (GMT Daylight Time, UTC+01:00)  #    Comments [0] Trackback
# Tuesday, June 23, 2009
Plaza Mayor World
Plaza Mayor World, uploaded to Flickr by James Snape.

Canon 400D, Canon EF-S 10-22mm f/3.5-4.5 lens - 1/125 sec, f/9, ISO 100

This is effectively a panorama shot wrapped around to look like a mini planet. I almost have it perfected but one of the things you must do is ensure there are no objects in the top or bottom of frame when you shoot them. The lamp post and tower just managed to fall in the zone when wrapped.

It was taken in Madrid two weeks ago when Claire and I celebrated our one year anniversary.

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Tuesday, June 23, 2009 3:39:48 PM (GMT Daylight Time, UTC+01:00)  #    Comments [2] Trackback
# Wednesday, June 10, 2009

There are a lot of new features coming in the next release of Visual Studio and related products. They are easily discoverable but I thought it would be helpful to bring as many videos and screen casts as I could find into a single list for easy consumption.

A word of warning though, I can’t be held responsible for the feelings brought on by the realisation that you can’t use any of this stuff in a supported way right now. Just try to relax and think how much better, faster and easier your job will be when it is finally released.

General

Microsoft Visual Studio Team System: A Lap Around VSTS 2010

In the spirit of an agile sprint, see how to use the next version of Visual Studio Team System to manage user stories and re-factor existing architecture. Learn how to diagnose real production problems, debug in-production virtual labs, capture test data to eliminate the no-repro bugs, transparently plan, monitor, and adapt software projects.

Team Foundation Server 2010: Cool New Features

Dive deep into the next version of Team Foundation Server (TFS), and learn how TFS has factored its learning's about usability, industrial scale, geographic distribution, manageability, and development process into the next version of the product. See a demonstration of build automation, policy checks, parallel development, new project planning and tracking features, such as agile planning, end to end traceability, reporting, and dashboards, administration and ops --all designed to improve transparency and velocity for teams from size 5 through 50,000.

A first look at Visual Studio Team System Web Access 2010

Visual Studio Team System Web Access has become an increasingly popular way for people to access Team Foundation Server. In this interview we meet Hakan Eskici who demonstrates some of the upcoming features his team is working on for Visual Studio Team System Web Access 2010.

Project Management

Agile Planning Templates in Visual Studio Team System 2010

Stephanie Saad shows us a quick demonstration of how Visual Studio Team System 2010 will enable teams to be more agile. In this demonstration she shows the new Agile planning worksheet for Excel which can be used to easily balance resources, manage your backlog, and generate ad hoc reports.

Enterprise Project Management with Visual Studio Team System 2010

Ameya Bhatawdekar, a program manager for Team Foundation Server, took a few minutes to take us through the end-to-end storyboards for how Team Foundation Server 2010 will integrate with Microsoft Project Server to enable true enterprise-wide collaboration. Note that this is not a demo of working software (yet), but it's the next best thing - a detailed storyboard walkthrough of mocked-up screenshots.

Requirements

Requirements Management and Traceability with Visual Studio Team System 2010

How can you ensure that a requirement has been sufficiently tested? How do you track the work that goes into a specific feature? How much work is left to do before a feature is completed, and how does that feature relate to bigger scenarios or user stories?
Siddharth Bhatia, a senior group program manager for Visual Studio Team System, takes us through an end-to-end example of how Visual Studio Team System 2010 will help an organization manage their requirements throughout the lifecycle of a software project.

Architecture

Architecture without Big Design Up Front

Microsoft Visual Studio Team System (VSTS), code-name "Rosario" Architecture Edition, introduces new UML designers, use cases, activity diagrams, sequence diagrams that can visualize existing code, layering to enforce dependency rules, and physical designers to visualize, analyze, and refactor your software. See how VSTS extends UML logical views into physical views of your code. Learn how to create relationships from these views to work items and project metrics, how to extend these designers, and how to programmatically transform models into patterns for other domains and disciplines.

"Bottom-up" Design with Visual Studio Team System 2010 Architect

Suhail Dutta, a program manager on the Visual Studio Team System Architect team, gives us a demonstration of the "bottom-up" design approach which will be possible with the Visual Studio Team System 2010 Architect product.
With "bottom-up" design, you can quickly reverse engineer an existing code base to construct models and examine relationships between pieces of code. Suhail also shows off some of the new UML designers coming in Visual Studio Team System 2010.

"Top-down" design with Visual Studio Team System 2010

"Top-down" design is an approach that the Visual Studio Architect team is enabling with their upcoming release, Visual Studio Team System 2010. In this "humanized screencast" we asked Mark Groves, senior program manager, to show us a demonstration of the new UML designers the team is building and how this can be applied to a "top-down" approach when building software.

Development

Agile Development with Microsoft Visual Studio

Visual Studio has built-in tool support for agile practices such as Scrum, XP, and others. The next version adds practices like test-driven development, continuous integration, and single product backlog. See how these can be applied at scale and across geographies.

Web Development and Deployment with Visual Studio 2010

Welcome back to another Visual Studio 2010 and .NET Framework 4.0 Week video. In this latest installment, we catch up with Vishal Joshi, Senior Program Manager on the Web Development Tools team.  In this video, Vishal shows us what is being done in Visual Studio 2010 around web development and deployment. Covered are topics like JQuery support, HTML code snippets, better Intellisense, and a whole slew of new features around web deployment.

An early look at Team Foundation Build 2010 with Jim Lamb

In addition to being one of the nicest guys I know, Jim Lamb also knows a thing or two about build automation. Jim is the program manager responsible for the Team Build capability of Team Foundation Server. Team Build was one of the biggest areas of improvement for Team Foundation Server 2008, but that hasn't stopped the team from doing even more landmark improvments in Team Foundation Server 2010.
Jim shows off how Team Build 2010 will take advantage of Windows Workflow, build agent pooling, distributed asynchronized builds, and two new types of build called "buddy builds" and gated check-ins.

Branching and Merging Visualization with Team Foundation Server 2010

Is your source control branching out of control? How much time have you wasted trying to discover which branches your code changes have been merged into? What are the code-level differences between your main, test, and production branches? Branch visualizations to the rescue!

Test

New Web Test Debugging Features in Visual Studio Team System 2010

In this video Ed Glas shows off new Web test debugging features in Visual Studio Team System 2010, including Search in playback, view recording log, jump to Web test, and Add Extraction Rule from Playback.

10-4 Episode 18: Functional UI Testing

In this episode of 10-4 we look at a new type of test coming in Visual Studio Team System 2010 known as the coded UI test. Coded UI tests can be created to automatically navigate through your application's UI, which in turn can be used to verify that the paths your users might take through your application are working properly. You can also add validation logic along the way to verify the properties of objects within the UI. Much like unit tests can quickly surface regressions on a method or function level, coded UI tests can bring the same level of rapid automated testing capabilities to the UI layer.

UI Automation Testing with Visual Studio 2010

Just playing with some of the new Testing features in Visual Studio 2010 and thought people might be interested in the new interface for Camano and a new feature for CodedUI Tests...pulling the automations strips directly out of TFS!

Lab Management coming to Visual Studio Team System 2010

Today at TechEd Barcelona, Jason Zander announced that Visual Studio Team System 2010 will feature a brand new Lab Management capability to help organizations raise the bar on software quality. Lab Management will integrate with the rest of the Visual Studio to help testers more easily test a variety of configurations in a virtual lab environment, and help developers more easily repro bugs by delivering snapshots of those virtualized environments after bugs are discovered. I had a chance to sit down with Ram Cherala and Vinod Malhotra to get an in-depth look at how this will work.

Microsoft Visual Studio Team System: Leveraging Virtualization to Improve Code Quality with Team Lab

Would you like to test fixes in a production-like environment before checking them in to source control? The Visual Studio Team System (code name "Rosario") release of Team Lab improves productivity and quality while reducing the cost of building and testing world class products. Learn how Team Lab provides a fast and easy way to create a test environment and tear it down, target specific test environments, and take snapshots of an environment for easy deployment.

Microsoft Visual Studio Team System: Software Diagnostics and Quality for Services

In this session we present processes and tools from the upcoming Visual Studio Team System code name "Rosario" release and Microsoft Research and show how we deliver on quality, scalability, and experience goals for the new class of applications that demand rich UI, service consumption, and frequent release.

Manual Testing with Visual Studio Team System 2010

Naysawn Naderi takes us through manual testing in Visual Studio Team System 2010. Naysawn shows off how the manual testing capabilities allow not only for better authoring and execution of manual tests, but can also be a tool to help automate portions of manual tests as well. Finally, Naysawn shows how to turn a manual test into a coded test which can then be fully automated.

Historical Debugger and Test Impact Analysis in Visual Studio Team System 2010

Are you tired of constantly setting breakpoints to hone in on a pesky bug? How would you like to be able to step "back in time" through your debugger? The Historical Debugger in Visual Studio Team System 2010 promises to revolutionize your debugging experience. Habib Heydarian takes us through a demonstration of just a few of its capabilities.
But wait... there's more! Habib also shows us the new Test Impact Analysis feature his team is working on. With Test Impact Analysis it's possible to determine which of your tests will be... well... impacted by the code changes you're making! Not only does this mean that your unit test suite can run more quickly, but it can also lead to better testing and fewer bugs in software projects.

Automated User Interface (UI) Testing with Microsoft Visual Studio Team System 2010

Come hear about the new Visual Studio Team System 2010 tools and APIs for helping test a broad range of UIs that can consist of Winforms, AJAX, and Windows Presentation Foundation. See how to use Team System 2010 to ensure UI and application quality.

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Wednesday, June 10, 2009 10:17:52 AM (GMT Daylight Time, UTC+01:00)  #    Comments [0] Trackback
# Friday, June 05, 2009

RJR Consulting design thumbnail Over the past few weeks I’ve been working on a web site for my cousin. This was a ground up redesign of his company web to replace the old SEO unfriendly Flash site.

Normally as the developer I only get someone else's completed design; I just do the implementation but I wanted to see if any of my recent photography and image editing skills were transferrable.

The site was first designed in Illustrator/Photoshop with a fair number of round trips to agree the final design. Implementation was done in Expression Web using HTML, CSS and some JQuery for interactivity. There are no <tables> anywhere on the site; it is a pure CSS layout. I also used Expression Web "Dynamic Web Templates" for the master page layout as the server where it is currently hosted does not support .NET.

Some things I've learned from doing this:

  • JQuery makes life easy and Glimmer makes it even easier
  • IE8 standards support is fantastic but way to many still use IE6/7 to use it
  • You need to have an idea of what is possible in HTML before letting loose in Photoshop
  • background-image is your friend

As it is all pretty static at the moment there are plans to move to a site that supports .NET so I can add some more features, in particular a blog and news feed. I would like to try this in ASP.MVC to get some experience there.

So go to RJR Consulting and have a look around (especially if you need telemarketing, telesales or similar services).

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Friday, June 05, 2009 7:01:19 PM (GMT Daylight Time, UTC+01:00)  #    Comments [0] Trackback
# Sunday, May 24, 2009
I'm not a morning bee
I'm not a morning bee, uploaded to Flickr by James Snape.

Canon 400D, Canon EF 100mm f/2.8 Macro lens - 1/2 sec, f/11, ISO 200

I took a couple of macro shots a few weeks ago but all the insect images went in the bin as they were out of focus or blurred. This bee was still in the process of warming up for the day so didn't move about much.

My only issue with the image is that the conversion to Jpeg hasn't gone that well - in particular it looks over sharpened and the colour of the catch light has gone quite blue compared to the original Photoshop image. I may end up reprocessing it to see if I can fix the flaws.

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Sunday, May 24, 2009 12:28:08 PM (GMT Daylight Time, UTC+01:00)  #    Comments [0] Trackback
# Saturday, April 25, 2009
Talbot Heath Fire
Talbot Heath Fire, uploaded to Flickr by James Snape.

Canon 400D, Canon EF 24-105mm f/4L IS lens - 1/250 sec, f/5.0, ISO 100

We had a rather large fire on the heath behind our house last week. It Bournemouth Echo article as it's the second time in a month there was a fire here. Unfortunately there is not much left now which is a shame because it's a rare bird breeding ground.

The Watcher, Plate 3I should have taken my long lens; this shot was about as close as I could get due to a police line around the fire.

I don't think they liked the fact I was taking photos though as it got my name and address in their notebook.

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Saturday, April 25, 2009 10:58:54 AM (GMT Daylight Time, UTC+01:00)  #    Comments [0] Trackback
# Tuesday, April 21, 2009

If you are a developer and tasked with making sure the corporate rollout of Internet Explorer 8 goes without issues, you will find the following links of use.

The main page of interest to you is the IE Developer Centre where you will find all the information and links collected together.

Also:

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Tuesday, April 21, 2009 3:11:37 PM (GMT Daylight Time, UTC+01:00)  #    Comments [0] Trackback
# Thursday, April 16, 2009

Every data warehouse needs a date dimension and at some point it needs to be populated. Most use some sort of a SQL script that loops though the dates and add rows to the destination table but this is pretty slow to execute. You might even try cross joining a year, month and day temporary tables to produce a set based solution but don’t forget to filter out the illegal days.

I prefer to fill my date tables by generating the correct stream of values from a SQL Server Integration Services script source component. This has a number of benefits:

  • It executes very quickly
  • The data can be bulk loaded
  • CultureInfo supplies the correct translations of day and month names
  • It is easy to add custom columns such as fiscal years and quarters

I haven’t wrapped this in a pre-compiled component as it is so easy to do in script from. Also, I haven’t got around to generalizing the fiscal date offsets for different companies so they usually have to be custom coded.

Script Component Type Dialog

First drop a “Script Component” onto your Data Flow.

Select “Source” as the Script Component Type and click OK.

Then double-click the newly added component to edit the properties.

Note that you need to add the correct output columns before adding the script or else it won’t compile.


Script Source Outputs

I’ve renamed the output here to “Dates” to help further down the Data Flow.

Click the “Add Column” button to add new columns as show here. Note that I’ve also changed the data type of each column to match my source table. It required casts in script but it’s easier than conversions in the data pipeline.

Finally go back to the script part of the dialog and click the “Edit Script” button to launch Visual Studio for Applications.

In the resulting window, add your code to generate the date stream to the CreateNewOutputRows() function.
The general form is of:

var output = this.DatesBuffer;  // Get the output buffer

while (/*loop though your dates*?)
{

output.AddRow();

// Set the various column values e.g.
output.CalendarYear = date.Year

// Increment the date
date = date.AddDays(1);
}

The full script is in the attached sample package where I’ve also added a script destination that does nothing with the data. Attach a data viewer to see what output is generated.

Date Data

From here you can manipulate the data, and pipe it to your dimension table from within the pipeline.

DateSourceSample.zip (27.08 KB)

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Thursday, April 16, 2009 7:52:34 PM (GMT Daylight Time, UTC+01:00)  #    Comments [0] Trackback
DateSourceSample.zip (27.08 KB)
# Tuesday, April 07, 2009

This is the second part in a series of data warehouse presentations I’ve been giving. This one concentrates on the physical design of the underlying SQL Server database, some information on the SQL Server Fast Track Data Warehouse and finally a one slide guide to tuning the storage engine performance. The tuning slide is only really a first step and I plan a more in depth session some time in the future.

Also I apologize, this deck is a little word heavy. I prefer more graphics in a presentation but there is a lot of specific guidance to list here.

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Tuesday, April 07, 2009 12:02:55 PM (GMT Daylight Time, UTC+01:00)  #    Comments [0] Trackback
# Friday, March 20, 2009

As a photographer I don’t really agree with this amendment which came into force on the 16th February (bold text mine):

A person commits an offence who—

  1. elicits or attempts to elicit information about an individual who is or has been—
    1. a member of Her Majesty’s forces,
    2. a member of any of the intelligence services, or
    3. a constable,
    which is of a kind likely to be useful to a person committing or preparing an act of terrorism, or
  2. publishes or communicates any such information.

A person guilty of an offence under this section is liable—

  1. on conviction on indictment, to imprisonment for a term not exceeding 10 years or to a fine, or to both;

Now what about Street View? I won’t publish this picture but this link should take you to a page on Street View.

Note that Google may edit the image after I’ve posted this but right now the left hand policeman is clearly visible and happens to be standing outside the residence of a well known politician.

Does this fall foul of the Terrorism act?

[Edit: that was quick, they have already removed the image but you can move up and down the street to still get a clear image of the other policeman.]

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Friday, March 20, 2009 1:24:16 PM (GMT Standard Time, UTC+00:00)  #    Comments [0] Trackback
# Tuesday, March 17, 2009

I recently gave a presentation at our community event on Dimensional Modeling. It covers the logical design and user model phase of a data warehouse project.

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Tuesday, March 17, 2009 6:17:24 PM (GMT Standard Time, UTC+00:00)  #    Comments [0] Trackback
# Tuesday, February 24, 2009

One of the big problems I have with some new customers is the knowledge gap between where they are and where they need to be in order to be successful. Given that people can read much faster than they can receive presented information it makes sense to have a reading list.

I’ve blogged previously some Analysis Services resources but new and better content is appearing all the time. In addition to that article here are my current reading recommendations for anyone planning a BI or Analysis Services project. Know this lot backwards and you should have a good head start.

Everyone

SQL Performance Tuning with Waits and Queues

Analysis Services Performance Guide

Identifying and Resolving MDX Query Performance Bottlenecks in SQL Server 2005 Analysis Services

The Data Loading Performance Guide

Report Server Catalog Best Practices

Reporting Services Performance Optimizations

New Data Warehouse Scalability Features in SQL Server 2008

Scaling Up Your Data Warehouse with SQL Server 2008

Architects and Developers

OLAP Design Best Practices for Analysis Services 2005

Best Practices for Data Warehousing with SQL Server 2008

Analysis Services Processing Best Practices

Top 10 SQL Server Integration Services Best Practices

Top 10 Best Practices for Building a Large Scale Relational Data Warehouse

Operations and Support

Storage Top 10 Best Practices

SQL Server Best Practices

IIS Performance Tuning

Resolving Common Connectivity Issues in SQL Server 2005 Analysis Services Connectivity Scenarios

In fact pretty much anything on the sqlcat website is pure gold.

This posting is provided "AS IS" with no warranties, and confers no rights.
posted on Tuesday, February 24, 2009 10:15:39 AM (GMT Standard Time, UTC+00:00)  #    Comments [0] Trackback