Tuesday, October 17, 2006

Failure - Data Source Does not Exist..! - Update

I got asked to post the demo project I had mentioned in the original post a little while ago but unfortunately I had lost it. Luckily I was able to pull enough from memory to build it again and make it available for you pleasure. Whilst not embedded with premier coding standards it certainly provides some direction on how this kind of thing is achievable.

The project is here, any questions then mail me at the usual address.

Apologies for the delay :)

Sunday, October 15, 2006

This Week I am Mostly Listening To...



Morning Runner - Wilderness is Paradise Now

Thursday, October 05, 2006

A Note From The Editor....

Just a little clarification here. I am not a professional Dolphin trainer nor am I an amateur one either although I did once swim with a dolphin in the Florida Keys and found it to be a quite unfulfilling experience as the Dolphin was in a pool that I would consider to be the Dolphin equivalent of one of those Japanese capsule hotels.

The freelance dolphin trainer title comes from a time in my youth when life consisted of doing the things from this song below.

Thank you for your attention in this matter.

Editor out..!

Next Gen Blogging

I really wish I could blog more but I do find that the things I want to blog about do tend take a large chunk of time so when things come along that make blogging a bit more streamlined, I' first in the queue. Firstly the blogging support in Word 2007 is pretty good but I found it a little restrictive on occasion so now I've switched to Windows Live Writer.

I'm not going to write anything about it but point you in the direction of this, where you can get it, and the video below. Direct from YouTube and embedded with a couple of clicks from Live Writer.

Enjoy....

Wednesday, October 04, 2006

Simple SSIS Logging - Part 3, Bad, Bad Rows..!

It’s all well and good knowing that there has been a failure in your package which has lead to a bleary eyed support technician getting phoned at 3 in the morning but sometimes it’s not really necessary to get Barry (we’ll call him Barry for the purposes of this post) out of bed. Your package may be bullet proof and work perfectly but the nature of data is that it will change and can quite often come back and bite you on the backside.

Take for example early arriving facts. When you start integrating a number of systems it is possible for things to start happening that don’t conform to the world according to the system owner. Take the example of an in-house order system operating completely separately to the companies CRM system. During the analysis phase of the data warehousing project it was determined that CRM was the source of Customer ID therefore that’s where customers get created and that’s where customers get maintained. But a new customer phones up and makes an order before the customers details are completed in the CRM mechanism due to one thing or another. Your data warehouse is quite likely to be collecting that fact row before the relevant CRM data has been correctly populated and passed on to the customer dimension. So what happens in your SSIS package when looking up the Customer key for that dimension, well not a hell of a lot.

To be fair there are numerous techniques for dealing with early arriving facts. Some involve the population of the Customer ID and retrieval of the key so that when the correct dimension member arrives in the dimension table, its attributes are populated automatically. The problem is that things aren’t always as clear cut as that. You have managed to capture the early arriving fact data and correctly allocate it to members in the Customer dimension but when board reports run that specify a particular customer type the data is incorrect, the numbers don’t roll up and for 24 hours you could be running around like a headless chicken trying to figure out where the missing data has gone, oh that’s until the next day when the customer dimension member is correctly updated and things mysteriously fix themselves .

So where does logging fit into all of this? Well knowing that we have an early arriving fact would be a good idea. So we have to log the early arriving fact and let the data pass through to allow it into the fact table.

This looks quite simple but when you’re trying to handle a number of dimensions and any of these could have early arriving facts then you can get have a lot of data destinations. In addition to all this there is also the fact that each package could have a different error table at each dimension lookup making logging, which should really be centralised and in a defined structure, a little reactive and difficult to deal with. Dealing with this may not be as difficult as it seems though and for the first time I’m going to mention a customer SSIS task that has been written with this in mind. A colleague of mine has written a component for just this type of situation. Whilst I would have liked to have completed all 3 parts of this Simple SSIS Logging piece without referring to any external code, this really does make life a lot easier. The full source code is available on the site and I recommend its download.

The Xmilfy component (to give it its officially christened name) will turn a series of columns and convert them into a single XML column that has some obvious benefits. In the above example the data is then written to a single table in my SSIS logging database which has a data type of XML allowing the data to be viewed in a number of ways. Extra information can be added into the data flow that specifies where the error occurred, the number of rows etc, again creating detailed and useful reporting. Want to know the number of early arriving for a particular package or even a particular dimension, easily done using the data in the table. Finally a warning can be raised using script to inform Barry, remember him, that some data problems occurred but now it’s a warning and can be handed over to the department in charge of wielding the large stick that is used to stop sales people logging sales before the customer is correctly created. Furthermore the system can be automated to alert said department taking Barry completely out of the loop. But wait a second, Barry is now worried about losing his job but luckily he’s been learning SSIS in his spare time.

It’s ok people, Barry is fine…!

This Week I am Mostly Listening To...



Boy Kill Boy - Civilian