Tuesday, October 09, 2007

SAP have BO....!

Oooooh, the smell..!

I couldn't resist the almost Register-like title for the post....

It appears after many many months of hearsay and conjecture someone is finally going to stump up the cash and buy Business Objects. There's been a lot of talk around this area especially with Hyperion being absorbed by Oracle and further rumors about the future of Cognos.

Microsoft seem to be content with buying the smaller players and partnering with similar smaller vendors to complement their own technologies. Now that they've gone their own route for budgeting and planning, it seems very unlikely they will do anything but carry on that trend.

It will be interesting to see what effect this has, if any, on the close ties that SAP and Microsoft have had recently. We shall see.

Wednesday, September 26, 2007

This Week I Have Mostly Been Listening To...

Air Traffic - Fractured Life

Developing with MDM

There have been some recent announcements on Microsoft's acquisition of Stratature that have begun to solidify their approach and roadmap for Master Data Management. The MDM tools, as most people had previously guessed, will be very much aligned with the Office toolset and made part of SharePoint. Personally I think the key thing about this is how this data will be accessed in my day-to-day job of pulling data from here and putting it there for user consumption.

Jamie Thompson has a video on YouTube (embedded below) detailing how to consume web services in SSIS 2008. Couple this with Bulldog's (Microsoft's name for absorbing Stratature's EDM+ product) ability to expose the master data and meta data as web services and all kinds of pre-built SOA components become possible.

Although this is already possible with the existing set of tools in SSIS, I would be surprised if a data source component didn't appear at some point for SSIS 2008 that was specifically designed to retrieve master data from Bulldog. After the constant battles I've had getting master data out of organisations due to the cost of managing and maintaining it, having a tool that looks after the workflow and provides simple generic interfaces to the data will be very beneficial indeed.

Bring on the CTP next year....

Tuesday, September 04, 2007

The Last Month or so I Have Been Mostly Listening To...

Interpol - Our Love To Admire

The Young Knives - The Young Knives Are Dead... and Some

The Rumble Strips - Girls and Weather

Kings of Leon - Because of the Times

Extended Absence.....

After what appears to have been an eternity of non-posting (which coincidentally seems to occur every August: holidays, V Festival, etc.), I shall attempt to make up for lost time.

So let's ease into this gently with music posts first....

Friday, June 29, 2007

Sorry it's been a bit quiet...

The lack of posts recently is down to my laptop being in the repair shop.....

No fault of mine, just general wear and tear. As soon as I get it back I'll have some stuff on SQL Server 2008, more MDM bits and some interesting things I've been up to with SharePoint 2007, not forgetting the things that I've been listening to. There's quite a lot of those as we enter festival season and yes, I will be making the traditional trip to the V Festival in Chelmsford, if only to see McFly...! (If you know me you know the truth).

Laters

Steve

Friday, June 08, 2007

Just when I was about to carry on talking about MDM...

...Microsoft go and announce this.

http://blogs.msdn.com/knight_reign/archive/2007/06/08/microsoft-completes-stratature-acquisition.aspx

Looks good for an end to end Microsoft architecture but it will be interesting to see what happens when it's working outside of its comfort zone.....

Tuesday, June 05, 2007

What do they say about boys and their toys....

Well, now I have my new version of Writer I'm posting like a demon....

Anyway, let's get up to speed with some SQL Server 2008 webcasts coming in June...

SQL Server 2008 LiveMeeting Schedule

Analysis Services - Dimension Design 06/12/07 11:00 am PDT

Change Data Capture 06/13/07 11:00 am PDT

Star Join Query Optimizations 06/19/07 11:00 am PDT

Table Valued Parameters 06/22/07 11:00 am PDT

Declarative Management Framework 06/26/07 11:00 am PDT

MERGE SQL Statement 06/29/07 11:00 am PDT

Windows Live Writer Beta 2

I've blogged about it in the past and it's still a great little tool. You can get it from here.....

http://writer.live.com/

SQL Server 2008...

The June CTP is out....

Get it while it's hot....!

https://connect.microsoft.com/SQLServer/content/content.aspx?ContentID=5395

Wednesday, May 23, 2007

This Week I am Mostly Listening To...

The Rakes - Ten New Messages

The Evil That is Master / Meta Data - Part 2 (or the one where Steve talks about Ireland..!)

In the last post on this subject I talked about some of the key attributes of master data and meta data management and how it is intrinsically synched with what I call the Information or Data lifecycle. Now I would like to elaborate on this and identify some of the advantages that investing in this kind of strategy can provide.

Embarking on a corporate-wide gathering of all things data requires investment at all levels. Time, effort, money and, most importantly, commitment are essential. But ensuring a business receives any kind of ROI (return on investment, which in the early part of my career I thought meant Republic of Ireland, hence the title) before that kind of commitment takes place can prove daunting and a little difficult to say the least. Let's forget the cons for a moment and look at some examples of the key advantages a data and information strategy can provide.

  • Everyone on the same page.
  • Information means the same thing throughout the business.
  • Reduced cost of new reporting and / or analytical requirements.

This doesn't look like a very extensive list and, to be honest, if someone presented me with a pitch like this I would be showing them the exit pretty rapidly. But when you further examine the nature of each of these bullets, you see that they are rooted incredibly deeply in practically all the business processes and systems in place within an organisation. Key to the whole concept is that meta data and master data are not only for use within reporting systems. It's just that the projects that tend to drive this kind of requirement are usually also implementing some kind of reporting mechanism.

Let's take a step back and look at a simplified implementation of a number of reports. First we gather the requirements for the reports, which would be based upon an existing set of data, possibly sheets of paper, possibly a database storing transactions. Then comes the nasty business of performing analysis on said data, conforming it into your existing dimensional structure or creating new dimensions from scratch. Once you have defined the model on which you will base your reporting, you can finally start building reports.

So how could we improve this process and reduce the time taken to turn around a reporting requirement? First, having some degree of knowledge of the system prior to a reporting requirement coming along would be advantageous, but that's not the way the world works. Looking at a single report as a deliverable, we would need to understand where the constituent data is sourced from. The report, for example, has customers, geographical breakdown, product type, number of orders and order value. Very simple, but already pulling data perhaps from CRM, product catalogue and ordering systems.

When building a picture of the data held within the company it is very important that ownership is established. Who owns the customer data? Who is responsible for maintaining the product catalogue? These are the people that own these data elements within the organisation and are therefore ensuring the quality of not only the data in their own systems but also the reporting that is based on this data.

The point of this is that data quality needs to come from the top down. BI projects are generally just the catalyst for this but should also be used as a means of improvement in the source systems. Too often data cleansing has been hooked on to the back of a BI project, weighing it down with responsibility that should lie elsewhere.

Ok enough of this business type talk of responsibility and stuff. Next time I'm going to go into what master data and meta data are actually made of.

Unit Testing SSIS

SSIS packages are almost like mini applications in their own right. On most occasions there are one or more input items of data that may consist of either a single variable value or an entire data set. This input goes through a transformation or validation process before a final output that could again be in a number of different formats.

Unit testing these packages should involve minimal change to the structure or behaviour of the package itself. The risk of influencing the code behaviour through the testing process is as great as an incorrect deployment or missing a particular testing scenario.

The most important factor in the process of testing a package is to understand how it will react in controlled circumstances and be in a position to test the success of the anticipated result. Testing this using the natural output of the package, for the reasons discussed previously, will provide the most robust results.

Due to some of the current debug limitations of SSIS, and taking into account the need to keep the package structure and design static, it is only really possible to effectively test the control flow of a package whilst remaining ‘hands off’.

Let's look at a simple package example;

The same type of output would be taken from the other standard control flow tasks. The execute SQL task would have an initial state, a completion state and a resulting success state based on the other outputs. For example, the initial state might be n rows in the destination table before the step has executed and 0 rows after. This is measured by examining the row count in the table before and after execution and comparing the value with the expected result. For the archive file system task all states would be measured using a different mechanism, and so on and so forth.
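As a rough illustration of the before and after row count measurement, here is a minimal Python sketch. It is not SSIS code: an in-memory SQLite table stands in for the destination table, and the names `row_count`, `check_step` and `clear_destination` are made up for the example.

```python
import sqlite3


def row_count(conn, table):
    # Table name is interpolated directly; acceptable here because the
    # test harness only ever passes trusted, hard-coded names.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def check_step(conn, table, step, expected_before, expected_after):
    """Run `step` and compare the row count before and after with expectations."""
    before = row_count(conn, table)
    step(conn)
    after = row_count(conn, table)
    return {
        "before": before,
        "after": after,
        "passed": before == expected_before and after == expected_after,
    }


# Stand-in for the destination table, with n = 3 rows as the initial state.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE destination (id INTEGER)")
conn.executemany("INSERT INTO destination VALUES (?)", [(1,), (2,), (3,)])


def clear_destination(c):
    # Stand-in for the execute SQL task under test.
    c.execute("DELETE FROM destination")


result = check_step(conn, "destination", clear_destination,
                    expected_before=3, expected_after=0)
print(result)
```

The point is only that the measurement sits outside the step being tested, so the step itself does not have to change.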

Essentially this means that whatever the task there may be numerous methods of gathering the data necessary to confirm whether the test process has been successful. Simplifying the process of measuring the results of a test would make applying a standard testing mechanism far easier to implement.

Currently packages provide the ability to perform some manual logging, as I've posted about in the past. This can be used to establish whether tasks have completed successfully or not, but where a measurement is needed to confirm this, this type of logging is lacking. For example, truncating a table will not provide you with a row count confirming the number of rows affected or the subsequent number of rows left in the table, whilst a delete statement would. It would not be wise to change all truncates to deletes just to allow this information to bubble up to the logging process and therefore allow capture of the state of the task before and after execution.

There are a number of different ways of trying to perform strict, robust SSIS unit testing, which I've generalised into 3 options.

First is to create a more robust testing methodology that has processes specifically for unit testing SSIS packages based on the control flow. The success criteria should be based on use cases defined at the package design stage. This means testing each of the steps within the control flow of the package with documented pre-execution criteria, expected output and the subsequent result.

This does not lend itself to any kind of automated testing approach and would involve stepping through the package control flow and logging the results.

The second option allows for the automation of tests and the capture of results but would require another application to control the execution of the process and establish pre-execute criteria in addition to measuring the post-execute result.

Using the unique identifier for each of the control flow tasks, a lookup would be made during the pre-execute and post-execute process of each step to either a database or a configuration file. This would contain the method calls that initiate the processes necessary to prepare the execution and subsequently to validate and measure its success on post-execute.
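To make the lookup idea a little more concrete, here is a small Python sketch of a registry keyed by task identifier. It is an assumption-laden illustration rather than anything SSIS provides: the identifier `task-clear-destination`, the `TaskTest` class and the `run_with_tests` function are all hypothetical, and a plain dictionary stands in for the database or configuration file.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TaskTest:
    prepare: Callable[[dict], None]  # called on pre-execute
    verify: Callable[[dict], bool]   # called on post-execute


# Stand-in for the database or configuration file, keyed by task identifier.
TESTS: Dict[str, TaskTest] = {
    "task-clear-destination": TaskTest(
        prepare=lambda state: state.update(rows=3),  # establish n rows
        verify=lambda state: state["rows"] == 0,     # expect 0 rows after
    ),
}


def run_with_tests(task_id, task, state, registry=TESTS):
    """Run `task`, wrapping it with any registered prepare/verify calls."""
    test = registry.get(task_id)
    if test is None:
        # No test registered for this task: just run it, nothing to report.
        task(state)
        return None
    test.prepare(state)
    task(state)
    return test.verify(state)


def clear_destination(state):
    # Stand-in for the control flow task under test.
    state["rows"] = 0


outcome = run_with_tests("task-clear-destination", clear_destination, {})
untested = run_with_tests("task-unknown", clear_destination, {})
print(outcome, untested)
```

Keeping the test definitions in a registry outside the task is what lets the package itself stay largely untouched.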

A change like this would mean integrating a procedure call, perhaps using the CLR, to execute such tasks based on an internal variable indicating that unit test, or debug mode was enabled within the package. Whilst providing a number of advantages in automated testing and capture of the test results there would be a great deal of work required in the preparation of the test. This would all have to be completed in addition to that suggested in the first option as the pseudo code necessary to design each of the test criteria would be based on the use cases defined at the package design stage.

The final option would be to remove the more advanced mechanism from option 2. The pre analysis and use case definition would still be required but in this option additional test scripts would be placed in the pre and post execution events of the package. This would mean embedding code into the package that would only be used during unit testing that could possibly be switched off using a similar variable to that in the second option.

Though it would be possible to automate a great deal of the testing for the package, it would mean changing the structure or behaviour of the package directly and would increase the danger of introducing problems with script tasks of the kind previously experienced on 32bit to 64bit transitions.

So there you go. There are ways of doing SSIS unit testing. There are even ways of automating the process but it's not a quick win. It doesn't remove the need to establish good formal testing strategies and it certainly isn't going to appear out of a box, well not soon anyway.

Wednesday, May 16, 2007

This Week I am Mostly Listening To...

The Pigeon Detectives - Wait For Me

Tuesday, May 01, 2007

This Week I am Mostly Listening To...

The Maccabees - Colour It In

Tuesday, April 03, 2007

The Evil That is Master / Meta Data - Part 1 (or the one where Steve talks about socks..!)

Master data and metadata. A subject close to my heart due to its significant importance in what I call the data lifecycle. Data? Lifecycle? What on earth is he talking about now, I just wanted to get Oracle talking to SSIS? Well, let's go a little off subject here and use a bit of an analogy.

Take something simple that I think we all learnt at school, the water cycle. This is the continuous movement of water as it shifts location and state from ocean to atmospheric to ground water. Now I liken this to the way data moves through an entity whether it be an organisation or group of systems.

A good example of this is a common scenario in financial reporting. An accountant (that's the cloud up there) will read their profit and loss report for a particular department and use this to calculate the following year's budget or forecast. These estimates will then be entered into the budgeting and planning system (that would be the mountains, more likely though it's Excel :). The budget and forecast are imported into the data warehouse where the profit and loss report (the ocean perhaps?) is generated, which is read by another accountant looking at the company's performance, who........ ad infinitum.

A very typical example, but it demonstrates the fact that the behavior of data within an organisation is very organic and in a constant state of flux. Just because the original piece of data is sitting in a table somewhere, it doesn't mean it hasn't evolved into a different beast elsewhere with different properties and meanings. Simple as it sounds, this makes life a little complicated when you add influencing factors such as SOX (Sarbanes-Oxley) compliance, which requires internal controls to be demonstrable. In BI speak this could be someone changing an attribute of a dimension member and being able to prove who did it, when and why. One tiny change may be minor to a developer but to a CEO it moves them from the red to the black, exactly the kind of thing SOX tries to stop.
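As a sketch of what recording the who, when and why of a dimension member change might look like, here is a small Python example using an in-memory SQLite database. The table names, columns and the `change_attribute` function are all invented for illustration, and this says nothing about what SOX actually mandates; it only shows the shape of such a record.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_account (member_id INTEGER PRIMARY KEY, name TEXT, sign TEXT);
CREATE TABLE dim_account_audit (
    member_id INTEGER, attribute TEXT, old_value TEXT, new_value TEXT,
    changed_by TEXT, changed_at TEXT, reason TEXT
);
INSERT INTO dim_account VALUES (1, 'Restructuring costs', 'debit');
""")


def change_attribute(conn, member_id, attribute, new_value, changed_by, reason):
    """Change one attribute of a dimension member and log who, when and why."""
    if attribute not in ("name", "sign"):
        raise ValueError(f"unknown attribute: {attribute}")
    if not reason:
        raise ValueError("a reason is required for every change")
    row = conn.execute(
        f"SELECT {attribute} FROM dim_account WHERE member_id = ?", (member_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no dimension member {member_id}")
    # The update and its audit record are committed together or not at all.
    with conn:
        conn.execute(
            f"UPDATE dim_account SET {attribute} = ? WHERE member_id = ?",
            (new_value, member_id),
        )
        conn.execute(
            "INSERT INTO dim_account_audit VALUES (?, ?, ?, ?, ?, ?, ?)",
            (member_id, attribute, row[0], new_value, changed_by,
             datetime.now(timezone.utc).isoformat(), reason),
        )


change_attribute(conn, 1, "sign", "credit", "steve", "Reclassified after review")
audit = conn.execute(
    "SELECT member_id, attribute, old_value, new_value, changed_by, reason "
    "FROM dim_account_audit"
).fetchall()
print(audit)
```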

Now all this talk of oceans, cycles and socks is all very good but doesn't bring us any closer to knowing what the hell to do about managing master and meta data. Ok, let's break down some of the things I've mentioned into some key bullets.

  • Dimension Management
  • Compliance
  • Process Flow

This list identifies some of the major reasons for having, and requirements of, any meta data and master data management mechanism.

In the next part I'll cover these elements in more detail and how they can contribute to a more streamlined data strategy.

Monday, April 02, 2007

This Week I am Mostly Listening To....

Arcade Fire - Neon Bible

If an Alien Dropped Down, Right Here, Right Now....

I have just had the displeasure / pleasure (please delete as necessary) to have spent a little time in Barcelona witnessing the English football team's victory over Andorra.

According to Wikipedia Andorra is a co-principality with the President of France and the Bishop of Urgell, Spain as co-princes.

According to the majority of the traveling supporter base of the England team, Andorrans are either taxi drivers, estate agents or butchers, with the odd policeman thrown in for good measure.

So where does the alien in the title come in? Well, the alien that dropped into the Estadi Olimpic on Wednesday night would have probably thought:

  1. The human population is 98% male
  2. The human population is 98% hairless
  3. The human population is unable to construct sentences coherently
  4. The human population are masochists.

For the love of God, let's hope the aliens go to a Brazil game instead...!

Friday, March 16, 2007

This Week I am Mostly Listening To...

The View - Hats Off to the Buskers

rsInternalError - Now This One's a Doozy...!

If you're running Reporting Services and keep getting random rsInternalErrors that may or may not return errors in the logs concerning things like "There is no data for the field at position n", then before you start to rip apart your reports looking for bugs, check the make of the processors on your server.

There is a little problem-ette that manifests itself as this kind of error when you have AMD processors in your machine. We have 2 servers on a client's site with 64bit Opteron chips that were both demonstrating these errors with different reports, data sources and pretty much anything you can imagine. After a little hunt around the information hyper global mega net we found some newsgroup postings pointing to this Microsoft knowledge base article.

http://support.microsoft.com/default.aspx/kb/895980/en-us

Adding the /usepmtimer switch in the boot.ini file appears, for now, to have cured the problem that was causing very simple reports to fail 20-30% of the time on an under stressed production server.
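For anyone who would rather script the change than edit by hand, here is a rough Python sketch that appends the switch to each entry under the [operating systems] section of some boot.ini text. The function name `add_boot_switch` and the sample file contents are made up for illustration; check the knowledge base article and back up the real file before touching it.

```python
def add_boot_switch(boot_ini_text, switch="/usepmtimer"):
    """Append `switch` to each [operating systems] entry that lacks it."""
    out = []
    in_os_section = False
    for line in boot_ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Track which section we are in; only OS entries take switches.
            in_os_section = stripped.lower() == "[operating systems]"
        elif (in_os_section and "=" in stripped
              and switch.lower() not in stripped.lower().split()):
            line = line.rstrip() + " " + switch
        out.append(line)
    return "\n".join(out)


# Made-up sample contents, not taken from a real server.
SAMPLE = r'''[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003" /fastdetect'''

updated = add_boot_switch(SAMPLE)
print(updated)
```

Running it a second time on its own output leaves the text unchanged, so the switch is never added twice.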

After my recent problems with SSIS 64bit this was the last thing I needed but suffice to say my cup of joy was overflowing as a dev server that pretty much gave me an rsInternalError on command has worked flawlessly for several hours. If anything changes I'll be sure to mention it.