Tuesday, November 27, 2007

This Week I Have Been Mostly Listening To...

The Wombats - The Wombats Proudly Present..A Guide To Love, Loss and Desperation

Wednesday, October 24, 2007

This Week I Have Been Mostly Listening To...

Overpowered - Roisin Murphy

Organising SSIS

I think this has probably been blogged about and documented elsewhere to death but it's well worth posting something about (even as a reminder to myself).

SSIS package configuration is great, but as always, as people use something heavily its usage gets more sophisticated and so do the problems. One thing I like is to be able to centrally log and also centrally get configuration values, but to do this I need to be able to support multiple environments (dev, test, etc.) and also multiple applications (the same variable used in different packages). This is where I've found the configuration to be a little limited, so in true developer fashion we came up with an alternative configuration mechanism. I won't go into the details but effectively we get all configuration values from a call to a database.
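
For anyone who wants a flavour of what I mean, here's a rough sketch of the idea (the table, column and application names are invented for illustration rather than being the actual implementation): a single table keyed on environment and application that the package queries once at start-up.

-- Hypothetical configuration table: one row per environment / application / setting.
CREATE TABLE dbo.PackageConfiguration
(
    EnvironmentName varchar(20)   NOT NULL,  -- 'DEV', 'TEST', 'PROD', etc.
    ApplicationName varchar(50)   NOT NULL,  -- the SSIS system the value belongs to
    SettingName     varchar(100)  NOT NULL,  -- maps to a package variable
    SettingValue    nvarchar(500) NOT NULL,
    CONSTRAINT PK_PackageConfiguration
        PRIMARY KEY (EnvironmentName, ApplicationName, SettingName)
);

-- A package would pull back all of its values in a single call.
SELECT SettingName, SettingValue
FROM   dbo.PackageConfiguration
WHERE  EnvironmentName = 'DEV'
  AND  ApplicationName = 'SalesWarehouseLoad';  -- example application name only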

The only problem with this is that I want to be able to tell the package where the configuration database is, what environment I'm executing in or the application name I'm running under, but I can't do that without hard coding a config file into the package. Of course there are alternatives, such as a number of server variables or registry entries, but really I just want to do this once and have control over everything else.

In my perfect world I would want to structure my data in something similar to this;

[Image: example SSISRoot folder structure, with subfolders for configuration files and source files]

This contains pretty much everything I need all in a single location. I can have multiple configuration files for different SSIS based systems, a place for source files and if I wanted to drop in XSD files I could create a new folder or just stick it in a suitable existing location. But this is when these little issues creep in.

How do I tell my package where the SSISRoot folder is? I don't want to use configuration files as that defeats the object of having a single location where I keep my configuration files; very chicken and egg. So what alternatives are there? An environment variable can be used to point to the config file location, but then I need an environment variable for every config file I have, once again defeating the object. What about creating an environment variable that points to SSISRoot? That would deliver what I need. I tried this. I created the variable and called it from Start>Run as %SSISRoot%; it worked and I got the folder location that I wanted. The next thing to do was to put "%SSISRoot%/Config/SSISConfigFile.dtconfig" into the path for the file; however, that had little success, and when I say little I mean none whatsoever.

So I'm back to square one: hard coding the package config file into the packages. This is one of those issues where I would be grateful to hear about any clever workarounds, so please let me know as this is getting increasingly frustrating.....!

Thursday, October 11, 2007

UPDATE to Multiple Pick List Filters in PPS 2007? Get RTM, Quickly..!

Yesterday I was over the moon with joy after finding that the RTM version of PPS solved my little issue. How wrong I was: performing the upgrade on another server presented me with exactly the same problem.

After some mad panicking and searching for a solution we discovered that the version of PAS on the PPS virtual PC was different to the version on our server. Another little hunt around presented us with this....

Microsoft ProClarity Analytics Server 6.3.2209 Hotfix

...which subsequently led us to...

A hotfix rollup package is available for ProClarity Analytics Server 6.3

...and tucked away on there was this little gem...

Issues that this hotfix rollup package fixes

This hotfix rollup package fixes the following issues:

blah

blah

blah

When you run a query in Microsoft SQL Server Analysis Services, the query takes a long time if multiple parameters are specified. Additionally, the query may time out in Internet Information Services (IIS).

Hey presto it all behaves itself now.

To summarise, you don't have to install the RTM PerformancePoint Server to get rid of the poorly performing MDX, just patch PAS, but I would still recommend getting the RTM for things like better performance in the UI and the fact that it now supports integration of Reporting Services on SharePoint.

Wednesday, October 10, 2007

Multiple Pick List Filters in PPS 2007? Get RTM, Quickly..!

PerformancePoint Server 2007 RTM has been up on MSDN subscriber downloads for a while now, well several days anyway, so I would imagine there aren't too many people who haven't got hold of it who are doing some serious mucking around in PPS.

If you're one of those people who are on the CTP but don't have access to the RTM then this may be of some interest.

As some of you may be aware, one of the report types you can embed into a PerformancePoint dashboard is a ProClarity Analytics Server page. This effectively provides the user with a ProClarity briefing book, allowing some interactivity, and as with almost all report types within PPS, filters can be applied externally by PPS, allowing, say, a PPS analytics grid, PAS page or Reporting Services report all to be displayed with the same parameters / dimension members being passed through.

The particular filter type that this post is concerned with is the multi-select. This is a typical dimension selection technique you've seen in plenty of tools before, so I won't teach you to suck eggs on that one. Anyway, to cut a long story short, after the dashboard has been published it's possible to preview it with your embedded PAS based report and filters.

Select a single member from your filter, apply it, and the report returns pretty quickly; not lightning fast, but remember this is a CTP and debugging is switched on etc., so that has to be taken into account. Select 2 members from the filter and, when the report is, say, performing a non empty on a dimension with members numbering in the thousands, it takes a little longer than it should, in fact uncomfortably longer.

Profile the poorly performing MDX and here's what you get....

WITH
  SET [<##<SET!smchugh!{E3EF8F62-CD2B-422D-881F-ABAFB4AA3E16}>##>]
    AS '{[Product].[Product Categories].[Subcategory].&[31]}'
  SET [<##<SET!smchugh!{844FB37C-1DA6-4536-B7A4-297650572576}>##>]
    AS '{[Product].[Product Categories].[Subcategory].&[1]}'
  MEMBER [Product].[Product Categories].[All Products].[ Aggregation]
    AS ' AGGREGATE( EXISTING { [<##<SET!smchugh!{E3EF8F62-CD2B-422D-881F-ABAFB4AA3E16}>##>],
                               [<##<SET!smchugh!{844FB37C-1DA6-4536-B7A4-297650572576}>##>] }) ',
    SOLVE_ORDER = 0
SELECT
  NON EMPTY { [Measures].[Internet Tax Amount], [Measures].[Internet Order Quantity] }
  PROPERTIES PARENT_UNIQUE_NAME, MEMBER_TYPE ON COLUMNS,
  NON EMPTY { [Customer].[Customer].[All Customers].CHILDREN }
  PROPERTIES PARENT_UNIQUE_NAME, MEMBER_TYPE ON ROWS
FROM
  [Adventure Works]
WHERE
  ( [Product].[Product Categories].[All Products].[ Aggregation] )
CELL PROPERTIES VALUE, FORMATTED_VALUE, CELL_ORDINAL

Take particular notice of the new member called [ Aggregation] that uses the 2 sets. If the sets are stripped out, the MDX can be run in a much more sensible manner, such as ....

WITH
  MEMBER [Product].[Product Categories].[All Products].[ Aggregation]
    AS ' AGGREGATE( EXISTING {
    [Product].[Product Categories].[Subcategory].&[31],
    [Product].[Product Categories].[Subcategory].&[1] }) ',
    SOLVE_ORDER = 0
SELECT
  NON EMPTY { [Measures].[Internet Tax Amount], [Measures].[Internet Order Quantity] }
  PROPERTIES PARENT_UNIQUE_NAME, MEMBER_TYPE ON COLUMNS,
  NON EMPTY { [Customer].[Customer].[All Customers].CHILDREN }
  PROPERTIES PARENT_UNIQUE_NAME, MEMBER_TYPE ON ROWS
FROM
  [Adventure Works]
WHERE
  ( [Product].[Product Categories].[All Products].[ Aggregation] )
CELL PROPERTIES VALUE, FORMATTED_VALUE, CELL_ORDINAL

...same data returned but with a massive time difference. It's worth actually running both these queries on [Adventure Works] as it's a good indication as to what a difference little changes in your MDX can make.

Suffice to say, the second query is how the MDX is now structured in PerformancePoint Server 2007 RTM, so it's well worth getting it installed as soon as possible: not only will queries return faster, but some nasty timeouts on your dashboard will be avoided.

PPS CTP4 VPC DOA

If there weren't enough acronyms in the world already....

A little tip from a colleague who has been working closely with PerformancePoint Server over the last couple of months and had the misfortune to use the Virtual PC from Microsoft to demo the toolset.

It performs like a dog. I can confirm this as I'm watching it "not respond" for no reason after an "indeterminable" amount of time. After somebody complained about this they were told that you have to defrag the hard drive within the virtual PC 10 times. I'm currently on defrag number 5 of 10.....

8 out of 10.....

10 out of 10.....

......

......

......

Well, after a little play around I have to admit that some non-scientifically measured performance benefits have been noted, oh, until it crashed on me, with the Dashboard Designer deciding that the task of saving needed 100% CPU and to deny any interaction whatsoever.

Unfortunately what I cannot answer with any clarity is....

  • whether 9 clicks will make it perform worse.
  • whether 11 clicks will make it perform better, or worse.
  • whether the Dashboard Designer needs a quad core Xeon to save anything......

Tuesday, October 09, 2007

Reporting Services and Virtual Directories, a Warning

Just a very quick post to make people aware of a little thing that happens with Reporting Services when you install it on a server under the same virtual directory structure as another application (e.g. you install it on a SharePoint server but don't allow SharePoint to control the report server, such as when you have PerformancePoint Server 2007 CTP4 and are using it with Reporting Services - mail me if it's confusing).

We found that Report Manager (http://servername:1234/reports) worked correctly but any access via the web service (http://servername:1234/reportserver) would return an error along the lines of...

Microsoft.ReportingServices.UI.WebControlConnection,
ReportingServicesWebUserInterface, Version=9.0.242.0, Culture=neutral,
PublicKeyToken=89845dcd8080cc91 does not implement
IReportServerConnection or could not be found

Spurious to say the least. The solution for us was to remove a couple of lines from the web.config file, within the <appSettings> tag, in the Reporting Services virtual directory...

<add key="ReportViewerServerConnection"
value="Microsoft.ReportingServices.UI.WebControlConnection,
ReportingServicesWebUserInterface, Version=9.0.242.0, Culture=neutral,
PublicKeyToken=89845dcd8080cc91" />
  <add key="ReportViewerTemporaryStorage"
value="Microsoft.ReportingServices.UI.ReportViewerTemporaryStorage,
ReportingServicesWebUserInterface, Version=9.0.242.0, Culture=neutral,
PublicKeyToken=89845dcd8080cc91" />

This gets the web service running but guess what, reports via the report manager won't work anymore.

The problem, I am reliably informed on 'tinternet, is that web.configs are inherited in IIS, therefore the Reporting Services web service inherits its web.config settings from the Report Manager web.config, which means that the settings you removed are needed for Report Manager but break the web service.

The solution is to add the values back into the web.config for Report Manager but make sure that the web.config for the web service virtual directory contains...

<appSettings>
    <clear/>
</appSettings>

Hey presto, both the report manager and the web service now function correctly.

Apologies if my descriptions of web things are a little on the vague side but I'm reading it as it happens.

This Week I Have Been Mostly Listening To...

Radio One Established 1967 - Various Artists

(and not just for the Girls Aloud cover of Teenage Dirtbag...)

SQLBits

Waking up early on a Saturday morning is not one of my greatest traits but I managed to overcome that weakness to attend SQLBits at the Microsoft campus in Reading at the weekend.

There was a track on BI as well as one on SQL Server 2008, so plenty for everyone to be interested in. The best moments for me were seeing my friend and colleague Sutha present on managing early arriving facts, and the new spatial data types in SQL Server 2008.

I need to go through the details of Keith Burns' presentation again before I can post about it with any real conviction, but the general consensus within the room was that "right here right now I don't know what I'm going to use this stuff for but I'm going to find a way to implement it". In other words, how we developer types define "cool".

BIDS and TFS

I'm a big fan of Team Foundation Server and how it can be used to better organise analysis, development and delivery. I've been especially impressed with a number of the features of Data Dude (or Visual Studio 2005 Team Edition for Database Professionals, for the full mouthful), certainly since SR1, and how, with some work, deployments can be made a lot more efficient.

Let me give you an example. I have multiple environments: dev, UAT and production. All of them will have the same code base but different configuration settings; for example, we will have a table containing configuration settings or source file locations, and these are all likely to change depending on what we are testing or deploying. Now, Data Dude allows you to embed variables in your pre- and post-deployment scripts, meaning that I don't have to change anything when it comes to things like granting an account read access to data on dev that should not have read access on prod.
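
As a flavour of what that looks like, here's a minimal post-deployment script sketch using a SQLCMD variable; the variable, table and account names are made up for illustration.

-- Post-deployment script sketch: $(EnvironmentName) is a SQLCMD variable whose value
-- is supplied per build configuration (dev, UAT, production) at deployment time.
IF '$(EnvironmentName)' = 'DEV'
BEGIN
    -- Only the dev environment gets the relaxed read access.
    GRANT SELECT ON dbo.ConfigurationSetting TO [MYDOMAIN\DevReportReaders];
END;

-- Environment-specific configuration rows.
UPDATE dbo.ConfigurationSetting
SET    SettingValue = '\\fileserver\$(EnvironmentName)\SourceFiles'
WHERE  SettingName  = 'SourceFileRoot';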

All pretty simple really and well worth installing for these kind of features (notice how I gloss over the cost....!)

Now to the point: "Visual Studio 2005 Team Edition for Business Intelligence Developers" has a nice ring to it, but stick it into Google and, at the time of writing, nothing comes back. I can think of a number of scenarios where non-intrusive deployment would be useful without having to rely on post-deployment XMLA, scripts using rs.exe or even the wonderful package deployment wizard.

Some of the things I've named may not seem too reasonable, but should you really have to go to a separate utility every time you want to release code that has multiple settings depending on the server, the environment or whether it's a 64-bit deployment or not? Analysis Services seems a prime candidate for this kind of behaviour as user access is defined in roles, but I don't want to have to define users within role membership then remove them, or go to the Analysis Services Deployment Wizard and define different configuration files, when I've invested in a nice shiny Team Foundation Server and trained everyone up on its use.

Anyway, I would be most interested in hearing any thoughts on the above subject, as this kind of thing is taking up a lot of time and more and more clients are jumping onto the TFS / agile bandwagon, one even demanding to know why this can be done with the databases and not the cubes. You see the point..!

SAP have BO....!

Oooooh, the smell..!

I couldn't resist the almost Register-like title for the post....

It appears that after many, many months of hearsay and conjecture, someone is finally going to stump up the cash and buy Business Objects. There's been a lot of talk around this area, especially with Hyperion being absorbed by Oracle and further rumours about the future of Cognos.

Microsoft seem to be content with buying the smaller players and partnering with similar smaller vendors to complement their own technologies, and now that they've gone their own route for budgeting and planning it seems very unlikely they will do anything else but carry on that trend.

It will be interesting to see what effect this has, if any, on the close ties that SAP and Microsoft have had recently. We shall see.

Wednesday, September 26, 2007

This Week I Have Mostly Been Listening To...

Air Traffic - Fractured Life

Developing with MDM

There have been some recent announcements on Microsoft's acquisition of Stratature that have begun to solidify their approach and roadmap for Master Data Management. The MDM tools, as most people had previously guessed, will be very much aligned with the Office toolset and made part of SharePoint. Personally, I think the key thing about this is how the data will be accessed in my day-to-day job of pulling data from here and putting it there for user consumption.

Jamie Thomson has a video on YouTube (embedded below) detailing how to consume web services in SSIS 2008. Couple this with Bulldog's (Microsoft's codename for the absorbed Stratature EDM+ product) ability to expose master data and meta data as web services, and all kinds of pre-built SOA components become possible.

Although possible with the existing set of tools in SSIS, I would be surprised if a data source component didn't appear at some point for SSIS 2008 that was specifically designed to retrieve master data from Bulldog. After the constant battles I've had getting master data out of organisations due to the cost of managing and maintaining it, having a tool that looks after the work flow and provides simple generic interfaces to the data will be very beneficial indeed.

Bring on the CTP next year....

Tuesday, September 04, 2007

The Last Month or so I Have Been Mostly Listening To...

Interpol - Our Love To Admire

The Young Knives - The Young Knives Are Dead... and Some

The Rumble Strips - Girls and Weather

Kings of Leon - Because of the Times

Extended Absence.....

After what appears to have been an eternity of non-posting (which coincidentally seems to occur every August: holidays, V Festival etc.), I shall attempt to make up for lost time.

So let's ease into this gently with music posts first....

Friday, June 29, 2007

Sorry it's been a bit quiet...

The lack of posts recently is down to my laptop being in the repair shop.....

No fault of mine, just general wear and tear. As soon as I get it back I'll have some stuff on SQL Server 2008, more MDM bits and some interesting things I've been up to with SharePoint 2007, not forgetting the things that I've been listening to. There's quite a lot of those as we enter festival season and yes, I will be making the traditional trip to the V Festival in Chelmsford, if anything just to see McFly...! (If you know me you know the truth).

Laters

Steve

Friday, June 08, 2007

Just when I was about to carry on talking about MDM...

...Microsoft go and announce this.

http://blogs.msdn.com/knight_reign/archive/2007/06/08/microsoft-completes-stratature-acquisition.aspx

Looks good for an end to end Microsoft architecture but it will be interesting to see what happens when it's working outside of its comfort zone.....

Tuesday, June 05, 2007

What do they say about boys and their toys....

Well, now that I have my new version of Writer I'm posting like a demon....

Anyway, let's get up to speed with some SQL Server 2008 web casts coming up in June...

SQL Server 2008 LiveMeeting Schedule

Analysis Services - Dimension Design 06/12/07 11:00 am PDT

Change Data Capture 06/13/07 11:00 am PDT

Star Join Query Optimizations 06/19/07 11:00 am PDT

Table Valued Parameters 06/22/07 11:00 am PDT

Declarative Management Framework 06/26/07 11:00 am PDT

MERGE SQL Statement 06/29/07 11:00 am PDT

Windows Live Writer Beta 2

I've blogged about it in the past and it's still a great little tool. You can get it from here.....

http://writer.live.com/

SQL Server 2008...

The June CTP is out....

Get it while it's hot....!

https://connect.microsoft.com/SQLServer/content/content.aspx?ContentID=5395

Wednesday, May 23, 2007

This Week I am Mostly Listening To...

The Rakes - Ten New Messages

The Evil That is Master / Meta Data - Part 2 (or the one where Steve talks about Ireland..!)

In the last post on this subject I talked about some of the key attributes of master data and meta data management and how it is intrinsically synched with what I call the Information or Data lifecycle. Now I would like to elaborate on this and identify some of the advantages that investing in this kind of strategy can provide.

Embarking on corporate-wide gathering of all things data requires investment at all levels. Time, effort, money and, most importantly, commitment are essential. But ensuring a business receives any kind of ROI (return on investment, which in the early part of my career I thought meant Republic of Ireland, hence the title) before that kind of commitment takes place can prove daunting and a little difficult to say the least. Let's forget the cons for a moment and look at some examples of the key advantages a data and information strategy can provide.

  • Everyone on the same page.
  • Information means the same thing throughout the business.
  • Reduced cost of new reporting and / or analytical requirements.

This doesn't look like a very extensive list and, to be honest, if someone presented me with a pitch like this I would be showing them the exit pretty rapidly, but when you further examine the nature of each of these bullets you see that they are rooted incredibly deeply in practically all business processes and systems in place within an organisation. Key to the whole concept is that meta data and master data are not only for use within reporting systems; it's just that any projects that tend to drive this kind of requirement are also implementing some kind of reporting mechanism.

Let's take a step back and look at a simplified implementation of a number of reports. First we gather the requirements for the reports, which would be based upon an existing set of data: possibly sheets of paper, possibly a database storing transactions. Then comes the nasty business of performing analysis on said data, conforming it into your existing dimensional structure or creating new dimensions from scratch. Once you have defined the model on which you will base your reporting, you can finally start building reports.

So how could we improve this process and reduce the time taken to turn around a reporting requirement? First, having some degree of knowledge of the system prior to a reporting requirement coming along would be advantageous, but that's not the way the world works. Looking at a single report as a deliverable, we would need to understand where the constituent data is sourced from. The report, for example, has customers, geographical breakdown, product type, number of orders and order value. Very simple, but already pulling data perhaps from CRM, product catalogue and ordering systems.

When building a picture of the data held within the company it is very important that ownership is established. Who owns the customer data? Who is responsible for maintaining the product catalogue? These are the people that own these data elements within the organisation and are therefore ensuring the quality of not only the data in their own systems but also the reporting that is based on this data.

The point to this is that data quality needs to come from the top down. BI projects are generally just the catalyst for this but should also be used as a means of improvement in the source systems. Too often data cleansing has been hooked on to the back of a BI project, weighing it down with responsibility that should lie elsewhere.

Ok enough of this business type talk of responsibility and stuff. Next time I'm going to go into what master data and meta data are actually made of.

Unit Testing SSIS

SSIS packages are almost mini applications in their own right. On most occasions there are one or more input items of data that may consist of either a single variable value or an entire data set. This will go through a transformation or validation process before a final output that could again be in a number of different formats.

Unit testing these packages should involve minimal change to the structure or behaviour of the package itself. The risk of influencing the code behaviour through the testing process is as great as an incorrect deployment or missing a particular testing scenario.

The most important factor in the process of testing a package is to understand how it will react in controlled circumstances and be in a position to test the success of the anticipated result. Testing this using the natural output of the package, for the reasons discussed previously, will provide the most robust results.

Due to some of the current debug limitations of SSIS and taking into account the need to keep the package structure and design static; it is only really possible to effectively test the control flow of a package whilst remaining ‘hands off’.

Let's look at a simple package example; say one with an Execute SQL task that truncates a destination table and a File System task that archives the source file.

The same type of output would be taken from each of the standard control flow tasks. The Execute SQL task would have an initial state, a completion state and a resulting success state based on those outputs: an initial state of n rows in the destination table before the step has executed and 0 rows after. This is measured by examining the row count in the table before and after execution and comparing the values with the expected result. For the archive File System task all states would be measured using a different mechanism, and so on and so forth.
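
To make that concrete, here's a rough sketch of the kind of check I mean for the truncate step; the table and logging object names are hypothetical.

-- Hypothetical before/after measurement for the Execute SQL (truncate) task.
DECLARE @RowsBefore int, @RowsAfter int;

SELECT @RowsBefore = COUNT(*) FROM dbo.DestinationTable;  -- initial state: n rows

TRUNCATE TABLE dbo.DestinationTable;                      -- the step under test

SELECT @RowsAfter = COUNT(*) FROM dbo.DestinationTable;   -- completion state

-- Compare the observed result with the expected one (0 rows after the truncate)
-- and record it so the test run can be reported on afterwards.
INSERT INTO dbo.UnitTestResult (TaskName, RowsBefore, RowsAfter, Passed)
VALUES ('Truncate destination', @RowsBefore, @RowsAfter,
        CASE WHEN @RowsAfter = 0 THEN 1 ELSE 0 END);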

Essentially this means that whatever the task there may be numerous methods of gathering the data necessary to confirm whether the test process has been successful. Simplifying the process of measuring the results of a test would make applying a standard testing mechanism far easier to implement.

Currently packages provide the ability to perform some manual logging, as I've posted about in the past. This can be used to establish whether tasks have completed successfully or not, but where a measurement is needed to confirm this, this type of logging is lacking. For example, truncating a table will not provide you with a row count confirming the number of rows affected or the subsequent number of rows left in the table, whilst a delete statement would. It would not be wise to change all truncates to deletes just to allow this information to bubble up to the logging process and therefore allow capture of the state of the task before and after execution.
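
A quick illustration of that difference, with a made-up table name (the two halves are separate illustrations rather than one meaningful script):

-- A DELETE reports how many rows it removed...
DELETE FROM dbo.StagingTable;
SELECT @@ROWCOUNT AS RowsDeleted;

-- ...whereas a TRUNCATE gives no usable affected-row count, so the only way to
-- capture that state is to count the rows explicitly before the statement runs.
DECLARE @RowsBefore int;
SELECT @RowsBefore = COUNT(*) FROM dbo.StagingTable;
TRUNCATE TABLE dbo.StagingTable;
SELECT @RowsBefore AS RowsRemovedByTruncate;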

There are a number of different ways of trying to perform strict, robust SSIS unit testing, which I've generalised into three options.

The first is to create a more robust testing methodology that has processes specifically for unit testing SSIS packages based on the control flow. The success criteria should be based on use cases defined at the package design stage, testing each of the steps within the control flow of the package with documented pre-execution criteria, expected output and the subsequent result.

This does not lend itself to any kind of automated testing approach and would involve stepping through the package control flow and logging the results.

The second option allows for the automation of tests and the capture of results but would require another application to control the execution of the process and establish pre-execute criteria in addition to measuring the post-execute result.

Using the unique identifier of each control flow task, during the pre-execute and post-execute phase of each step a lookup would be made to either a database or a configuration file containing the method call that would initiate the processes necessary to prepare the execution and subsequently validate and measure the success on post-execute.
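
Purely as a sketch of that idea (the table, columns and GUID below are invented), the lookup might be keyed on the task's unique identifier:

-- Hypothetical lookup used by the external test harness: for each control flow task
-- GUID it records which procedures prepare the execution and which validate it.
CREATE TABLE dbo.PackageUnitTest
(
    TaskGuid        uniqueidentifier NOT NULL PRIMARY KEY,
    TaskName        nvarchar(128)    NOT NULL,
    PreExecuteProc  sysname          NOT NULL,  -- sets up the pre-execution state
    PostExecuteProc sysname          NOT NULL,  -- validates and measures the result
    ExpectedResult  nvarchar(256)    NULL       -- e.g. '0 rows in dbo.DestinationTable'
);

-- During the pre-execute and post-execute phases the harness resolves what to run.
SELECT PreExecuteProc, PostExecuteProc, ExpectedResult
FROM   dbo.PackageUnitTest
WHERE  TaskGuid = '00000000-0000-0000-0000-000000000001';  -- placeholder GUID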

A change like this would mean integrating a procedure call, perhaps using the CLR, to execute such tasks based on an internal variable indicating that unit test, or debug, mode was enabled within the package. Whilst providing a number of advantages in automated testing and the capture of test results, there would be a great deal of work required in the preparation of the tests. This would all have to be completed in addition to that suggested in the first option, as the pseudo code necessary to design each of the test criteria would be based on the use cases defined at the package design stage.

The final option would be to remove the more advanced mechanism from option 2. The pre-analysis and use case definition would still be required, but in this option additional test scripts would be placed in the pre- and post-execution events of the package. This would mean embedding code into the package that would only be used during unit testing and could possibly be switched off using a similar variable to that in the second option.

Though it would be possible to automate a great deal of the testing for the package, it would mean changing the structure or behaviour of the package directly and increasing the danger of introducing the kind of problems with script tasks that have previously been experienced on 32-bit to 64-bit transitions.

So there you go. There are ways of doing SSIS unit testing. There are even ways of automating the process but it's not a quick win. It doesn't remove the need to establish good formal testing strategies and it certainly isn't going to appear out of a box, well not soon anyway.

Wednesday, May 16, 2007

This Week I am Mostly Listening To...

The Pigeon Detectives - Wait For Me

Tuesday, May 01, 2007

This Week I am Mostly Listening To...

The Maccabees - Colour It In

Tuesday, April 03, 2007

The Evil That is Master / Meta Data - Part 1 (or the one where Steve talks about socks..!)

Master data and metadata. A subject close to my heart due to its significant importance in what I call the data lifecycle. Data? Lifecycle? What on earth is he talking about now, I just wanted to get Oracle talking to SSIS? Well, let's go a little off subject here and use a bit of an analogy.

Take something simple that I think we all learnt at school, the water cycle. This is the continuous movement of water as it shifts location and state from ocean to atmospheric to ground water. Now I liken this to the way data moves through an entity whether it be an organisation or group of systems.

A good example of this is a common scenario in financial reporting. An accountant (that's the cloud up there) will read their profit and loss report for a particular department and use this to calculate the following year's budget or forecast. These estimates will then be entered into the budgeting and planning system (that would be the mountains; more likely though it's Excel :). The budget and forecast are imported into the data warehouse where the profit and loss report (the ocean perhaps?) is generated, which is read by another accountant looking at the company's performance, who........ ad infinitum.

A very typical example, but it demonstrates the fact that the behaviour of data within an organisation is very organic and in a constant state of flux. Just because the original piece of data is sitting in a table somewhere, it doesn't mean it hasn't evolved into a different beast elsewhere with different properties and meanings. Simple as it sounds, this makes life a little complicated when you add influencing factors such as SOX (Sarbanes-Oxley) compliance, which requires the demonstrability of internal controls. In BI speak this could be someone changing an attribute of a dimension member, and proving who did it, when and why. One tiny change which to a developer may be minor but to a CEO moves them from the red to the black; exactly the kind of thing SOX tries to stop.

Now all this talk of oceans, cycles and socks is all very good but doesn't bring us any closer to knowing what the hell to do about managing master and meta data. OK, let's break down some of the things I've mentioned into some key bullets.

  • Dimension Management
  • Compliance
  • Process Flow

This list identifies some of the major reasons for having a meta data and master data management mechanism, and some of the requirements of one.

In the next part I'll cover these elements in more detail and how they can contribute to a more streamlined data strategy.

Monday, April 02, 2007

This Week I am Mostly Listening To....

Arcade Fire - Neon Bible

If an Alien Dropped Down, Right Here, Right Now....

I have just had the displeasure / pleasure (please delete as necessary) of spending a little time in Barcelona witnessing the English football team's victory over Andorra.

According to Wikipedia Andorra is a co-principality with the President of France and the Bishop of Urgell, Spain as co-princes.

According to the majority of the travelling supporter base of the England team, Andorrans are either taxi drivers, estate agents or butchers, with the odd policeman thrown in for good measure.

So where does the alien in the title come in? Well, the alien that dropped into the Estadi Olimpic on Wednesday night would have probably thought;

  1. The human population is 98% male
  2. The human population is 98% hairless
  3. The human population is unable to construct sentences coherently
  4. The human population are masochists.

For the love of God, let's hope the aliens go to a Brazil game instead...!

Friday, March 16, 2007

This Week I am Mostly Listening To...

The View - Hats Off to the Buskers

rsInternalError - Now This One's a Doozy...!

If you're running Reporting Services and keep getting random rsInternalErrors that may or may not return errors in the logs concerning things like "There is no data for the field at position n", before you start to rip apart your reports looking for bugs, check the make of the processors in your server.

There is a little problem-ette that manifests itself as these kinds of errors when you have AMD processors in your machine. We have 2 servers on a client's site that have the 64-bit Opteron chips in and were both demonstrating these errors with different reports, data sources and pretty much anything you can imagine. After a little hunt around the information hyper global mega net we found some newsgroup postings pointing to this Microsoft knowledge base article.

http://support.microsoft.com/default.aspx/kb/895980/en-us

Adding the /usepmtimer switch to the boot.ini file appears, for now, to have cured the problem that was causing very simple reports to fail 20-30% of the time on an under-stressed production server.

After my recent problems with SSIS 64bit this was the last thing I needed but suffice to say my cup of joy was overflowing as a dev server that pretty much gave me an rsInternalError on command has worked flawlessly for several hours. If anything changes I'll be sure to mention it.

Thursday, February 15, 2007

This Week I am Mostly Listening To...

Bloc Party - A Weekend in the City

Tuesday, February 06, 2007

Other Blogger's

A colleague of mine has started blogging recently, mostly in and around the SSIS area, and I can categorically say he knows what he's talking about.

Check it out here....... Colin's blog

Monday, February 05, 2007

This Week I am Mostly Listening To...

Cold War Kids - Robbers & Cowards

The Adventures of SSIS and Oracle in 64bit Land..!

There's plenty of information out there spread across the web concerning migration of packages to a 64-bit platform. I thought, in the interests of fellow professionals out there, I should share some of my experiences.

Firstly, installation. Look at the C: drive on a 64-bit Windows Server 2003 installation and you'll notice 2 "Program Files" folders, one of which is suffixed (x86). Any 64-bit applications are installed in the Program Files folder and 32-bit applications are installed in the Program Files (x86) folder. So far so good and pretty simple, but this is where you will encounter the first of several problems getting Oracle connectivity to behave on a 64-bit server.

Before I go into any detail here's a list of things that you need to know when you have installed 64bit SQL Server on an x64 server.

  • Visual Studio (BI Dev Studio) is a 32-bit application.
  • SQL Server Management Studio is a 32-bit application.
  • dtexecui.exe is a 32-bit application.
  • dtexec.exe has both 32-bit and 64-bit versions.
  • There are x64 and x86 versions of the Oracle provider available.
  • SQL Agent is a 64-bit process.

OK, with all this information let's begin. Firstly, if you develop packages on a 32-bit platform, which is likely, and you are connecting to Oracle, use the Oracle OLEDB provider, NOT the Microsoft provider for Oracle. For starters there is no 64-bit version of the Microsoft provider, and secondly MS has no plans to create and support one.

Now that you're using the Oracle OLEDB provider we should be able to avoid a few problems later on, although I would like to make a note of one problem I had but couldn't find a resolution for. In a project I'm working on we needed to execute an Oracle stored procedure before we could access the data. When we ran the procedure using the MS provider (MSDAORA) everything was fine, but as soon as we moved onto the Oracle provider (Oracle.OLEDB.1) the package would just hang as soon as it attempted to access Oracle. This magically rectified itself when we moved servers. Just letting you know, so beware.

Launching the 64-bit Oracle Universal Installer will provide you with four options: Instant Client, Runtime, Custom or Administrator. Now, I am not any kind of Oracle expert, so I didn't go near the "Custom" option and stuck with the other three default selections, but I could only get the Administrator install to provide me with a working, registered Oracle provider. The others may work for someone else, but on this server only Administrator worked.

With the Oracle provider installed and my connection string configured to use the Oracle.OLEDB.1 provider, there's a series of interesting little tests you can run. First create a package that just sucks some data out of an Oracle table and places it into a Row Count data flow task. We don't want to do anything with the data, just get the Oracle end of this process working. Execute the package in the following ways;

  1. Double click on the package and execute it.
  2. Execute it through the command line interface (using the dtexec.exe in C:\Program Files (x86)\Microsoft SQL Server\..\dtexec.exe).
  3. Execute it through the command line interface (using the dtexec.exe in C:\Program Files\Microsoft SQL Server\..\dtexec.exe).
  4. Execute it as a SQL Server job.

The interesting thing is that the first two methods of executing the package will fail and the second two will be successful. Why? Well, this is because a 32-bit process cannot see the 64-bit drivers. The first method initiates dtexecui.exe, which is a 32-bit process, the same as the dtexec.exe in the (x86) program files folder. Running the process using the dtexec.exe in the normal Program Files directory, or as a SQL Server job, executes the package as a 64-bit process which quite happily sees the 64-bit Oracle drivers.
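
To show what I mean by that last route, here's a minimal sketch of an Agent job step that calls the 64-bit dtexec.exe explicitly; the job name, package path and the exact dtexec folder are assumptions for illustration.

USE msdb;
GO
-- Sketch only: a CmdExec job step pointing at the dtexec.exe under "Program Files"
-- (not "Program Files (x86)") so the package runs as a 64-bit process.
EXEC dbo.sp_add_job @job_name = N'Oracle package test (64-bit)';
EXEC dbo.sp_add_jobstep
    @job_name  = N'Oracle package test (64-bit)',
    @step_name = N'Run package via 64-bit dtexec',
    @subsystem = N'CmdExec',
    @command   = N'"C:\Program Files\Microsoft SQL Server\90\DTS\Binn\dtexec.exe" /FILE "D:\SSIS\OracleTest.dtsx"';
EXEC dbo.sp_add_jobserver @job_name = N'Oracle package test (64-bit)';
GO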

This can prove to be a problem if you want to debug your package, as Visual Studio is 32-bit and can't see the Oracle drivers, meaning you'll get messages about things not being registered even though you know you registered them. Fortunately you can install the 32-bit and 64-bit Oracle drivers on the same machine, meaning each process type then has the ability to see the Oracle drivers. Run the same four methods of executing our test package, though, and the same thing will happen: the first two will still fail and the second two will succeed.

This issue is down to a problem with the Oracle provider. The Oracle provider appears to have a particular dislike of being used by processes whose path contains a bracket, Program Files (x86) being the obvious one. So even though you have correctly installed the 32-bit Oracle providers on the server, you can't use them. The solution: copy the Microsoft SQL Server folder and the Visual Studio folder into a new folder on your C: drive called ProgramFilesx86, and all of a sudden your development environment can see the Oracle drivers and works. You need to use the dtexecui.exe in there as well if you want to execute packages interactively.

Remember, though, this is only an issue if you need to debug packages that use Oracle providers on a 64-bit server. Quite a specific problem, but quite common nonetheless.

Now you've got that problem solved(ish), you can start to build beautiful, complex packages utilising every control flow and data flow task available to you. But then you get strange errors, ones that are centred around script tasks especially, BUT not always.

Personally I've had a couple of issues moving packages onto 64-bit servers, excluding the Oracle problems mentioned above. First, script tasks sometimes need to be recompiled on the 64-bit hardware. This is kind of hit and miss, but appears to be common in scenarios where you have a template package and you copy it and change the definitions in data flows. The recommendation is to open up each package and script on 64-bit to be on the safe side, although I have had a number of occasions where we've had to delete the script task and recreate it to get the same code working correctly. Very odd.

This issue doesn't seem to be isolated to script tasks though. Execute Package tasks sometimes seem to show similar behaviour, where reopening the package on the 64-bit hardware corrects it.

Overall the benefits of SSIS and 64-bit are obvious, but be careful during development. If you can't develop in 64-bit then try to at least have a 64-bit server for testing this kind of thing, and if you do get any strange issues cropping up, try to bear in mind the apparent inconsistencies between the two platforms.

Update:

Corrected some spelling mistakes and disgustingly bad grammar..!
