Thursday, February 15, 2007

This Week I am Mostly Listening To...

Bloc Party - A Weekend in the City

Tuesday, February 06, 2007

Other Bloggers

A colleague of mine has started blogging recently, mostly in and around the SSIS area, and I can categorically say he knows what he's talking about.

Check it out here....... Colin's blog

Monday, February 05, 2007

This Week I am Mostly Listening To...

Cold War Kids - Robbers & Cowards

The Adventures of SSIS and Oracle in 64bit Land..!

There's plenty of information spread across the web concerning the migration of packages to a 64-bit platform. In the interests of fellow professionals out there, I thought I should share some of my experiences.

Firstly, installation. Look at the C: drive on a 64-bit Windows Server 2003 installation and you'll notice two "Program Files" folders, one of which is suffixed (x86). 64-bit applications are installed in the Program Files folder and 32-bit applications in the Program Files (x86) folder. So far so good and pretty simple, but this is where you will encounter the first of several problems getting Oracle connectivity to behave on a 64-bit server.

Before I go into any detail here's a list of things that you need to know when you have installed 64bit SQL Server on an x64 server.

  • Visual Studio (BI Dev Studio) is a 32bit application.
  • SQL Server Management Studio is a 32bit application.
  • dtexecui.exe is a 32bit application.
  • dtexec.exe has both 32bit and 64bit versions.
  • There are x64 and x86 versions of the Oracle provider available.
  • SQLAgent is a 64bit process.
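If you're ever unsure which flavour a given process actually is, the pointer size gives it away. A quick sketch in Python (the same trick, checking pointer width, works in most languages):

```python
import struct

def process_bitness() -> int:
    # A pointer is 4 bytes in a 32-bit process and 8 bytes in a 64-bit one.
    return struct.calcsize("P") * 8

print(process_bitness())  # 32 or 64, depending on the host process
```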

OK, with all this information, let's begin. Firstly, if you develop packages on a 32bit platform, which is likely, and you are connecting to Oracle, use the Oracle OLEDB Provider, NOT the Microsoft provider for Oracle. For starters there is no 64bit version of the Microsoft provider, and secondly MS has no plans to create and support one.

Now that you're using the Oracle OLEDB Provider we should be able to avoid a few problems later on, although I would like to note one problem I had but couldn't find a resolution for. In a project I'm working on we needed to execute an Oracle stored procedure before we could access the data. When we ran the procedure using the MS provider (MSDAORA) everything was fine, but as soon as we moved onto the Oracle provider (Oracle.OLEDB.1) the package would just hang as soon as it attempted to access Oracle. This magically rectified itself when we moved servers. Just letting you know, so beware.

Launching the 64bit Oracle universal installer will present you with four options: instant client, runtime, custom or administrator. Now, I am not any kind of Oracle expert, so I didn't go near the "custom" option and stuck with the other three default selections, but only the Administrator install provided me with a working, registered Oracle provider. Your mileage may vary, but on this server, only Administrator worked.

With the Oracle provider installed and my connection string configured to use the Oracle.OLEDB.1 provider, there's a series of interesting little tests you can run. First create a package that just sucks some data out of an Oracle table and places it into a row count data flow task. We don't want to do anything else, just get the Oracle end of this process working. Execute the package in the following ways:

  1. Double click on the package and execute.
  2. Execute it through the command line interface (using the dtexec.exe in "C:\Program Files (x86)\Microsoft SQL Server\..\dtexec.exe").
  3. Execute it through the command line interface (using the dtexec.exe in "C:\Program Files\Microsoft SQL Server\..\dtexec.exe").
  4. Execute it as a SQL Server job.

The interesting thing is that the first two methods of executing the package will fail and the second two will succeed. Why? Because a 32bit process cannot see the 64bit drivers. The first method initiates dtexecui.exe, which is a 32bit process, the same as the dtexec.exe in the (x86) program files folder. Running the package using the dtexec.exe in the normal Program Files directory, or as a SQL Server job, executes it as a 64bit process, which quite happily sees the 64bit Oracle drivers.
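The rule behind those four results can be boiled down to one line: a process can only load OLEDB providers of its own bitness. A hypothetical summary (method names are shorthand for the list above):

```python
# Each execution method runs the package in a process of a given bitness;
# with only the 64-bit Oracle provider installed, only 64-bit hosts work.
methods = {
    "dtexecui.exe (double-click)": 32,
    "dtexec.exe in Program Files (x86)": 32,
    "dtexec.exe in Program Files": 64,
    "SQL Server Agent job": 64,
}

installed_oracle_driver_bits = 64  # only the 64-bit provider is installed

results = {name: ("succeeds" if bits == installed_oracle_driver_bits else "fails")
           for name, bits in methods.items()}

for name, outcome in results.items():
    print(f"{name}: {outcome}")
```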

This can prove to be a problem if you want to debug your package, as Visual Studio is 32bit and can't see the Oracle drivers, meaning you'll get messages about things not being registered even though you know you registered them. Fortunately you can install the 32bit and 64bit Oracle drivers on the same machine, meaning each process type then has the ability to see the Oracle drivers. Run the same four methods of executing our test package, though, and exactly the same thing happens: the first two fail and the second two succeed.

This issue is down to a problem with the Oracle provider. The Oracle provider appears to have a particular dislike of being used by processes whose path contains a bracket, Program Files (x86) being the obvious one. So even though you have correctly installed the 32bit Oracle providers on the server, you can't use them. The solution: copy the Microsoft SQL Server folder and Visual Studio folder into a new folder on your C: drive called ProgramFilesx86, and all of a sudden your development environment can see the Oracle drivers and works. You need to use the dtexecui.exe in there as well if you want to interactively execute packages.
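The workaround is easier to see as a path transformation. A minimal sketch of the observed behaviour, assuming (as above) that any bracket in the process path trips the provider up; the exact folder layout is illustrative:

```python
def oracle_provider_will_load(process_path: str) -> bool:
    # Observed failure mode: the 32-bit Oracle client refuses to work for
    # processes whose path contains a bracket.
    return "(" not in process_path and ")" not in process_path

# Illustrative paths only; the exact install layout will vary.
original = r"C:\Program Files (x86)\Microsoft SQL Server\dtexec.exe"
workaround = original.replace(r"C:\Program Files (x86)", r"C:\ProgramFilesx86")

print(oracle_provider_will_load(original))    # False
print(oracle_provider_will_load(workaround))  # True
```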

Remember though, this is only an issue if you need to debug packages that use Oracle providers on a 64bit server. Quite a specific problem, but quite common nonetheless.

Now, with that problem solved(ish), you can start to build beautiful, complex packages utilising every control and data flow task available to you. But then you get strange errors. Ones that are centred around script tasks especially, but not always.

Personally I've had a couple of issues moving packages onto 64bit servers, excluding the Oracle problems mentioned above. First, script tasks sometimes need to be recompiled on the 64bit hardware. This is kind of hit and miss, but appears to be common in scenarios where you have a template package and you copy and change the definitions in data flows. The recommendation is to open up each package and script on 64bit to be on the safe side, although I have had a number of occasions where we've had to delete the script task and recreate it to get the same code working correctly. Very odd.

This issue doesn't seem to be isolated to script tasks though. Execute Package tasks sometimes show similar behaviour, where reopening the package on the 64bit hardware corrects it.

Overall the benefits of SSIS and 64bit are obvious, but be careful during development. If you can't develop in 64bit then try to at least have a 64bit server for testing this kind of thing, and if any strange issues crop up, bear in mind the apparent inconsistencies between the two platforms.

Update:

Corrected some spelling mistakes and disgustingly bad grammar..!


Monday, December 11, 2006

Making The Client Happy - The Real Reason For Yesterday's Post That Rapidly Turned Into Some Kind Of Rant...!

If there's one thing that users hate, it's being kept in the dark. So communication is essential, and to be honest BI is all about communication. Articulating a series of 0s and 1s, via OLAP or reporting, into something tangible that an end user can get real benefit from is what it's all about.

So we have a process that may rely on another process outside the control of the BI infrastructure. This is a typical scenario, and due to the 24-hour worldwide nature of a lot of the BI implementations I've worked with, that process may be smack bang in the middle of someone's working day.

Scenario established, what problems do we have here? Well, first there's the problem of processing whilst people are working, but with the performance tuning techniques available to us in the Microsoft BI space that shouldn't have much of an impact on your user base. Really you just want to deliver new data to the user as quickly as possible.

The real problem here is a user not knowing when their new data is available. Does that balance include yesterday's transactions yet or not? That's a typical question in financial departments at month end, and one that gets asked a lot more regularly than you would like.

So communication is a problem, or the lack of it anyway (see the way I did that, started off talking about it and brought it round again to the subject; you're impressed, I can tell..!).

What can be done about it? There are a number of clever little things we should be able to do as part of our BI profession: use our skill set and let the consumers of our data know the data is there to be consumed. First method: deploy a report at the end of processing to the user, or send an informational e-mail. Database Mail in SQL Server 2005 makes this a pretty simple task these days. This could be a standard mail telling the user base to go to a Reporting Services link and open a report describing what has happened in the data processing. The problem with this, though, is that it relies on a patient user base who will happily stare at the mailbox until that little shining light of hope gets sent from "SQLSVR978BSMNT" (I do miss the days when there were fewer servers and they had names based on characters from The Simpsons or something like that).

OK, so your client / user isn't patient, or it's 4:30 on the Friday before Christmas (topical, huh..!), either works, and you don't want to have to answer 20 calls from finance departments around the world demanding to know where the data is. Let's look at what they need to know. Take a typical end to end process; it may go something like this:

  • Extract data from one or more source systems.
  • Process said new data into the warehouse / mart / etc.
  • Process data into cube.
  • Distribute reports (or warm the cache; it's useful to have this catch-all at the end).

Now this is a high-level generalisation of what may be going on, but being able to let the users know this much gives them a massive increase in their perception of the system. Even just knowing that the source system data has not been extracted empowers the user, if they are aware of where the data is coming from (and why shouldn't they be?), to get on the phone and ask New York why their figures haven't been entered yet.

I've already talked about making this happen by sending reports out but, let's face it, this isn't the ideal way of doing it when there are numerous users and numerous steps within the process to communicate. So we make the user pro-active: go to the portal or website and find out what the hold-up is or where the problem lies. All we're doing is taking something that has been happening in customer service departments for years and applying it to the BI systems we put in place, and to be honest, with our experience we really should know how to provide someone with beneficial information in a number of formats.

So what is my point in all this? There has been an idea bubbling around my head for a few months now around this subject and visualising end to end processes for clients, mostly based on SSIS and the simple logging model I've posted about previously. The idea is to use Visio to expose data in a diagrammatic format to the users, showing them something similar to the SSIS control flow, you know, greens, reds, yellows. This became all the more interesting after the recent UK BI user group at Reading where David Parker was presenting on visualising data in Visio, essentially the same topic.

There, that's the idea; now I'm going to put it into practice. Stay tuned, and if anyone else has had any experience in doing what I've described then I would be really interested in hearing about it.

This Week I am Mostly Listening To...



Raconteurs - Broken Boy Soldiers

Thursday, November 30, 2006

Making The Client Happy...

Being a consultant involves meeting a lot of different people in a lot of different situations. Most of the people I meet are by and large pretty nice, but not all. I shall stop there for fear of incriminating myself :).

One thing is consistent throughout everything I do in my job though and that is that I try to deliver the best solution possible within the constraints placed upon me, whether that be ambitious requirements, time, money, equipment, coffee, etc, etc. I do take a certain pride in what I do and the best thing to come out of it at the end is the successful delivery of a project with the users grateful for the spanking new BI implementation my colleagues and myself have put in place. And hey, if you get to play around with a few extra toys then that's a bonus...

This may counter the myth that as a consultant my role is to bleed the client dry and ensure that the people who command my time have enough cash to bathe in asps' milk, ordering people to remove all of the blue M&Ms from large jars made from empty PlayStation 3 carcasses whilst drinking fine wine made from the grapes of pandas! (In my head I know what I mean).

I do what I do because I'm interested in it and it pays fairly well at the moment. Ideally I would be a tester at a major games developer, but I can't see that happening until the mortgage is paid off; until then I'll just keep BI-ing my heart out in the interest of the client.

There was a technical reason for this post but I think my rant went on long enough so I'll perhaps share that tomorrow when I've had a chance to investigate further..

Evening After The Night Before....

Last night I had the pleasure of attending the 2nd UK BI User Group, and I happened to ask Chris Webb when the next one might be; he answered probably in about 6 months. Now go onto one of the IT job boards and type in BI, SSIS, OLAP or any of the many acronyms that relate to this kind of thing and you'll be overloaded with agencies and other organisations offering large sums of money and incentives to come and join them.

What's my point?

Well with that kind of level of interest in BI over the last 18 months and that interest not looking like it will wane in the foreseeable future I think there are a couple of things that should happen....

Firstly, more people should attend. Now, that may be down to location, as TVP (Microsoft UK, Reading) isn't the easiest of places to get to, but MS kindly offer their place, so who's going to refuse it?

Secondly, more regular meetings. Things are moving pretty fast; people are coming out with new tips, tricks and ways of adapting and implementing the MS BI platform, so surely there are plenty more things that can be discussed. One look at a PASS event shows there are a hell of a lot of ideas out there.

I should do my bit as well, I suppose. I could make some turkey sandwiches to take; there might be some leftovers in the next month or so..!

Tuesday, October 17, 2006

Failure - Data Source Does not Exist..! - Update

I was asked to post the demo project I mentioned in the original post a little while ago, but unfortunately I had lost it. Luckily I was able to pull enough from memory to build it again and make it available for your pleasure. Whilst not embedded with premier coding standards, it certainly provides some direction on how this kind of thing is achievable.

The project is here, any questions then mail me at the usual address.

Apologies for the delay :)

Sunday, October 15, 2006

This Week I am Mostly Listening To...



Morning Runner - Wilderness is Paradise Now

Thursday, October 05, 2006

A Note From The Editor....

Just a little clarification here. I am not a professional dolphin trainer, nor am I an amateur one, although I did once swim with a dolphin in the Florida Keys and found it to be a quite unfulfilling experience, as the dolphin was in a pool that I would consider to be the dolphin equivalent of one of those Japanese capsule hotels.

The freelance dolphin trainer title comes from a time in my youth when life consisted of doing the things from this song below.

Thank you for your attention in this matter.

Editor out..!

Next Gen Blogging

I really wish I could blog more, but I find that the things I want to blog about tend to take a large chunk of time, so when things come along that make blogging a bit more streamlined, I'm first in the queue. Firstly, the blogging support in Word 2007 is pretty good, but I found it a little restrictive on occasion, so now I've switched to Windows Live Writer.

I'm not going to write anything about it but point you in the direction of this, where you can get it, and the video below. Direct from YouTube and embedded with a couple of clicks from Live Writer.

Enjoy....

Wednesday, October 04, 2006

Simple SSIS Logging - Part 3, Bad, Bad Rows..!

It's all well and good knowing that there has been a failure in your package which has led to a bleary-eyed support technician getting phoned at 3 in the morning, but sometimes it's not really necessary to get Barry (we'll call him Barry for the purposes of this post) out of bed. Your package may be bullet-proof and work perfectly, but the nature of data is that it will change and can quite often come back and bite you on the backside.

Take for example early arriving facts. When you start integrating a number of systems it is possible for things to start happening that don't conform to the world according to the system owner. Take the example of an in-house order system operating completely separately to the company's CRM system. During the analysis phase of the data warehousing project it was determined that CRM was the source of Customer ID; therefore that's where customers get created and that's where customers get maintained. But a new customer phones up and makes an order before the customer's details are completed in the CRM system, due to one thing or another. Your data warehouse is quite likely to be collecting that fact row before the relevant CRM data has been correctly populated and passed on to the customer dimension. So what happens in your SSIS package when looking up the Customer key for that dimension? Well, not a hell of a lot.

To be fair there are numerous techniques for dealing with early arriving facts. Some involve the population of the Customer ID and retrieval of the key so that when the correct dimension member arrives in the dimension table, its attributes are populated automatically. The problem is that things aren't always as clear cut as that. You have managed to capture the early arriving fact data and correctly allocate it to members in the Customer dimension, but when board reports run that specify a particular customer type, the data is incorrect, the numbers don't roll up, and for 24 hours you could be running around like a headless chicken trying to figure out where the missing data has gone. That is, until the next day, when the customer dimension member is correctly updated and things mysteriously fix themselves.

So where does logging fit into all of this? Well knowing that we have an early arriving fact would be a good idea. So we have to log the early arriving fact and let the data pass through to allow it into the fact table.
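A minimal sketch of that "log it and let it pass through" idea, using an in-memory customer lookup; the names and key scheme are illustrative, not the actual package design:

```python
# Business key -> surrogate key; the dimension as the lookup sees it.
customer_dim = {"CRM001": 1, "CRM002": 2}
early_arrivals_log = []

def lookup_customer_key(business_key: str) -> int:
    if business_key in customer_dim:
        return customer_dim[business_key]
    # Early arriving fact: create an inferred member so the fact row can
    # still load, and log the event instead of failing the package.
    new_key = max(customer_dim.values()) + 1
    customer_dim[business_key] = new_key
    early_arrivals_log.append(business_key)
    return new_key

print(lookup_customer_key("CRM001"))  # 1 - existing member
print(lookup_customer_key("CRM999"))  # 3 - inferred member, logged
print(early_arrivals_log)             # ['CRM999']
```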

This looks quite simple, but when you're trying to handle a number of dimensions, any of which could have early arriving facts, you can end up with a lot of data destinations. In addition there is the fact that each package could have a different error table at each dimension lookup, making logging, which should really be centralised and in a defined structure, a little reactive and difficult to deal with. Dealing with this may not be as difficult as it seems, though, and for the first time I'm going to mention a custom SSIS task that has been written with this in mind. A colleague of mine has written a component for just this type of situation. Whilst I would have liked to have completed all 3 parts of this Simple SSIS Logging piece without referring to any external code, this really does make life a lot easier. The full source code is available on the site and I recommend downloading it.

The Xmilfy component (to give it its officially christened name) will take a series of columns and convert them into a single XML column, which has some obvious benefits. In the above example the data is then written to a single table in my SSIS logging database with a data type of XML, allowing the data to be viewed in a number of ways. Extra information can be added into the data flow specifying where the error occurred, the number of rows, etc., again enabling detailed and useful reporting. Want to know the number of early arriving facts for a particular package, or even a particular dimension? Easily done using the data in the table. Finally, a warning can be raised using script to inform Barry, remember him, that some data problems occurred; but now it's a warning and can be handed over to the department in charge of wielding the large stick that is used to stop sales people logging sales before the customer is correctly created. Furthermore, the system can be automated to alert said department, taking Barry completely out of the loop. But wait a second, Barry is now worried about losing his job; luckily he's been learning SSIS in his spare time.
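The columns-to-XML idea itself is simple enough to sketch. This is not the component's actual output format, just an illustration of the transformation, assuming one XML value per row:

```python
from xml.sax.saxutils import escape

def columns_to_xml(row: dict) -> str:
    # Collapse a row's columns into one XML value; element names here are
    # taken straight from the column names.
    fields = "".join(f"<{k}>{escape(str(v))}</{k}>" for k, v in row.items())
    return f"<row>{fields}</row>"

print(columns_to_xml({"CustomerID": "CRM999", "OrderValue": 125.50}))
# <row><CustomerID>CRM999</CustomerID><OrderValue>125.5</OrderValue></row>
```

Writing the result to a single XML-typed column means one central log table can hold bad rows from any dimension lookup in any package.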

It’s ok people, Barry is fine…!

This Week I am Mostly Listening To...



Boy Kill Boy - Civilian

Friday, September 29, 2006

SQL Server 2005 Metadata

Hidden in the depths of Microsoft's useful download division in Redmond is a handy pair of utilities you may have discovered that analyse dependencies within and across SSIS and Analysis Services. The first tool, the Dependency Analyzer, examines the metadata contained within SSIS packages and Analysis Services objects, collating their inter-dependencies and writing the results to a pre-defined database structure.

The second tool, the Dependency Viewer, provides a graphical interface displaying relationships between the analysed objects.


The amount of data is considerable and has some quite powerful possibilities. There has been a distinct gap in the MS BI product set; dependency and data analysis are two key areas that their competitors eulogise about in large amounts. This appears to be the beginning of some gap plugging before the release of Katmai (the next version of SQL Server). In the meantime all the source code, examples and even some reporting interfaces are provided in the download.

I'm quite looking forward to plugging this tool into some of my end to end 2005 projects and looking at the results. Once I've got some good examples of how everything integrates together, I'll post some of the results.

Dynamic Partition Creation and Processing – Part 2

The first part of this post discussed how to identify whether a partition exists and how to create a partition dynamically. Although this post is a little on the teaching-grandmother-to-suck-eggs side, I thought it would be better to write it for completeness. Processing a partition can be performed using the exact same techniques that are used for dynamic partition creation, hence the suck-eggs comment.

First let’s look at the XMLA query for processing a partition similar to the one created in the previous post.

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Measure_Group_Name</MeasureGroupID>
        <PartitionID>200609</PartitionID>
      </Object>
      <Type>ProcessFull</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>

Note there's no mention of the partition name anywhere. The partition is processed using its ID, and that is all we are really interested in. Replacing just that value in the Source expression of an SSIS Analysis Services Execute DDL Task will point us at the correct partition for processing.

Of course, if the entire cube structure is built on the same partition naming strategy, then creating a single script to process all of this period's partitions is relatively easy. The added bonus of being able to process partitions in parallel can prove a huge advantage in reducing the time taken to update the cube on whatever schedule you have.
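Generating that single script is just a loop over the measure groups. A sketch (namespaces omitted for brevity; the function name and the idea of one same-ID partition per measure group are assumptions based on the naming strategy above):

```python
def process_batch(measure_groups, period: str) -> str:
    # Build one XMLA batch that processes this period's partition in every
    # measure group in parallel; element names follow the script above.
    processes = "".join(
        "<Process>"
        "<Object>"
        "<DatabaseID>Adventure Works DW</DatabaseID>"
        "<CubeID>Adventure Works DW</CubeID>"
        f"<MeasureGroupID>{mg}</MeasureGroupID>"
        f"<PartitionID>{period}</PartitionID>"
        "</Object>"
        "<Type>ProcessFull</Type>"
        "</Process>"
        for mg in measure_groups
    )
    return f"<Batch><Parallel>{processes}</Parallel></Batch>"

xmla = process_batch(["Internet Sales", "Reseller Sales"], "200609")
print(xmla.count("<Process>"))  # 2
```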

These examples are based on cubes structured with monthly or period-based partitions. Partition strategy is an important subject for both cube querying and processing performance. I'll leave that to another post, but I just wanted to state that employing a rules mechanism that, say, merges period-based partitions into a single year partition whilst creating new partitions on the fly does not need to be a massively complex process. The methods I've discussed in these two posts are probably some of the simpler ways of implementing this.

Tuesday, September 26, 2006

This Week I am Mostly Listening To...



The Young Knives - Voices of Animals and Men

Dynamic Partition Creation and Processing – Part 1

Processing cubes in SSIS is pretty simple if all you want to do is pick a partition and process it, but if that partition is time-based, which is quite likely, then the partition you process each day, week or month will change. So what are your options?

First, you can change your package each month, or use a variable with the partition name. Probably not the best of options, as someone is going to have to physically make the change each month or come up with some clever method of changing your variable using something like VBA, Excel, etc. Inefficient if you ask me, and defeating the object. Couple this with creating a year's worth of partitions, or creating them every time a new one is required, and the whole task becomes quite laborious.

So what else can be done? My personal favourite is to use dynamic XMLA, but this has a few pitfalls of its own you have to be aware of. First let's look at the XMLA script for creating a partition and some of the key tags you have to be aware of.

<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ParentObject>
    <DatabaseID>Adventure Works DW</DatabaseID>
    <CubeID>Adventure Works DW</CubeID>
    <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
  </ParentObject>
  <ObjectDefinition>
    <Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <ID>Internet_Sales_2001</ID>
      <Name>Internet_Sales_2001</Name>
      <Annotations>
        <Annotation>
          <Name>AggregationPercent</Name>
          <Value>20</Value>
        </Annotation>
      </Annotations>
      <Source xsi:type="QueryBinding">
        <DataSourceID>Adventure Works DW</DataSourceID>
        <QueryDefinition>SELECT * FROM [dbo].[FactInternetSales]
WHERE OrderDateKey &lt;= '184'</QueryDefinition>
      </Source>
      <StorageMode>Molap</StorageMode>
      <ProcessingMode>Regular</ProcessingMode>
      <Slice>[Date].[Calendar].[July 2006]</Slice>
      <ProactiveCaching>
        <SilenceInterval>-PT1S</SilenceInterval>
        <Latency>-PT1S</Latency>
        <SilenceOverrideInterval>-PT1S</SilenceOverrideInterval>
        <ForceRebuildInterval>-PT1S</ForceRebuildInterval>
        <AggregationStorage>MolapOnly</AggregationStorage>
        <Source xsi:type="ProactiveCachingInheritedBinding">
          <NotificationTechnique>Server</NotificationTechnique>
        </Source>
      </ProactiveCaching>
      <EstimatedRows>1013</EstimatedRows>
      <AggregationDesignID>AggregationDesign</AggregationDesignID>
    </Partition>
  </ObjectDefinition>
</Create>

Above is the create script of one of the partitions from the Adventure Works cube. There are some obvious elements in there, but the ones you need to be aware of when building your script dynamically are as follows:

ID – The most important one. Even though this is often the same as the name of the partition, there are a couple of things to remember. Firstly, they don't have to be the same; secondly, by default it's named after the partition name. Why is this important? When you first create a measure group in your Analysis Services project (in Visual Studio), the default partition is named after the measure group name, and so, subsequently, is the ID. The ID is the unique identifier of a partition within a measure group, so if your partitions are named MeasureGroup1_200608 and MeasureGroup1_200609, you may find the IDs are MeasureGroup1 and MeasureGroup1_200609, and processing a partition in XMLA is based on the partition ID.

The partition ID is something you can't change in an Analysis Services project; well, I haven't found a way yet, so in my little world it can't be changed..! The first thing I would do is create a duplicate partition in the cube design and name it what I would like the ID to be. Once created, you can change the name to whatever you like; the ID won't be changed.

The Name element becomes pretty arbitrary, but you still need to ensure that every partition you create has a different name; again, if you want it to be the same as the ID then name it accordingly.

The final two are the QueryDefinition and Slice. This is where you may have to step back and think about the overall design of your cubes and their dimension structures. For the purposes of this post I'm talking about partitions sliced by date, because in my experience that's how most cubes get sliced. Where it's suitable I try to make the key of the date dimension a meaningful value, i.e. 20060926 = 26th September 2006. Before we go into multiple discussions about whether this is a good or bad idea, stop: it's something I do, I know its pros and cons, but I've found it to be beneficial on a hell of a lot of occasions, this example being one of them.

My QueryDefinition value will be the query that forms the dataset my partition is processed from. Generally you'll see something with a where clause like "WHERE DateKey BETWEEN 20060900 AND 20060999" if the partition is query bound. If the partition is table bound, more than likely you're using views to distinguish the different data sets. Finally, the partition slice. Even though this is documented as being for information purposes only, there have been some well documented grey areas when it is not populated. My recommendation is to put a value in anyway, even if it is for information purposes. The slice value for, hmmmm, let's say the September 2006 partition in my cube would look something like "<Slice>[Date].[Calendar].[September 2006]</Slice>" or, ideally, "<Slice>[Date].[Calendar].&[200609]</Slice>". Using the key for the slice makes it clear how you need to be considering this kind of thing when designing your dimensions at the data mart as well as the OLAP stage, as we're making the month level of the dimension a critical value.
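The meaningful date key and the month partition's WHERE clause bounds both fall out of simple arithmetic. A sketch of the scheme described above (function names are mine):

```python
from datetime import date

def date_key(d: date) -> int:
    # Meaningful surrogate key: 26th September 2006 -> 20060926
    return d.year * 10000 + d.month * 100 + d.day

def month_key_range(year: int, month: int):
    # Bounds for a month partition's WHERE clause: BETWEEN yyyymm00 AND
    # yyyymm99 safely brackets every day key in the month.
    base = year * 10000 + month * 100
    return base, base + 99

print(date_key(date(2006, 9, 26)))  # 20060926
print(month_key_range(2006, 9))     # (20060900, 20060999)
```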

Looking at those key elements of the XMLA statement for creating a partition, we get something like this:

<ID>200609</ID>
<Name>Measure_Group_Name_200609</Name>
<QueryDefinition>
SELECT *
FROM FactTable
WHERE DateKey BETWEEN 20060900 AND 20060999
</QueryDefinition>
<Slice>[Date].[Calendar].&[200609]</Slice>

Each of the elements has the common value "200609", which can be replaced, allowing partitions to be created before processing; in this example, at month end. To actually execute this code within an SSIS package I populate the Source expression of an Analysis Services Execute DDL task. This can be done either via direct input or a variable (selected by the SourceType property). The expression takes the variable, or direct input XMLA, and evaluates it based on any parameters you need to input. The parameters in this case would replace any reference to 200609 to correctly create the new partition.
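The substitution the expression performs is nothing more than a find-and-replace of the period value. A stand-in sketch (the template and function name are illustrative, not the real package expression):

```python
from datetime import datetime

# Template XMLA with "200609" acting as the placeholder period.
template = (
    "<ID>200609</ID>"
    "<Name>Measure_Group_Name_200609</Name>"
    "<Slice>[Date].[Calendar].&amp;[200609]</Slice>"
)

def xmla_for_period(xmla: str, period: str, placeholder: str = "200609") -> str:
    # Swap every occurrence of the placeholder for the required period.
    return xmla.replace(placeholder, period)

# e.g. build October 2006's create script at month end.
print(xmla_for_period(template, datetime(2006, 10, 31).strftime("%Y%m")))
```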

A couple of things to remember when creating an expression of XMLA like this: the script will have to be in double quotes, and any double quotes within the script will have to be escaped with '\"'. That's a backslash and double quote.
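That quoting rule can be sketched as a tiny helper (the function name is mine; the escaping is the point):

```python
def to_ssis_expression(xmla: str) -> str:
    # Wrap the XMLA in double quotes and escape embedded double quotes
    # with a backslash, as the SSIS expression language requires.
    return '"' + xmla.replace('"', '\\"') + '"'

print(to_ssis_expression('<Source xsi:type="QueryBinding">'))
# "<Source xsi:type=\"QueryBinding\">"
```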

Finally, before you run the XMLA it would really be good practice to know if the partition already exists before you attempt to create it. AMO (Analysis Management Objects) is the answer to that. The script below, kindly mixed, matched and cribbed from those lovely Project REAL chaps, will populate an SSIS Boolean variable with the answer to whether the partition exists or not. Don't forget to add AMO as a reference in the SSIS script task.

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports Microsoft.AnalysisServices

Public Class ScriptMain

    Public Sub Main()

        Dim WellIsItThere As Boolean
        Dim ASConnection As ConnectionManager
        Dim ASServer As String
        Dim ASDatabase As String
        Dim ASCube As String
        Dim ASMeasureGrp As String
        Dim ASPartition As String

        ASPartition = Now().ToString("yyyyMM") ' This will give us "200609"
        ASConnection = Dts.Connections("AnalysisServer") ' The name of the connection manager that you're going to connect to in the package
        ASServer = CStr(ASConnection.Properties("ServerName").GetValue(ASConnection))
        ASDatabase = CStr(ASConnection.Properties("InitialCatalog").GetValue(ASConnection))
        ASCube = "Adventure Works" ' How you populate this is really up to you
        ASMeasureGrp = "Internet Sales" ' And again the population of this is up to you

        WellIsItThere = IsPartitionThere(ASServer, ASDatabase, ASCube, ASMeasureGrp, ASPartition)

        Dts.Variables("PartitionExists").Value = WellIsItThere
        Dts.TaskResult = Dts.Results.Success

    End Sub

    Public Function IsPartitionThere(ByVal sServer As String, _
                                     ByVal sDatabase As String, _
                                     ByVal sCube As String, _
                                     ByVal sMeasureGroup As String, _
                                     ByVal sPartition As String) As Boolean

        ' By default, we will assume that it isn't there.
        ' Only if we get all of the way to the end and everything is
        ' found will we set it true.
        Dim bIsPartitionThere As Boolean = False

        Dim oServer As New Microsoft.AnalysisServices.Server

        ' Connect to the server and start scanning down the object hierarchy
        oServer.Connect(sServer)

        Dim oDB As Database = oServer.Databases.FindByName(sDatabase)
        If oDB Is Nothing Then
            MsgBox("Did not find expected database: " & sDatabase, _
                   MsgBoxStyle.OkOnly, "Error looking for partition")
            GoTo Done
        Else
            Dim oCube As Cube = oDB.Cubes.FindByName(sCube)
            If oCube Is Nothing Then
                MsgBox("Did not find expected cube: " & sCube, _
                       MsgBoxStyle.OkOnly, "Error looking for partition")
                GoTo Done
            Else
                Dim oMG As MeasureGroup = oCube.MeasureGroups.FindByName(sMeasureGroup)
                If oMG Is Nothing Then
                    MsgBox("Did not find expected Measure Group: " & sMeasureGroup, _
                           MsgBoxStyle.OkOnly, "Error looking for partition")
                    GoTo Done
                Else
                    ' This is the real test -- to see if the partition is really there
                    Dim oPart As Partition = oMG.Partitions.Find(sPartition)
                    If Not oPart Is Nothing Then
                        ' This is the only place to set the value to TRUE
                        bIsPartitionThere = True
                    End If
                End If
            End If
        End If

Done:
        oServer.Disconnect() ' Disconnect from the server -- we are done

        Return bIsPartitionThere

    End Function

End Class

Monday, September 18, 2006

This Week I am Mostly Listening To...



Captain - This is Hazelville

Tuesday, September 12, 2006

Simple SSIS Logging - Part 2, Devil in the Detail

Ok so now you know when a package process starts, when it finishes and its final status (See Part 1). What we don’t have is the ability to accurately analyse the individual steps within a package and make effective decisions based on the information available.

First you will have to decide what you are looking to track within the system. Obviously errors are the most important element, as fast resolution is essential. Next should be warnings. The warnings generated within SSIS are key to identifying areas that may have performance issues. A common warning in SSIS is where a column is brought in from a data source but subsequently not used anywhere in the package. SSIS will raise a warning at every transformation informing you that these unused columns may hinder performance, although they do not affect the final outcome of the package.

For these examples we’re writing our log to a database so it’s necessary to have a structure to hold it. A nice simple structure should let you know the type of alert, timestamp, source task and any other descriptive information (such as error code and error description).

When you have the table structure you can write an insert statement, the most sensible option is of course a stored procedure with a number of parameters serving your needs, something like this;

usp_WriteSSISEvent @ServerName, @EventType, @SourcePackage, @SourceTask, @ErrorCode, @ErrorDescription, @EventTimeStamp
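As an illustrative sketch only — the table name, column names and data types here are my own guesses, not the actual schema from the example project — the log table and procedure might look something like this:

```sql
-- A minimal logging table: one row per event. The error columns are
-- nullable because only some event types (OnError, OnWarning) supply them.
CREATE TABLE dbo.SSISEventLog (
    EventLogID       int IDENTITY(1,1) PRIMARY KEY,
    ServerName       nvarchar(128)  NOT NULL,
    EventType        nvarchar(50)   NOT NULL,  -- e.g. OnError, OnWarning
    SourcePackage    nvarchar(255)  NOT NULL,
    SourceTask       nvarchar(255)  NOT NULL,
    ErrorCode        bigint         NULL,
    ErrorDescription nvarchar(max)  NULL,
    EventTimeStamp   datetime       NOT NULL
)
GO

CREATE PROCEDURE dbo.usp_WriteSSISEvent
    @ServerName       nvarchar(128),
    @EventType        nvarchar(50),
    @SourcePackage    nvarchar(255),
    @SourceTask       nvarchar(255),
    @ErrorCode        bigint        = NULL,
    @ErrorDescription nvarchar(max) = NULL,
    @EventTimeStamp   datetime
AS
INSERT INTO dbo.SSISEventLog
    (ServerName, EventType, SourcePackage, SourceTask,
     ErrorCode, ErrorDescription, EventTimeStamp)
VALUES
    (@ServerName, @EventType, @SourcePackage, @SourceTask,
     @ErrorCode, @ErrorDescription, @EventTimeStamp)
GO
```

Defaulting the error parameters to NULL means the same procedure can be called from event handlers that don't have an error code or description to pass.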

So how do these parameters get populated? Well, for the most control over what gets written and where (and it’s pretty simple to boot, hence the title), I’m first going to put an Execute SQL task into the OnError event handler. After setting the connection manager, set the SQL statement via an expression. If you’ve not done this before it’s a very useful thing to know about; to cut a long story short, it’s possible to set any number of task properties via an expression.

The expression will effectively be the constructed SQL statement with your parameters populated via a combination of system variables and direct input. Here’s the statement for an OnError event based on the above stored procedure example.

"EXECUTE [usp_WriteSSISEvent]
    @ServerName = '" + @[System::MachineName] + "'
    ,@EventType = '" + "OnError" + "'
    ,@SourcePackage = '" + @[System::PackageName] + "'
    ,@SourceTask = '" + @[System::SourceName] + "'
    ,@ErrorCode = " + (DT_WSTR, 15)@[System::ErrorCode] + "
    ,@ErrorDescription = '" + @[System::ErrorDescription] + "'
    ,@EventTimeStamp = '" + (DT_WSTR, 30)GETUTCDATE() + "'"

Yeah, I know it looks a bit of a mess but trust me, it works.

The pain with this method is that it’s not possible to populate @EventType via a system variable, meaning that you have to populate it manually for each event type you wish to record. The other gotcha is that the available system variables differ depending on the event type. For example, of the four I usually use, OnError, OnWarning, OnPostExecute and OnPreExecute, only errors and warnings have the error code and description. Something to be aware of.
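To illustrate that last point, here's a rough sketch of what the OnPostExecute version of the expression might look like. Since the error code and description variables aren't available in that event, I'm passing NULL for those two parameters — that's my own assumption about how the stored procedure would handle the missing values, not something from the example project.

```
"EXECUTE [usp_WriteSSISEvent]
    @ServerName = '" + @[System::MachineName] + "'
    ,@EventType = '" + "OnPostExecute" + "'
    ,@SourcePackage = '" + @[System::PackageName] + "'
    ,@SourceTask = '" + @[System::SourceName] + "'
    ,@ErrorCode = NULL
    ,@ErrorDescription = NULL
    ,@EventTimeStamp = '" + (DT_WSTR, 30)GETUTCDATE() + "'"
```

The only pieces that change between event handlers are the hard-coded event type string and whichever system variables that event actually exposes.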

If you want to see it working then I have an example project all ready to send, just mail me at mailto:steve@stevemchugh.com?subject=Simple%20SSIS%20Logging%20Part%20Deux and I’ll send you a copy. It has the database structure as a script so there’s not too much effort involved in having a look.