Thoughts of a Freelance Dolphin Trainer: May 2005

Thursday, May 26, 2005

Ouch...!

Liverpool (whom I am not a regular fan of) won the Champions League final in what I must admit was pretty spectacular circumstances last night. I watched it in a pub and to say I feel pretty spectacular this morning would be wildly off target.

Now where are those asprin?

Sunday, May 22, 2005

Jesus Was a Carpenter..!

I have spent most of my day today with MDF (not that sort of MDF) drills and nails. Lots of hard work and something presentable in the end. Nothing special, just boxing in some pipes at home but I actually quite enjoyed it.
A friend of mine stayed over the weekend and as he is at one with the ways of the chippy, I assisted him completing a task at home that has been on my agenda for some weeks. It does make you wonder about the whole sitting on your backside for 10-12 hours a day looking at a screen, surely it can't be good for you.
After the whole IT bubble has burst and most of us geeks have been thrown to the Wolves it's either going to be a Yak farming job in Tibet or something practical like today. Until then........

Thursday, May 19, 2005

Bringing Balance to the Force

The eternal debate. Was Darth Vadar the chosen one and did he bring balance to the force. Personally I am of the opinion that he did, he just did it 20 odd years too late for the republic and Obi Wan's liking.

I've just seen the latest "film" in the Star Wars series. There are plenty of reviews out there, possibly even a non-biased one but here are my observations.

1. R2D2, what have they done to you. You now hop out of a ship like a demented jack rabbit with a blaster up your exhaust pipe. It's wrong, plain wrong.

2. Close your eyes and Hayden Christensen sounds like a character from Point Break. Bodacious master.

3. Solo, Yeah Han Solo, Remember him, no, never mind.

4. Take Endor, add a bit of water, pump the Ewoks full of steroids, you have Cheweys home planet.

5. Ian McDiarmid. This man is a god and by far the best thing in the George's "epics".

The subject will not be spoken of again........for a while.....maybe.

Sunday, May 15, 2005

This Week I am Mostly Listening To...

Hot Hot Heat - Make Up The Breakdown

Thursday, May 12, 2005

SQL PASS 2005 - New Cube Design Features - Part 1 - Dimensions

The next session I visited was based around new cube design features in Analysis Services 2005 presented by the MDX guru that is Chris Webb. Now I’m one of these people who never initially looked at the instructions preferring to get started and find my own way around a new product. This has been a trait of mine since I was a kid with things like Transformers (robots in disguise for the less well informed) and other such youthful pursuits. Just over a year ago after first launching the development environment for Analysis Services and taking my first tentative steps around the cube design options I had to re-think that attitude a touch.

Analysis Services 2005 is way different. Not only in the way you develop the cubes but the way are eventually constructed, how you communicate with them but also how the end user will view them. Chris’ presentation focused on a subset of the new features focusing on the elements that are fundamental to pretty much every existing implementation of Analysis Services. Making a direct comparison to elements in the current version of AS is difficult but Chris communicated the relationships between the old and the new incredibly clearly.

To the features. The biggest change in the product is around the concept of dimensions. From a logical database perspective a dimension can be classified as a bunch of related attributes that when grouped form a table. This not exactly how a dimension behaves in Analysis Services 2000 as a dimension is essentially treated has a hierarchical object whether it contains single or multiple levels. You then add properties to the levels etc. for added informational value. In 2005 the structure of a dimension follows, in my personal opinion, a purer logical format. A dimension is treated as a collection of attributes and then presentation views of that dimension are separately applied as hierarchies. Separating the dimensions core attributes from the hierarchies is a great move meaning a tremendous amount of flexibility on how data can be presented to the user but without complicating the dimension relationship with a fact.

It’s going to interesting to see how the reporting client vendors deal with such a fundamental change to the cube structure. Now getting a hierarchy to display and getting a hierarchy to perform are just as important as they are now. This is achieved using our memory hungry little friends, member properties. The member properties are now the linkage between levels within a hierarchy. Chris’ example almost word for word was this; If there is a one to many relationship between attributes State and City, State will be a member property of City. A very important thing to remember is that the dimension wizard will not build most of these member properties meaning there is still plenty of work to do, certainly in this release, before a cube is what you may term as ‘production ready’.

I’ve decided I’m going to have to split these posts as they’re getting too large…..

SQL PASS 2005 - Reporting Services Enhancements

Ah Reporting Services. Both a blessing and curse to me in recent times. The first thing a customer always asks me to do with this tool is something it cannot do out of the box. This isn't a problem really but it can be time consuming when you're creating your suite of reports for a client (say 2 dozen) and they decide that they really would like to be able to sort one of the columns (at the end of development). Fortunately Microsoft has taken a hell of a lot on board since the first version of Reporting Services saw the light of day and features that made this an incredibly powerful reporting client have been enhanced 2 fold. Firstly as an improved product in its own right but secondly when combined with 2005 Analysis Services one of the best Microsoft OLAP client reporting tools on the market.

When people saw the first version of SQL Server 2000 Reporting Services the first thing said was probably something along the lines of "where's the MMC snap-in?". Visual Studio has a vastly better environment for this kind of development and thank the lord MS decided to continue this throughout the toolset (que sandals and beards being thrown from the old skool, burn him, BUUURRNN HIMMMM..!). Developers know what they are going to get with Visual Studio and people in the BI world and DBA are going to get more benefit from an intuitive environment that stabilises and speeds up dev time tremendously.

OK less on the environment and more on the presentation and product. Dave Wickert from the SQL Server product unit was the Microsoft employee working the crowd for this one. I managed to speak to him briefly on the evening before and when he said all of his demonstration were to based on OLAP data the words 'Christ' and 'Thank' may have slipped out. I’m never going to be able to go over all the enhancements but I can certainly give an overview of the key areas that will be big in this version.

Right, leave here and go and write some MDX for an OLAP data source then stick that in a matrix, after you’ve watched some paint dry look at your report. OK that’s not at all fair but the current version was not as responsive as I certainly would have liked when it comes to multi-dimensional data. Trying to get the correct data to appear depending on what axis you put the dimension then the subtleties of the order the dimensions were on the axis was a pain. On a scale of 1-10, 10 being child birth, where would you place MDX on multiple hierarchies for display in a flattened dataset? I may use that as an interview question.

2005 Reporting Services with 2005 Analysis Services……….fixed. How the new version of RS interacts with AS 2000 I shall be doing some investigation on.

The next thing I want to mention are the little things. Those featurettes that when Dave mentioned them brought a wave of positive emotion to the room. Multiple selections on parameters, column sorting by measure or header, server trip for drill down (need to get some confirmation on this one but it has massive implications in low bandwidth scenarios), deployment via SQL Management Studio, floating column headers and so on and so on. These are the areas where users have been listened to which in itself is very refreshing, and I know from experience that some of these features could have saved me days of development in recent times.

Lastly is a very big new feature which I was lucky enough to have a glance at early last year and that is the Report Builder. This new tool is driven from the pre-defined data model and deployed to the Report Server for the users. What does this mean? Well if you’re familiar with Business Objects Universes then think of it as similar, sort of. You take a model, create relationships, calculated fields, etc. and put it on the server. Bob from accounts doesn’t need to understand one record from the Invoice table has a one to many relationship with records in the Invoice Line table, he just sees that he can select Invoice and Invoice Line from a list and drag it on to a report layout of his choice.

The tool has a few more tricks up its’ sleeve and I’ll leave the statements “an understanding of the semantic model” and “infinite drill” for another time. Microsoft are quick to state that this is not a replacement for the plethora of reporting tools out there but an extension of the interface. Either way it should open Reporting Services to another user community which for Business Intelligence is nothing but good.

SQL PASS 2005 - SSIS Performance

As I'm sure you're aware by now SQL Server Integration Services (SSIS) is the new version of Data Transformation Services in SQL Server 2005. This is not a simple upgrade but a complete re-architecting of the product bringing it almost in line with other enterprise ETL tools.
The presentation from Grant Dickinson of Microsoft and Henk van der Valk of Unisys was a demonstration, from my perspective, not of the new features but the capabilities of the product in the area of scalability. SSIS likes memory and that was made particularly obvious by the charts and graphs we were shown that showed the 64bit server wiping the floor with the 32bit (as you would expect) and in some cases the 32bit server not even completing the process after hitting its 3Gb process limit.
Another key aspect of the improvements in performance is SSIS's ability to parallelise. You could always get DTS to do this but it was never pleasent. SSIS can now scale up incredibly well and again we were shown screenshots of the processor workload during a package run and it was certainly utilising whatever power it could get hold of.
note: I'm one of these people that tends to believe a screenshot rather than not (ever the optimist).
Any ETL fans out there are going to find plenty that is very familar in their world. The components are standard ETL fare (merge, lookup, etc.) but the one that floats my boat? Slowly Changing Dimension, oh yes. All they need to do now is get the thing cluster / scale out and the top end ETL tools will be well within Microsoft's sights.

Wednesday, May 11, 2005

Database snapshots

Database snapshots are very cool. I think the only problem with them is that the possible uses are so numerous that I don’t think people have quite decided yet what they are best for. One possibility is that they are going to be great for all conceivable uses but the danger is that they will get misused and then all kinds of things will happen. The technology used for database snapshots does not belong to SQL Server but Windows so there are a number of restrictions. The first thing to remember is that a database snapshot behaves like any other database but, of course, is read only. You can select from it, join its tables even restore a database to the point in the snapshot was taken which sounds great but then you think point in time copies of the database will be huge. Have a number of them and you server is going to run out of disk very quickly but if you take a snapshot of a database and then go and checkout the size of the data file in explorer you get a surprise. Even though the file states its size as, lets say, 2Gb, the same as the original data file of the database the snapshot was taken from, its size on disk will only be 64Kb.

Database snapshot use a copy-on-write mechanism via NTFS sparse files (NTFS, note the disk technology being used, very important).This means that the snapshot only contains the pages that have changed on the database. If I went and updated 100Mb of data in my database then my snapshot would increase in size by 100Mb due to the updated pages being copied into the snapshot. Also this is done before the data is committed so rolling back has no effect on what has been written to the snapshot. Very helpful and that is in its essence what a snapshot is, not a replacement for backups, not a way of retaining a copy of historical data for reporting purposes (although I can almost certainly see it getting used that way) but it is a good safety net when making data changes manually or even procedurally. The important thing is that it doesn’t promote lazy administration and hopefully the restrictions on this new feature (and there are plenty of them) should ensure that.

Table partitioning

I’m never going to explain this subject in more detail than the experts so I might as well go the other way on this and make it brief. In SQL Server 2000 the concept of partitioned views was introduced which along with the introduction of INSTEAD OF triggers gave users the ability to effectively deliver data into separate tables whilst retaining a single point of entry. If I cast my mind back a few years I think this was the same technique used to get SQL Server on top of the TPC benchmarks. In SQL Server 2005 you now have partitioned tables and the generation and usage around it is pretty funky to be honest. Partitioned tables are created in three stages. First is the creation of the partition function which effectively is a statement of the data range and definition of the boundaries within the range (e.g. before Jan, Jan, Feb, Mar, After Mar). Key to the statement are the LEFT or RIGHT designation which rather than attempt to badly clarify its nuances there’s an excellent posting on Kimberly Tripp’s Blog that delivers a lot of useful information on this subject.
After the partition function comes the partition scheme that maps the partitions (whose number is defined by the function) to the filegroups and database. When the partition function and scheme are in place then a partitioned table can be built in the following syntax;

CREATE TABLE <table_name>
(
<column_list>
) ON <partion_scheme>(<bound_column>)

The great thing is that you can re-use the partition function again on another table if that’s how you want to partition the table. For the majority of the OLAP work I do that is almost always going to be the case and I would partition my tables by time just as I would partition cubes I built by the time dimension.

Fast file initialisation

Historically creating database files has always been a chore. Cast you mind back to SQL Server 6.5 and without vigilant monitoring (AKA a DBA doing their job correctly) you would get bitten on the backside when either the database or transaction log device filled up. There were some techniques for avoiding this but plenty of people didn't bother and just manually extended. Enter the realms of the modern day versions of SQL Server where this had been made a lot easier but the performance impact of a database expanding itself by 10% or X amount of Mb throughout a long hungry process became apparent so for critical systems you would end up doing similar things to what was done on 6.5. Now 2005.
Primary, secondary data and log files on SQL Server 2005 can be initialised almost instantly allowing for fast execution of tasks such as adding files to databases or increasing the size of an existing data file. Traditionally files were initialised to grab disk from previously deleted files. These files were initialised by first filling them with zeros, now this doesn’t happen and disk content is overwritten as new data is applied to the files.
There are a couple of caveats though. Firstly the Windows account the SQL Server service is running under must either be an administrator of the machine or have SE_MANAGE_VOLUME_NAME special privilege assigned to it. Secondly and perhaps more of an issue is that fast file initialization is only available on Windows XP or 2003 server systems. There are a lot of features in SQL Server 2005 that only become usable when on a XP or 2003 operating system but it’s likely the users who will adopt SQL Server 2005 are the ones with the latest server technologies so hopefully this shouldn’t prove so much of an issue.

SQL PASS 2005 - DBA, The Next Generation

Pre-conference Seminar - The next generation DBA with Kimberly Tripp.
Although I haven't done any hardcore DBAing for a few years I couldn't resist going to see Kimbery's presentation especially with a reputation like hers. She didn't dissapoint and i don't think anyone went away without learning at least one very useful feature in SQL Server 2005.
For me I can pick out specific areas I know I'll be using in the next few weeks. I'm going to do a post on each in no particular order and go into a bit more detail (well try to go into a bit more than Kimberly if that's at all possible).

Wednesday, May 04, 2005

Blogging on the Rails

Finally I have the ability to blog on the move. This has traditionally been the reason for my more sporadic attempts at blogging over recent times but thanks to Pocket Blogger I can now let the world entertain my wildest (?) thoughts.

Hmmmm. Saturated Fats..!

Sustaining oneself on fast food is really not the best of ideas when you're trying to master complex feats of development but then again Starbucks wasn't open so I was kind of restricted to any of the vast selection of neon lit establishments around Liverpool street.
Anyway the cube is built and the first of several reports is complete all in one neatly packaged solution. Processing speed is good and I've managed to get myself over a million rows of decent demo data thanks to a great little tool called the Advanced Data Generator. I've used it in the past and it has always delivered what I needed so take note if you're stuck for several million rowsd of data.

Tuesday, May 03, 2005

Art of the Referentially Intact

Clicking a wizard should be the easiest thing in the world. One push, out comes your desired result. Not tonight. I'm having great difficulty building a demo SQL Server 2005 Analysis Services cube because the data I've been given has very little integrity. So what does this mean for me? It means I'm missing the champions league semi final between Liverpool and Chelsea. It means I'm getting hungrier by the second and will probably have to consume a large amount of processed food on the way home and it also means I'm losing the feeling in my fingers due to the over active air conditioning in my office.
Hopefully I would have now cleansed the data to a point at which it can't fail but it severly reduces what I can demonstrate. Lets see what spanners await me when I try to build some Reporting Services reports off of it.....!

Sunday, May 01, 2005

This Week I am Mostly Listening To...

Bloc Party - Silent Alarm

Thoughts of a Freelance Dolphin Trainer

Thursday, May 26, 2005

Ouch...!

Sunday, May 22, 2005

Jesus Was a Carpenter..!

Thursday, May 19, 2005

Bringing Balance to the Force

Sunday, May 15, 2005

This Week I am Mostly Listening To...

Thursday, May 12, 2005

SQL PASS 2005 - New Cube Design Features - Part 1 - Dimensions

SQL PASS 2005 - Reporting Services Enhancements

SQL PASS 2005 - SSIS Performance

Wednesday, May 11, 2005

Database snapshots

Table partitioning

Fast file initialisation

SQL PASS 2005 - DBA, The Next Generation

Wednesday, May 04, 2005

Blogging on the Rails

Hmmmm. Saturated Fats..!

Tuesday, May 03, 2005

Art of the Referentially Intact

Sunday, May 01, 2005

This Week I am Mostly Listening To...

About Me

Twitter Updates

Twitter Updates

Categories

BI Links

Other Links

Blog Archive

Thursday, May 26, 2005

Sunday, May 22, 2005

Thursday, May 19, 2005

Sunday, May 15, 2005

Thursday, May 12, 2005

Wednesday, May 11, 2005

Wednesday, May 04, 2005

Tuesday, May 03, 2005

Sunday, May 01, 2005

Subscribe

About Me

Twitter Updates

Twitter Updates

Categories

BI Links

Other Links

Blog Archive