Thursday, May 26, 2005
Ouch...!
Now where are those asprin?
Sunday, May 22, 2005
Jesus Was a Carpenter..!
A friend of mine stayed over the weekend and as he is at one with the ways of the chippy, I assisted him completing a task at home that has been on my agenda for some weeks. It does make you wonder about the whole sitting on your backside for 10-12 hours a day looking at a screen, surely it can't be good for you.
After the whole IT bubble has burst and most of us geeks have been thrown to the Wolves it's either going to be a Yak farming job in Tibet or something practical like today. Until then........
Thursday, May 19, 2005
Bringing Balance to the Force
I've just seen the latest "film" in the Star Wars series. There are plenty of reviews out there, possibly even a non-biased one but here are my observations.
1. R2D2, what have they done to you. You now hop out of a ship like a demented jack rabbit with a blaster up your exhaust pipe. It's wrong, plain wrong.
2. Close your eyes and Hayden Christensen sounds like a character from Point Break. Bodacious master.
3. Solo, Yeah Han Solo, Remember him, no, never mind.
4. Take Endor, add a bit of water, pump the Ewoks full of steroids, you have Cheweys home planet.
5. Ian McDiarmid. This man is a god and by far the best thing in the George's "epics".
The subject will not be spoken of again........for a while.....maybe.
Sunday, May 15, 2005
Thursday, May 12, 2005
SQL PASS 2005 - New Cube Design Features - Part 1 - Dimensions
Analysis Services 2005 is way different. Not only in the way you develop the cubes but the way are eventually constructed, how you communicate with them but also how the end user will view them. Chris’ presentation focused on a subset of the new features focusing on the elements that are fundamental to pretty much every existing implementation of Analysis Services. Making a direct comparison to elements in the current version of AS is difficult but Chris communicated the relationships between the old and the new incredibly clearly.
To the features. The biggest change in the product is around the concept of dimensions. From a logical database perspective a dimension can be classified as a bunch of related attributes that when grouped form a table. This not exactly how a dimension behaves in Analysis Services 2000 as a dimension is essentially treated has a hierarchical object whether it contains single or multiple levels. You then add properties to the levels etc. for added informational value. In 2005 the structure of a dimension follows, in my personal opinion, a purer logical format. A dimension is treated as a collection of attributes and then presentation views of that dimension are separately applied as hierarchies. Separating the dimensions core attributes from the hierarchies is a great move meaning a tremendous amount of flexibility on how data can be presented to the user but without complicating the dimension relationship with a fact.
It’s going to interesting to see how the reporting client vendors deal with such a fundamental change to the cube structure. Now getting a hierarchy to display and getting a hierarchy to perform are just as important as they are now. This is achieved using our memory hungry little friends, member properties. The member properties are now the linkage between levels within a hierarchy. Chris’ example almost word for word was this; If there is a one to many relationship between attributes State and City, State will be a member property of City. A very important thing to remember is that the dimension wizard will not build most of these member properties meaning there is still plenty of work to do, certainly in this release, before a cube is what you may term as ‘production ready’.
I’ve decided I’m going to have to split these posts as they’re getting too large…..
SQL PASS 2005 - Reporting Services Enhancements
When people saw the first version of SQL Server 2000 Reporting Services the first thing said was probably something along the lines of "where's the MMC snap-in?". Visual Studio has a vastly better environment for this kind of development and thank the lord MS decided to continue this throughout the toolset (que sandals and beards being thrown from the old skool, burn him, BUUURRNN HIMMMM..!). Developers know what they are going to get with Visual Studio and people in the BI world and DBA are going to get more benefit from an intuitive environment that stabilises and speeds up dev time tremendously.
OK less on the environment and more on the presentation and product. Dave Wickert from the SQL Server product unit was the Microsoft employee working the crowd for this one. I managed to speak to him briefly on the evening before and when he said all of his demonstration were to based on OLAP data the words 'Christ' and 'Thank' may have slipped out. I’m never going to be able to go over all the enhancements but I can certainly give an overview of the key areas that will be big in this version.
Right, leave here and go and write some MDX for an OLAP data source then stick that in a matrix, after you’ve watched some paint dry look at your report. OK that’s not at all fair but the current version was not as responsive as I certainly would have liked when it comes to multi-dimensional data. Trying to get the correct data to appear depending on what axis you put the dimension then the subtleties of the order the dimensions were on the axis was a pain. On a scale of 1-10, 10 being child birth, where would you place MDX on multiple hierarchies for display in a flattened dataset? I may use that as an interview question.
2005 Reporting Services with 2005 Analysis Services……….fixed. How the new version of RS interacts with AS 2000 I shall be doing some investigation on.
The next thing I want to mention are the little things. Those featurettes that when Dave mentioned them brought a wave of positive emotion to the room. Multiple selections on parameters, column sorting by measure or header, server trip for drill down (need to get some confirmation on this one but it has massive implications in low bandwidth scenarios), deployment via SQL Management Studio, floating column headers and so on and so on. These are the areas where users have been listened to which in itself is very refreshing, and I know from experience that some of these features could have saved me days of development in recent times.
Lastly is a very big new feature which I was lucky enough to have a glance at early last year and that is the Report Builder. This new tool is driven from the pre-defined data model and deployed to the Report Server for the users. What does this mean? Well if you’re familiar with Business Objects Universes then think of it as similar, sort of. You take a model, create relationships, calculated fields, etc. and put it on the server. Bob from accounts doesn’t need to understand one record from the Invoice table has a one to many relationship with records in the Invoice Line table, he just sees that he can select Invoice and Invoice Line from a list and drag it on to a report layout of his choice.
The tool has a few more tricks up its’ sleeve and I’ll leave the statements “an understanding of the semantic model” and “infinite drill” for another time. Microsoft are quick to state that this is not a replacement for the plethora of reporting tools out there but an extension of the interface. Either way it should open Reporting Services to another user community which for Business Intelligence is nothing but good.
SQL PASS 2005 - SSIS Performance
The presentation from Grant Dickinson of Microsoft and Henk van der Valk of Unisys was a demonstration, from my perspective, not of the new features but the capabilities of the product in the area of scalability. SSIS likes memory and that was made particularly obvious by the charts and graphs we were shown that showed the 64bit server wiping the floor with the 32bit (as you would expect) and in some cases the 32bit server not even completing the process after hitting its 3Gb process limit.
Another key aspect of the improvements in performance is SSIS's ability to parallelise. You could always get DTS to do this but it was never pleasent. SSIS can now scale up incredibly well and again we were shown screenshots of the processor workload during a package run and it was certainly utilising whatever power it could get hold of.
note: I'm one of these people that tends to believe a screenshot rather than not (ever the optimist).
Any ETL fans out there are going to find plenty that is very familar in their world. The components are standard ETL fare (merge, lookup, etc.) but the one that floats my boat? Slowly Changing Dimension, oh yes. All they need to do now is get the thing cluster / scale out and the top end ETL tools will be well within Microsoft's sights.
Wednesday, May 11, 2005
Database snapshots
Database snapshots are very cool. I think the only problem with them is that the possible uses are so numerous that I don’t think people have quite decided yet what they are best for. One possibility is that they are going to be great for all conceivable uses but the danger is that they will get misused and then all kinds of things will happen. The technology used for database snapshots does not belong to SQL Server but Windows so there are a number of restrictions. The first thing to remember is that a database snapshot behaves like any other database but, of course, is read only. You can select from it, join its tables even restore a database to the point in the snapshot was taken which sounds great but then you think point in time copies of the database will be huge. Have a number of them and you server is going to run out of disk very quickly but if you take a snapshot of a database and then go and checkout the size of the data file in explorer you get a surprise. Even though the file states its size as, lets say, 2Gb, the same as the original data file of the database the snapshot was taken from, its size on disk will only be 64Kb.
Database snapshot use a copy-on-write mechanism via NTFS sparse files (NTFS, note the disk technology being used, very important).This means that the snapshot only contains the pages that have changed on the database. If I went and updated 100Mb of data in my database then my snapshot would increase in size by 100Mb due to the updated pages being copied into the snapshot. Also this is done before the data is committed so rolling back has no effect on what has been written to the snapshot. Very helpful and that is in its essence what a snapshot is, not a replacement for backups, not a way of retaining a copy of historical data for reporting purposes (although I can almost certainly see it getting used that way) but it is a good safety net when making data changes manually or even procedurally. The important thing is that it doesn’t promote lazy administration and hopefully the restrictions on this new feature (and there are plenty of them) should ensure that.
Table partitioning
After the partition function comes the partition scheme that maps the partitions (whose number is defined by the function) to the filegroups and database. When the partition function and scheme are in place then a partitioned table can be built in the following syntax;
CREATE TABLE <table_name>
(
<column_list>
) ON <partion_scheme>(<bound_column>)
The great thing is that you can re-use the partition function again on another table if that’s how you want to partition the table. For the majority of the OLAP work I do that is almost always going to be the case and I would partition my tables by time just as I would partition cubes I built by the time dimension.
Fast file initialisation
Primary, secondary data and log files on SQL Server 2005 can be initialised almost instantly allowing for fast execution of tasks such as adding files to databases or increasing the size of an existing data file. Traditionally files were initialised to grab disk from previously deleted files. These files were initialised by first filling them with zeros, now this doesn’t happen and disk content is overwritten as new data is applied to the files.
There are a couple of caveats though. Firstly the Windows account the SQL Server service is running under must either be an administrator of the machine or have SE_MANAGE_VOLUME_NAME special privilege assigned to it. Secondly and perhaps more of an issue is that fast file initialization is only available on Windows XP or 2003 server systems. There are a lot of features in SQL Server 2005 that only become usable when on a XP or 2003 operating system but it’s likely the users who will adopt SQL Server 2005 are the ones with the latest server technologies so hopefully this shouldn’t prove so much of an issue.
SQL PASS 2005 - DBA, The Next Generation
Although I haven't done any hardcore DBAing for a few years I couldn't resist going to see Kimbery's presentation especially with a reputation like hers. She didn't dissapoint and i don't think anyone went away without learning at least one very useful feature in SQL Server 2005.
For me I can pick out specific areas I know I'll be using in the next few weeks. I'm going to do a post on each in no particular order and go into a bit more detail (well try to go into a bit more than Kimberly if that's at all possible).
Wednesday, May 04, 2005
Blogging on the Rails
Hmmmm. Saturated Fats..!
Anyway the cube is built and the first of several reports is complete all in one neatly packaged solution. Processing speed is good and I've managed to get myself over a million rows of decent demo data thanks to a great little tool called the Advanced Data Generator. I've used it in the past and it has always delivered what I needed so take note if you're stuck for several million rowsd of data.
Tuesday, May 03, 2005
Art of the Referentially Intact
Hopefully I would have now cleansed the data to a point at which it can't fail but it severly reduces what I can demonstrate. Lets see what spanners await me when I try to build some Reporting Services reports off of it.....!