Friday, September 30, 2005
A Storms a Brewing
So what more can I say but watch this space............
Tuesday, August 23, 2005
Back to the slog
You've got a dimension, say a time dimension that has levels of fiscal year, fiscal period and fiscal year. You have 2 cubes, one that looks at data at week level and another looking at period level for one reason or another. On many occasions I've seen a second time dimension added for the period level cube (or similar scenario) when all that needs to be done is disable the fiscal week level in the time dimnesion in the period cube.
It's pretty easy to find, in the cube editor open up the dimnesion whose level you want to disable, highlight the level and in the advanced settings set 'Disabled' to Yes and there you have it, one less dimension, member properties and performance hit to worry about.
Tuesday, August 16, 2005
Monday, August 01, 2005
Normal Service Has Been Resumed
Wednesday, July 27, 2005
Monday, July 18, 2005
Wednesday, July 13, 2005
Tuesday, July 12, 2005
2D 3D
To try and make my life a little easier I decided to spend a little time working on a way of easily tweaking each report to minimise the amount of work necessary to complete the report suite.
Bit of background, the report I'm working on has 3 alternate hierarchies in a store dimension which need to be drillable (whilst displaying different subreports at each differing level of each alternate hierarchy). On another axis is a crossjoin of a subset of the measures and what could be best compared to as a product dimension. A better explanation maybe would be to try and visualise the report....
Standard matrix with the first row group displaying a list of the children of the current store hierarchy parameter value. Row group 2 is a list of the children of current product hierarchy parameter unioned with a list of measures which effectively gives you the sales value for product sub categories A B and C and then things like number of stores and number of units sold for the parent category of A B and C.
Finally the column group was driven from a named set containing the previous 14 fiscal periods. As I previously said the report was drillable by store but also by product. This meant that a single report to the user was in fact built up of 40 reports linked and passing parameters between themselves to cater for every level in every dimension multiplied by 3 due to the alternate hierarchies.
What I ended up doing was removing the 'database' fields from the matrix and replacing them with calculated fields meaning I didn't really have to touch the matrix for the changes but only the calculated field for each separate report.
This may seem like going the long way around but for this requirement and this set of data this was quite literally the only solution and meant I was producing reports in several minutes along with thorough testing. Anyway I just thought I would share that one.
Monday, July 11, 2005
Thursday 7th July
After so many highs it's difficult to conceive how low things can get. Much as I doubt that anyone this statement is directed at will ever see this I just wanted to give my condolences to all the friends and family of the people whose lives were lost because of Thursday’s event.
Monday, July 04, 2005
Sunday, June 26, 2005
Thursday, June 23, 2005
MIssing a Trick
I know of at least two cases where Microsoft were in a good position to implement a BI solution in companies that were effectively wall to wall Oracle and failed due to their insistence on predominately pushing the database server and not the other tools in the SQL Server suite. Think about it, I have no centralised reporting, no in house ability to perform slice and dice analysis but what I do have is a shed load of Oracle Systems and the DBA’s and administrators that go with it. Even if the data warehouse is a new build you’re not going to ignore the skills you have in house but I think in these cases Microsoft has.
An enterprise Analysis Services license is going to cost the same as the SQL Server database engine so push the market leading OLAP product and get the money that way. Maybe try pushing a great reporting tool like Reporting Services in addition to Analysis Services, an even more attractive proposition. If the organisation is open to bringing in Microsoft server tools but Oracle is the database of choice, market yourselves in that company by first showing what that product can do. This is probably going to be more the case when 2005 is releases and SSIS comes into the fold.
In the meantime, where the situation arises that Oracle / Microsoft mixed implementations are the best for an organisations particular situation I will carry on recommending it until something changes my mind.
Tuesday, June 21, 2005
Project '4' REAL
There's a couple of sites, one is the original technical overview here by Len Wyatt and now the Project REAL website which is going to be an ever expanding information resource on the progress of the project. There is already a link to a piece on SSIS detailing lessons learnt during the ETL design process and is probably essential viewing for anyone with an interest in the ETL space.
I'm lead to believe there will be a number of webcasts as well as the content available on the site which will increase ramping up more and more the closer it gets to the 2005 launch date (November currently and looking certain) so I'm sure there will be something for everyone to be interested in but if you're looking for reporting services info expect it to be towards the end of the project :)
Thursday, June 16, 2005
Diet Projects
Good tool this, very good and with the help of an expert I know who I’m trying to convince to start blogging it’s going quite well. But, there’s always a “but” pulling Oracle data out with SSIS has proved trickier than it shouldn’t have been. Yes I know I’ve cut a few corners to speed the process up but numerical data appears to be coming out as numeric(38,4) and trying to any data conversion to other numerical data using the derived column component is causing it to fail consistently. I’ve managed to get it working but I have to do something pretty evil and convert the data. First to a string in the select statement then back into decimal in the derived column component, the overhead is obvious but I should only have to do this once and when I’m all done I think I’ll reinvestigate all the little irritations I’m finding.
I'm sure i'm just missing something that a little preperation couldn't solve but it's still a somewhat odd.
Wednesday, June 15, 2005
Monday, June 06, 2005
Service Pack 4 Weirdness
The server is sitting on version 8.0.2026.0 msolap80.dll (This I believe is the beta for service pack 4 but the behaviours here have been recreated in the release version of service pack 4 which I’m told is 8.0.2029). Two clients one on service pack 3a (8.0.760) and the other on service pack 4 release.
Create a calculated member in the Warehouse cube on the Foodmart 2000 database called warehouse count using…
Distinctcount([Warehouse].[Warehouse Name].members)
No problems, you will be able to get the correct values out on either version of the clients msolap80.dll. Import the calculated member or recreate it on the virtual cube built from the Warehouse cube that has just the dimensions and measures copied in from the physical.
Look at the calculated member in the virtual cube with a client on 8.0.2026 and you get nothing returned, look at it with a service pack 3a client you will get the correct value. Now this may not seem like much of a problem when you haven’t or have no intention of deploying service pack 4 PTSlite to clients but if you have Analysis and Reporting Services pulling data out of the cubes on the same machine then you will be using Service Pack 4 PTS.
This has caused me a lot of problems and even to a certain extent caused me to have to redesign some of the cubes. We could role back the service pack but in this situation the querying and processing stability issues resolved were too important to overlook. Suffice to say we are now building VMWare images to allow us to test this kind of deployment in future without actually deploying.
If anyone else gets a different result or has any similar experiences then I would be really interested to hear about them.
Wednesday, June 01, 2005
New Cube Design Features - Part 2 - More on Hierarchies
The last thing I mentioned was around relationships between attributes in hierarchies being cemented with member properties. 2005 also understands transitive relationships, again plagiarising Chris’ presentation and example; If there is a one to many relationship between City and Store and City is a member property of Store, Analysis Services automatically infers the relationship of State and Store, Clever huh?
Member properties can express two types of relationship, rigid and flexible. Rigid relationships will retain aggregations when the dimension is updated and is used for relationships between members that do not change over time. Flexible relationships can change over time such as in a slowly changing dimension. The point to remember and what makes this relationship type a little more critical is that if you mark a dimension as rigid and there are changes then Analysis Services will error during processing similar to when a member changes parent in a non changing Analysis Services 2000 dimension.
The structure of hierarchies has changed quite a bit also. Traditionally I’ve worked on projects that have made heavy use of alternate hierarchies. There’s never been anything particular alternate about them apart from the naming, nothing that distinguishes them from any other dimension with some applications allowing you to slice by more than a single alternate hierarchy at one time. Analysis Services is capable of opening up all attributes as hierarchies that would be usable from a client. This may sound great but obviously becomes impractical when attributes such as telephone number or house number are included.
Attributes have a setting called the AttributeHierarchyEnabled property. Setting this to true means that the attribute will be available as a hierarchy on the dimension with an ‘All’ level and its members under that. Setting it to False makes the attribute behave in same manner as a traditional member property. There are many other important settings for attributes, for example if you want to add an attribute as a level to multilevel hierarchies, but you do not want users to see the attribute hierarchy in client applications, set AttributeHierarchyEnabled to True and set AttributeHierarchyVisible to False. These settings prevent users from browsing an attribute hierarchy without disabling it. The AttributeHierarchyVisible setting is ignored if AttributeHierarchyEnabled is set to False.
An attribute hierarchy will have an ‘All’ level if the IsAggregatable property is set to True and can also be added to any level of a hierarchy. If it’s set to False then it can only be used at the root of a hierarchy. Settings on the AttributeHierarchyOptimizedState property determine whether additional indexes are built on the attribute to improve query performance.
Many of the concepts within Analysis Services 2005 user and attribute hierarchies are going to be relatively easy to grasp for anyone involved with implementations on previous versions of the product. Personally I think that it should be easier to understand how to build dimensions that perform and emulate the behaviour of the business for someone starting out in Microsoft OLAP technologies. It’s been too easy in the past to build a cube and just ignore the advanced settings then let it run like a dog as it’s still faster than trying to write SQL that traverses multiple levels in a hierarchy. You can still build poor structures but 2005 seems to be a lot clearer in reasons why and the consequences if you don’t understand the additional properties of a dimension.
Thursday, May 26, 2005
Ouch...!
Now where are those asprin?
Sunday, May 22, 2005
Jesus Was a Carpenter..!
A friend of mine stayed over the weekend and as he is at one with the ways of the chippy, I assisted him completing a task at home that has been on my agenda for some weeks. It does make you wonder about the whole sitting on your backside for 10-12 hours a day looking at a screen, surely it can't be good for you.
After the whole IT bubble has burst and most of us geeks have been thrown to the Wolves it's either going to be a Yak farming job in Tibet or something practical like today. Until then........
Thursday, May 19, 2005
Bringing Balance to the Force
I've just seen the latest "film" in the Star Wars series. There are plenty of reviews out there, possibly even a non-biased one but here are my observations.
1. R2D2, what have they done to you. You now hop out of a ship like a demented jack rabbit with a blaster up your exhaust pipe. It's wrong, plain wrong.
2. Close your eyes and Hayden Christensen sounds like a character from Point Break. Bodacious master.
3. Solo, Yeah Han Solo, Remember him, no, never mind.
4. Take Endor, add a bit of water, pump the Ewoks full of steroids, you have Cheweys home planet.
5. Ian McDiarmid. This man is a god and by far the best thing in the George's "epics".
The subject will not be spoken of again........for a while.....maybe.
Sunday, May 15, 2005
Thursday, May 12, 2005
SQL PASS 2005 - New Cube Design Features - Part 1 - Dimensions
Analysis Services 2005 is way different. Not only in the way you develop the cubes but the way are eventually constructed, how you communicate with them but also how the end user will view them. Chris’ presentation focused on a subset of the new features focusing on the elements that are fundamental to pretty much every existing implementation of Analysis Services. Making a direct comparison to elements in the current version of AS is difficult but Chris communicated the relationships between the old and the new incredibly clearly.
To the features. The biggest change in the product is around the concept of dimensions. From a logical database perspective a dimension can be classified as a bunch of related attributes that when grouped form a table. This not exactly how a dimension behaves in Analysis Services 2000 as a dimension is essentially treated has a hierarchical object whether it contains single or multiple levels. You then add properties to the levels etc. for added informational value. In 2005 the structure of a dimension follows, in my personal opinion, a purer logical format. A dimension is treated as a collection of attributes and then presentation views of that dimension are separately applied as hierarchies. Separating the dimensions core attributes from the hierarchies is a great move meaning a tremendous amount of flexibility on how data can be presented to the user but without complicating the dimension relationship with a fact.
It’s going to interesting to see how the reporting client vendors deal with such a fundamental change to the cube structure. Now getting a hierarchy to display and getting a hierarchy to perform are just as important as they are now. This is achieved using our memory hungry little friends, member properties. The member properties are now the linkage between levels within a hierarchy. Chris’ example almost word for word was this; If there is a one to many relationship between attributes State and City, State will be a member property of City. A very important thing to remember is that the dimension wizard will not build most of these member properties meaning there is still plenty of work to do, certainly in this release, before a cube is what you may term as ‘production ready’.
I’ve decided I’m going to have to split these posts as they’re getting too large…..
SQL PASS 2005 - Reporting Services Enhancements
When people saw the first version of SQL Server 2000 Reporting Services the first thing said was probably something along the lines of "where's the MMC snap-in?". Visual Studio has a vastly better environment for this kind of development and thank the lord MS decided to continue this throughout the toolset (que sandals and beards being thrown from the old skool, burn him, BUUURRNN HIMMMM..!). Developers know what they are going to get with Visual Studio and people in the BI world and DBA are going to get more benefit from an intuitive environment that stabilises and speeds up dev time tremendously.
OK less on the environment and more on the presentation and product. Dave Wickert from the SQL Server product unit was the Microsoft employee working the crowd for this one. I managed to speak to him briefly on the evening before and when he said all of his demonstration were to based on OLAP data the words 'Christ' and 'Thank' may have slipped out. I’m never going to be able to go over all the enhancements but I can certainly give an overview of the key areas that will be big in this version.
Right, leave here and go and write some MDX for an OLAP data source then stick that in a matrix, after you’ve watched some paint dry look at your report. OK that’s not at all fair but the current version was not as responsive as I certainly would have liked when it comes to multi-dimensional data. Trying to get the correct data to appear depending on what axis you put the dimension then the subtleties of the order the dimensions were on the axis was a pain. On a scale of 1-10, 10 being child birth, where would you place MDX on multiple hierarchies for display in a flattened dataset? I may use that as an interview question.
2005 Reporting Services with 2005 Analysis Services……….fixed. How the new version of RS interacts with AS 2000 I shall be doing some investigation on.
The next thing I want to mention are the little things. Those featurettes that when Dave mentioned them brought a wave of positive emotion to the room. Multiple selections on parameters, column sorting by measure or header, server trip for drill down (need to get some confirmation on this one but it has massive implications in low bandwidth scenarios), deployment via SQL Management Studio, floating column headers and so on and so on. These are the areas where users have been listened to which in itself is very refreshing, and I know from experience that some of these features could have saved me days of development in recent times.
Lastly is a very big new feature which I was lucky enough to have a glance at early last year and that is the Report Builder. This new tool is driven from the pre-defined data model and deployed to the Report Server for the users. What does this mean? Well if you’re familiar with Business Objects Universes then think of it as similar, sort of. You take a model, create relationships, calculated fields, etc. and put it on the server. Bob from accounts doesn’t need to understand one record from the Invoice table has a one to many relationship with records in the Invoice Line table, he just sees that he can select Invoice and Invoice Line from a list and drag it on to a report layout of his choice.
The tool has a few more tricks up its’ sleeve and I’ll leave the statements “an understanding of the semantic model” and “infinite drill” for another time. Microsoft are quick to state that this is not a replacement for the plethora of reporting tools out there but an extension of the interface. Either way it should open Reporting Services to another user community which for Business Intelligence is nothing but good.
SQL PASS 2005 - SSIS Performance
The presentation from Grant Dickinson of Microsoft and Henk van der Valk of Unisys was a demonstration, from my perspective, not of the new features but the capabilities of the product in the area of scalability. SSIS likes memory and that was made particularly obvious by the charts and graphs we were shown that showed the 64bit server wiping the floor with the 32bit (as you would expect) and in some cases the 32bit server not even completing the process after hitting its 3Gb process limit.
Another key aspect of the improvements in performance is SSIS's ability to parallelise. You could always get DTS to do this but it was never pleasent. SSIS can now scale up incredibly well and again we were shown screenshots of the processor workload during a package run and it was certainly utilising whatever power it could get hold of.
note: I'm one of these people that tends to believe a screenshot rather than not (ever the optimist).
Any ETL fans out there are going to find plenty that is very familar in their world. The components are standard ETL fare (merge, lookup, etc.) but the one that floats my boat? Slowly Changing Dimension, oh yes. All they need to do now is get the thing cluster / scale out and the top end ETL tools will be well within Microsoft's sights.
Wednesday, May 11, 2005
Database snapshots
Database snapshots are very cool. I think the only problem with them is that the possible uses are so numerous that I don’t think people have quite decided yet what they are best for. One possibility is that they are going to be great for all conceivable uses but the danger is that they will get misused and then all kinds of things will happen. The technology used for database snapshots does not belong to SQL Server but Windows so there are a number of restrictions. The first thing to remember is that a database snapshot behaves like any other database but, of course, is read only. You can select from it, join its tables even restore a database to the point in the snapshot was taken which sounds great but then you think point in time copies of the database will be huge. Have a number of them and you server is going to run out of disk very quickly but if you take a snapshot of a database and then go and checkout the size of the data file in explorer you get a surprise. Even though the file states its size as, lets say, 2Gb, the same as the original data file of the database the snapshot was taken from, its size on disk will only be 64Kb.
Database snapshot use a copy-on-write mechanism via NTFS sparse files (NTFS, note the disk technology being used, very important).This means that the snapshot only contains the pages that have changed on the database. If I went and updated 100Mb of data in my database then my snapshot would increase in size by 100Mb due to the updated pages being copied into the snapshot. Also this is done before the data is committed so rolling back has no effect on what has been written to the snapshot. Very helpful and that is in its essence what a snapshot is, not a replacement for backups, not a way of retaining a copy of historical data for reporting purposes (although I can almost certainly see it getting used that way) but it is a good safety net when making data changes manually or even procedurally. The important thing is that it doesn’t promote lazy administration and hopefully the restrictions on this new feature (and there are plenty of them) should ensure that.
Table partitioning
After the partition function comes the partition scheme that maps the partitions (whose number is defined by the function) to the filegroups and database. When the partition function and scheme are in place then a partitioned table can be built in the following syntax;
CREATE TABLE <table_name>
(
<column_list>
) ON <partion_scheme>(<bound_column>)
The great thing is that you can re-use the partition function again on another table if that’s how you want to partition the table. For the majority of the OLAP work I do that is almost always going to be the case and I would partition my tables by time just as I would partition cubes I built by the time dimension.
Fast file initialisation
Primary, secondary data and log files on SQL Server 2005 can be initialised almost instantly allowing for fast execution of tasks such as adding files to databases or increasing the size of an existing data file. Traditionally files were initialised to grab disk from previously deleted files. These files were initialised by first filling them with zeros, now this doesn’t happen and disk content is overwritten as new data is applied to the files.
There are a couple of caveats though. Firstly the Windows account the SQL Server service is running under must either be an administrator of the machine or have SE_MANAGE_VOLUME_NAME special privilege assigned to it. Secondly and perhaps more of an issue is that fast file initialization is only available on Windows XP or 2003 server systems. There are a lot of features in SQL Server 2005 that only become usable when on a XP or 2003 operating system but it’s likely the users who will adopt SQL Server 2005 are the ones with the latest server technologies so hopefully this shouldn’t prove so much of an issue.
SQL PASS 2005 - DBA, The Next Generation
Although I haven't done any hardcore DBAing for a few years I couldn't resist going to see Kimbery's presentation especially with a reputation like hers. She didn't dissapoint and i don't think anyone went away without learning at least one very useful feature in SQL Server 2005.
For me I can pick out specific areas I know I'll be using in the next few weeks. I'm going to do a post on each in no particular order and go into a bit more detail (well try to go into a bit more than Kimberly if that's at all possible).
Wednesday, May 04, 2005
Blogging on the Rails
Hmmmm. Saturated Fats..!
Anyway the cube is built and the first of several reports is complete all in one neatly packaged solution. Processing speed is good and I've managed to get myself over a million rows of decent demo data thanks to a great little tool called the Advanced Data Generator. I've used it in the past and it has always delivered what I needed so take note if you're stuck for several million rowsd of data.
Tuesday, May 03, 2005
Art of the Referentially Intact
Hopefully I would have now cleansed the data to a point at which it can't fail but it severly reduces what I can demonstrate. Lets see what spanners await me when I try to build some Reporting Services reports off of it.....!
Sunday, May 01, 2005
Thursday, April 28, 2005
The Munich Peer Festival
Very kindly my employers (Edenbrook) have offered me the opportunity to go. This years event will have a certain degree of spice as it will be the final one before the release of SQL Server 2005.
So if anyone reading this wants to buy me a beer out there, I'll be happy to oblige :)
Constructive Success
I am currently in the process of evaluating the production capabilty of SQL Server 2005 for a client. Whilst I've been using the tools In various capacities for well over a year I haven't had the opportunity to play much with the Report Builder functionality in Reporting Services.
I like Reporting Services and have done since I first played around with the beta some years back. It is the very essence of a Microsoft product, Simple to pick up but very powerful under the hood.
In the next version you're now given the ability to build your own reports and deploy them to the central server. This provides some great possibilities and moves the platform up a gear. I saw an early demo of the ActiveViews technology in Redmond shortly after Microsoft acquired it last year and it looked good then but after some hands on research I really think a lot more people will take a look at the Business Intelligence offerings from Microsoft.
Wednesday, April 13, 2005
Morning
Family - I live with my wife-to-be and our 2 children in a pleasent area in a pleasent town in the UK...
Work - I'm a consultant at Edenbrook, you can get more information from their website about the things we do and the services we provide...
Gaming - I was born of the Sinclair Spectrum (48K with the rubber keys, you know the one) generation and I've never really been able to get it out of my system. My current tools of choice are a Sony PSP and a Xbox.
Music - I've really got more and more into my music and for the last few years have been a regular camper at the VFestival in Hylands Park, Chemsford...
Film - I love watching them and I always wanted to have a go at making my own short. 1 day perhaps.
Anyway, that's me.
Enjoy