Monday, November 02, 2009

Azure – It’s Not Big and it’s Not Clever

I’ve played with Azure and yes it does what it’s supposed to but coming from a business intelligence background I’m struggling to see what kind of advantage my area of focus will gain. I would love to see huge databases accessible in the cloud attached to huge cubes that use huge ROLAP partitions to run huge queries but until that time comes along I’m going to sit on a huge fence, watch and wait.

One project I’m working on has dimensions with over 50 million members in. Due to the business and the way it works a Process Update has to be performed every night. Lots of attribute hierarchies and lots of members means lots of time taken to complete. Throwing more hardware at it makes this faster but

what happens when the cost of the infrastructure exceeds the value they receive from their analysis. This client will look at how they can reduce costs elsewhere whether it be licensing (open source perhaps) or the platform (Amazon EC2 perhaps) neither of which are going to help Microsoft.

When I first saw that the Azure platform was changing from it’s early form into what I considered a natural progression of the SQL Server platform from Express to Enterprise to Azure I was pretty positive. This coupled with Analysis Services could mean smaller companies that produce a lot of data could get top class analytics without having to break the bank with massive hardware expense. But I haven’t seen anything that makes me feel confident that this will happen. Perhaps the Azure OLAP platform will be huge 128bit version of Excel with PowerPivot (don’t get me started) who knows, someone should at least.

Azure, PowerPivot, Reporting Services, Performance Point, ProClarity, MDM. In my opinion all of these things should be part of a single cohesive strategy to allow users to access data in a simple way whilst ensuring organisations can maintain a level of data governance and I’m just not seeing it. Why do I have to still tell a client that they need to use this tool because that’s the one you can change the line colours in but you’ll need to use the other tool for pie charts and oh you want it integrated into your Portal then you’ll need this one instead. It’s scatter gun, not particularly professional and very worrying.

I’m seeing this from a very focused view but from my experience a real world view and I just can’t see Azure fitting in to it at the moment. I would love to have my mind changed so please do.

Technorati Tags: ,

Thursday, July 30, 2009

The Grand Microsoft Crossover

The rumour on the grapevine is that Microsoft are preparing an all out offensive on rivals Google, Apple and Sony by drawing upon their expertise in the various groups around the Redmond giant.
With the positive reception gained by Project Natal at E3 earlier this year and the planned release of Windows 7 and Office 2010 over the next 18 months, Microsoft are embarking on a strategy that they hope will smite their competition.

Codenamed "Project Flap" Microsoft will make the Natal hardware the de-facto method for interfacing with all of their other offerings. Users will be able to use Natal to retrieve lost Word documents by either jumping up and down furiously in front of their screen or by using specially adapted four letter keywords. "This will revolutionise how people will work in the next decade" a Microsoft source close to Project Flap is quoted as saying. But it's benefits don't stop at improved interaction with the next generation of software "our research has demonstrated that looking like a complete tool in the workplace has an impact on how people are perceived". When pressed on whether this impact would be negative the source responded, "at this time we can't comment on the details of the research".

project-natal

Before everyone pins their hopes on every office looking like a rehearsal room for Britain's Got Talent, Microsoft's rivals are sure to respond. Google is rumoured to be looking at a free camera add on for Google Docs that will work in a similar way to Natal. The major difference being that users will have to stand facing away from the screen as adverts are projected on to their backs.
Sony also has plans in this space. They will introduce a new PS3 / Vaio office application that will provide users with a cut down feature set that will only work with Sony's proprietary SAD (Sony Augmented Document) format. Whatever Apple will do will of course be White and expensive.

So as Microsoft's rivals prepare to return fire and stick 2 fingers up at Natal they had better take care, flicking the "V" is the new command in Windows 7 for formatting your C drive..!

Tuesday, July 28, 2009

Over The Last Several Weeks I Have Been Mostly Listening To…

untitled

Hands – Little Boots

laroux[1]

La Roux – La Roux

florence_and_the_machine[1]

Lungs – Florence + the Machine

Spotify Playlist

A Little Something About PPS Filters (part 3)

It’s been a while since I started on this topic so hopefully I’ll remember what I was attempting to say.

For a particular client we have had issues with a multi level dimension hierarchy that contains several thousand members. As I’ve said before and numerous people have experienced PPS filter performance is not great (word on the grapevine is though this is no longer an issue in v2). For various reasons, creating multiple dashboards to satisfy different users and restricting the users using OLAP security weren’t options so we came up with a “third way”.

The third way was to provide the users with the ability to change their preferences and alter what they could see in the report hierarchy. First we needed to provide the users with a screen where they could make their selections. This was as simple as providing a PPS dashboard page that contained a filter and a report that used the filter.

The filter is based off of tabular data as explained in the last

post. The data itself is a parent child data set generated from the dimension that I want to restrict turning my source table of the dimension from something that looks like this;

image

To something that looks like like this;

image

There are reasons I’ve done things like place a letter a the beginning of each of the ID’s. First it ensures that every member has a different ID and secondly it identifies the level of the original hierarchy the member is from without having to write queries to work out this out.

This has been placed into a view which is just a series of select statements from each level of the hierarchy with unions between. With that source view a tabular filter can be built and put into a dashboard page. At the top of the view though I’ve put a statement in for the “All” member. Remember this isn’t Analysis Services so there’s no dimension and no “All” member.

image

Again referencing back to the previous post on this, for the filter value to be retained in the ParameterValues table you will need to associate it with a report. A simple reporting services report will do and my preference is to provide the user with some feedback on the selections they’ve made. If they have selected the “All” member then they get a list of all of the locations otherwise only the members that have been selected.

The reporting services input parameter is from the filter passed into some SQL that in this case looks like this;

IF EXISTS (
    SELECT 1
    FROM vwDimTown
    WHERE 'ALL' IN (@Location)) 
BEGIN
   SELECT DISTINCT CountryDesc, RegionDesc
   FROM vwDimTown
END
ELSE
BEGIN
   SELECT DISTINCT CountryDesc, RegionDesc
   FROM vwDimTown
   WHERE 'D' + CAST(CountyID AS varchar(5))
      IN (@Location)
   OR 'C' + CAST(RegionID AS varchar(5))
      IN (@Location)
   OR 'B' + CAST(CountryID AS varchar(5))
      IN (@Location)
END

With the reporting services report on the page we now have a mechanism for capturing a users preferences like so;


image


Retrieving the preferences is the next part so for that we build on the SQL on the previous post that shows how to extract rows from the ParameterValues table in a meaningful way and for this we will put the logic into a stored procedure.


The procedure will perform a number of tasks in addition to pulling the selected members out of the PPSMonitoring database. We also need the final output to be formatted as an MDX set in the format {member 1, member 2, member 3, member n}. So lets break down the procedure into its constituent parts. First the declarations.



CREATE PROCEDURE uspGetFilterPreference 
     @Login varchar(255)
    ,@FilterID varchar(50) 
    ,@StrToSet varchar(max) OUTPUT 

The first of the variables, Login, is the logon of the person whose preferences we are trying to find. FilterID is the GUID identifying the filter we are interested in. Finally StrToSet is the output variable that will contain my final MDX set string.


DECLARE @Keys table( 
MemberKey varchar(255)) 
DECLARE @OutputString varchar(max) 
INSERT @Keys(MemberKey) 
SELECT 
    REPLACE( 
        REPLACE( 
            CAST(d.Members.query('LocationID') 
                AS VARCHAR (255) 
                ), '<LocationID>', '' 
            ), '</LocationID>', '') 
    AS MemberKey 
FROM PPSMonitoring..ParameterValues 
CROSS APPLY SerializedXml.nodes
    ('/NewDataSet/MyDataTable') 
    AS d(Members) 
WHERE [Login] = @Login 
AND ParameterUniqueName = @FilterID

This section gets the values out of the ParameterValues table and loads them into a table variable called Keys. Along the way we strip out any XML syntax (element names, tags, etc.) leaving a table variable comtaining the key values from the tabular filter.



IF EXISTS (
    SELECT 1 
    FROM @Keys  
    WHERE MemberKey = 'ALL')
BEGIN 
    SELECT @OutputString = 
       '[Town].[County].Members' 
END

Here we check to see if the “All” member has been selected. If it has then we set the OutputString variable to be all members from the County attribute hierarchy of my Town dimension and NOT the user hierarchy. This is really important for later on.


If the “All” member has not been selected then we work out what members have been selected and using a set of union statements get a list of true dimension keys



ELSE 
BEGIN 
  SELECT @OutputString = 
    SUBSTRING((
      SELECT ', ' 
            + '[Town].[County].&[' 
            + CAST(ID AS varchar(10)) 
            + ']' 
      FROM (
       SELECT ID 
       FROM DimTownTabular 
       WHERE ParentLocationID IN 
         (SELECT MemberKey FROM @Keys 
          WHERE LEFT(MemberKey, 1) = 'B') 
       UNION 
       SELECT ID FROM dbo.DimTownTabular 
       WHERE ParentLocationID IN 
         (SELECT MemberKey FROM @Keys 
          WHERE LEFT(MemberKey, 1) = 'C') 
       UNION 
       SELECT ID FROM dbo.DimTownTabular 
       WHERE LocationID IN 
         (SELECT MemberKey FROM @Keys 
          WHERE LEFT(MemberKey, 1) = 'D')
            ) a 
  FOR XML PATH( '' )), 3, 1000 ) 
END 

This is why we have the ID column and the letter prefix. It makes this bit a lot easier to manage. The output of this and the value that get dropped into the OutputString variable is almost exactly what we want. The only issue is that by using the FOR XML PATH statement to produce a comma delimited string of the members it replaces “&“ symbols with “&amp;”. A quick …



SELECT @StrToSet = '{' 
                 + REPLACE(@OutputString,'.&amp;', '.&') 
                 + '}'

… fixes that and puts the curly brackets around to finish off our set syntax and the procedure as a whole.


Run the procedure against the username and filter you’re querying and depending on how many members you selected in the tabular filter you should get something like this;



{[Town].[County].&[26], [Town].[County].&[63]}

Now I would recommend that everyone goes out and gets this book






imageMDX Solutions
(Amazon link)

And go to chapter 10 where there is a section called “Using Stored Procedures for Dynamic Security”. This explains how to use a simple analysis service stored procedure (DLL) to affect what members in a dimension users are able to see. All the code is doing is retrieving a number of rows from a SQL query and passing it into analysis services security as an MDX set object. In my stored procedure I made it simpler and just executed a stored procedure and output a string.


Unfortunately I haven’t got this all packaged up as a project but my recommendation would be to create a C# project build the DLL and also build a command line application that references the DLL just to test everything is working OK.


image


Also things like error handling and connection string configuration to where the PPS database should all be handled in this code. My function is called “GetPreferredCounties” which has two input variables. The username and the filter ID, both as strings.


After registering the assembly with the Analysis Server we can now setup the final piece. In one of the roles in your cube (you have created one right?), within the dimension data security for the County attribute of the Town dimension I can now put in the syntax to make the stored procedure call and retrieve the string.


image


In my example it looks like this.



STRTOSET(PPSFilterASStoredProcs.GetPreferredCounties(
    UserName()
   ,"822e399a-af44-4cf3-be19-f1602ca14afb"))

The USERNAME function will pass through the current users logon credentials to the function and the STRTOSET function will take the output string from the function and convert it into something analysis service will understand is a set.


If all has gone well Analysis Services is now using the output from the stored procedure to provide a restricted list of dimension members to a user. A list that the user themselves can control without any other system being put in the way or elevated privileges being provided.


Build another page on your dashboard and this time create and add a report that just displays some info about your filtered dimension. Attach this to a filter based on the analysis services dimension (I tend to use named sets) and now you will only see the members that match your preferences selection.imageimage


Now to tidy up that hierarchy a bit. All those parent members that are just getting in the way and making the navigation more complex than it needs to be. Pay a visit to Boyan Penev’s blog and take a look at a post on Filtering Unneeded Dimension Members in PerformancePoint Filters. By making the source query or named set for your filter based on the logic in the post you just get the top most level necessary which makes the filter a hell of a lot neater and with the added bonus of the top level member becoming the default when you first access it.


image


There’s a lot going on in this post and unfortunately I haven’t had the chance to package it up into something I can redistribute easily but if there’s any questions then ask away and I’ll help if I can.

Thursday, June 04, 2009

A Little Something About PPS Filters (part 2)

Here I wanted to go through what happens when you click apply on a PerformancePoint filter. Well first lets look at the simple OLAP based filter that I put together for the last post.

Image001

On my blank dashboard page with only a single filter I click apply and then I’ll have my selection displayed so the filter looks like it’s registered the filter change.

Image002

The problem is that when I refresh the page PPS doesn’t remember my last selection and will return the All member. This isn’t what PPS filters are supposed to do but the reason is simple, what’s the point of having a filter if there’s no report connected to it?

After putting a simple reporting services report on the page and trying the filters again no problems. I can close the browser, log out, reboot my machine no matter what you do the parameter selection is remembered. So lets ask the question, how?

When a user accesses a report linked parameter their selections are stored in the PPS monitoring database in a table called ParameterValues. This table consists of a filter id, which is a GUID as a string, the users login credentials, the last update time and finally a column called SerializedXml.

SerializedXml is an xml typed field that contains a users actual selections for that particular filter. Select * from the table in management studio and click on one of the values in the SerializedXml column and you can see the xml displayed. Forget the schema tags and looking lower down in the xml document to the elements called ‘MyDataTable’.

<MyDataTable>
  <Key>c431349de280b5a3fc3f921d5ebffd67</Key>
  <DisplayValue />
</MyDataTable>
<MyDataTable>
  <Key>18eb638b9e0c1c9ab64aaacdcb21d4bd</Key>
  <DisplayValue />
</MyDataTable>

This is where one particular difference between OLAP and tabular filter appears. Look at a the SerializedXml value for an OLAP based filter and you’ll see a GUID appear in the key element. This would seem to be the dimension property MEMBER_GUID but I haven’t been able to confirm this on Analysis Services 2008 but it seems logical enough.

When looking at the SerializedXml for a filter based on tabular data the differences are quite clear.

<MyDataTable>
  <LocationID>England</LocationID>
  <LocationDesc />
</MyDataTable>
<MyDataTable>
  <LocationID>Scotland</LocationID>
  <LocationDesc />
</MyDataTable>

With tabular filters the key assigned within the parent / child relationship is the value stored in the xml and its possible to query it as well using SQL Servers xml query features. The example above can be queried using the following syntax;

SELECT d.Members.query('LocationID')
  AS MemberKey
FROM ParameterValues
CROSS APPLY SerializedXml.nodes
  ('/NewDataSet/MyDataTable')
    AS d(Members)
WHERE [Login] = @Login
AND ParameterUniqueName = @FilterID

By providing the login and filter id parameter we can get the selection for a single users filter. The result set is again xml based returning a row for each filter selection.

MemberKey
<LocationID>England</LocationID>
<LocationID>Scotland</LocationID>

The element names and tags can be cleared away easily enough with some string manipulation or if anyone know how the SQL xml query could be tweaked to use return just the values themselves I would be interested to hear.

In the next post I’m going to show how this can all be put together to demonstrate how these ideas can be applied to real world scenarios.

This Week I Have Been Mostly Listening To…

KitsuneMaison7

Kitsuné Maison Compilation 7 – Various Artists

Spotify Playlist

Wednesday, June 03, 2009

A Little Something About PPS Filters (part 1)

If you have a lot of members, say in the thousands, that you want to display in a PPS filter then you’re going have problems. The current version of PPS doesn’t handle them to well and I surmise that it’s one of the reasons that there’s a configurable setting in the web.config allowing to set the maximum number of members returned in a filter.

Things are compounded when you have hierarchical filters as the number of members is increased by parent members. The obvious way to get around this off the top of your head would be cascading filters but this has its own issues one in particular is that I want the user to be able to use all of the hierarchy in the reports from the all level to the leaf level members.

Hopefully the architecture of PPS v2 will have corrected this little foible but in the meantime we have to work around this. I’m going to talk in the next few posts about the filters and demonstrate a potential workaround but wanted to start off with a bit of an overview on two types of filters available in PPS and that’s OLAP based and tabular based.

The majority of people will have put together an OLAP data based filter. It’s very simple and you can choose to either select the members directly from the dimension, use a named set or a MDX query. I would hazard to guess that most developers will use either named sets or MDX queries.

I’ll use a simple MDX query to generate a set from a dimension I’ve got called “Towns”. This dimension contains a single user hierarchy that looks like this;

  • Country
    • Region
      • County
        • Town Prefix
          • Town

As I only want to display data to the county level in my filter I can set this up with this MDX;

DESCENDANTS(
   [Town].[Town].[All]
// The top level member
  ,[Town].[Town].[County] // down to the County level
  ,SELF_AND_BEFORE) // including all members

When the filter is built and deployed you’ get something that looks like this;

Image001

To achieve the same with a tabular filter you need to put together a parent child structured data set. As my source table has individual fields for each of the hierarchy levels this will need to be created manually using a series of union statements. For example;

SELECT
  'All' AS Location 
,
'All' AS ParentLocation
UNION ALL
SELECT
  Country AS Location
,'All'
AS ParentLocation
FROM
  dbo.DimTown
UNION
ALL
SELECT
  Region AS Location
,Country AS ParentLocation
FROM
  dbo.DimTown

Now none of what has been written above is new, Earth shattering or of any particular benefit to anyone apart from those who have never touched PPS before but I wanted to mention the distinction between OLAP and tabular data sourced filters as it will become quite important in the next post around how PPS remembers what your last selection was.

Tuesday, March 10, 2009

This Week I Have Been Mostly Listening To…

665.x600.mr.titus.rev[1]

The Airing of Grievances – Titus Andronicus

Spotify Playlist

Wednesday, March 04, 2009

Crash, Bang, WOLAP..

I’ve been working on a SQL Server 2008 project that uses proactive caching on some partitions to speed up the availability of the data to the end user.

For all the things that have been said and written about ROLAP in the past apart from some issues around traces I’ve found it to behave itself pretty much. Well that is until now.

Since moving into a test environment we’ve been getting a lot these errors;

The description for Event ID 22 from source MSSQLServerOLAPService cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Internal error: An unexpected exception occurred.

the message resource is present but the message is not found in the string/message table

Actually these errors would come thick and fast every couple of seconds or so until the Analysis Services service decided to stop. Profiling the Analysis Server showed the problem from the OLAP server point of view where it seemed to happen when the aggregations were being merged.;

Internal error: An unexpected exception occurred. Errors in the OLAP storage engine: An error occurred while processing the 'Current Day' partition of the 'Fact Table' measure group for the 'AS2008 Bug Test' cube from the AS2008 Bug Test database. Server: The operation has been cancelled.

I took the OLAP database that I was using and broke it down until I had the minimum amount of code and objects left in there. This meant stripping out the calculations, all of the measure groups, apart from 1, and most of the dimensions. Finally I was left with a cube that had 2 dimensions (one of which was a date dimension) and a single measure.

Within the measure group were 2 partitions, one was standard MOLAP the other a real-time HOLAP partition looking at the same fact table but different date ranges (current day and everything else). Now here’s the thing, while there is data in the HOLAP partition there are no errors and no crashing. With the HOLAP partition not processed, again no errors but as soon as the HOLAP partition was processed and had no data to retrieve then there would be errors galore resulting in the inevitable crash.

A further step back showed that this would all behave as long no aggregations were applied to the HOLAP partition. As soon as a single aggregation was put on the partition and the partition processed the errors would appear until either data was inserted into the source table, to subsequently be picked up in the HOLAP partition, or Analysis Services crashed.

Curiously converting the partition to real-time ROLAP didn’t deliver any errors with or without aggregations and same is true for any of the MOLAP types of storage but as soon as it moved back to HOLAP the errors returned.

Some extra detail. I’ve observed this on both 32bit and 64bit systems. The 64bit server this was first spotted on has the cumulative update package 3 for SQL Server 2008 installed and the 32bit is RTM so it looks like the issue is throughout the codebase. We haven’t gone as far as to install the SP1 CTP due to environment constraints. This looks like it might be a bit of a bug so I’m in the process now of packaging up something to send to Microsoft.