Master data and metadata. A subject close to my heart due to its significant importance in what I call the data lifecycle. Data? Lifecycle? What on earth is he talking about now? I just wanted to get Oracle talking to SSIS! Well, let's go a little off subject here and use a bit of an analogy.
Take something simple that I think we all learnt at school: the water cycle. This is the continuous movement of water as it shifts location and state, from ocean to atmosphere to groundwater. Now I liken this to the way data moves through an entity, whether it be an organisation or a group of systems.
A good example of this is a common scenario in financial reporting. An accountant (that's the cloud up there) will read their profit and loss report for a particular department and use it to calculate the following year's budget or forecast. These estimates will then be entered into the budgeting and planning system (that would be the mountains, though more likely it's Excel :). The budget and forecast are imported into the data warehouse, where the profit and loss report (the ocean perhaps?) is generated, which is read by another accountant looking at the company's performance, who... ad infinitum.
A very typical example, but it demonstrates that the behaviour of data within an organisation is very organic and in a constant state of flux. Just because the original piece of data is sitting in a table somewhere doesn't mean it hasn't evolved into a different beast elsewhere, with different properties and meanings. Simple as it sounds, this makes life a little complicated when you add influencing factors such as SOX (Sarbanes-Oxley) compliance, which requires demonstrable internal controls. In BI speak this could mean someone changing an attribute of a dimension member and you having to prove who did it, when and why. That one tiny change may be minor to a developer, but to a CEO it could move them from the red to the black, which is exactly the kind of thing SOX tries to stop.
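To make the "who did it, when and why" point a bit more concrete, here is a rough sketch in Python of a dimension that refuses to change a member attribute without recording an audit entry. Everything in it is made up for illustration (the `AuditedDimension` and `AttributeChange` names, the account key, the user name); in a real warehouse this would more likely live in database tables and triggers or in an MDM tool, so treat it as one possible shape rather than the way to do it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AttributeChange:
    """One audited change to a dimension member attribute (hypothetical shape)."""
    dimension: str
    member_key: str
    attribute: str
    old_value: object
    new_value: object
    changed_by: str       # who
    changed_at: datetime  # when
    reason: str           # why


class AuditedDimension:
    """In-memory dimension whose attribute changes always leave an audit entry."""

    def __init__(self, name):
        self.name = name
        self.members = {}    # member key -> dict of attributes
        self.audit_log = []  # append-only list of AttributeChange

    def add_member(self, key, **attributes):
        self.members[key] = dict(attributes)

    def change_attribute(self, key, attribute, new_value, changed_by, reason):
        # Refuse the change outright if nobody says why it is being made.
        if not reason.strip():
            raise ValueError("a reason is required for every change")
        member = self.members[key]
        self.audit_log.append(AttributeChange(
            dimension=self.name,
            member_key=key,
            attribute=attribute,
            old_value=member.get(attribute),
            new_value=new_value,
            changed_by=changed_by,
            changed_at=datetime.now(timezone.utc),
            reason=reason,
        ))
        member[attribute] = new_value

    def history(self, key, attribute):
        """Return every recorded change for one attribute of one member."""
        return [c for c in self.audit_log
                if c.member_key == key and c.attribute == attribute]


# Example: reclassifying an account moves it to a different report line.
accounts = AuditedDimension("Account")
accounts.add_member("4010", name="Consulting revenue", report_line="Revenue")
accounts.change_attribute("4010", "report_line", "Other income",
                          changed_by="jsmith",
                          reason="Reclassified after audit review")
for change in accounts.history("4010", "report_line"):
    print(change.changed_by, change.changed_at.isoformat(), change.reason)
```

The design choice worth noticing is that the audit entry is written in the same call that makes the change, so there is no route to altering the member that skips the log.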
Now all this talk of oceans, cycles and socks is all very good but doesn't bring us any closer to knowing what the hell to do about managing master data and metadata. OK, let's break down some of the things I've mentioned into some key bullets.
- Dimension Management
- Compliance
- Process Flow
This list identifies some of the major reasons for having, and requirements of, any metadata and master data management mechanism.
In the next part I'll cover these elements in more detail and how they can contribute to a more streamlined data strategy.