Thursday, February 27, 2014

BI Requirements Gathering

Requirements gathering is a critical phase in any IT project, both technical and non-technical. It is understood that requirements always come before design. Requirements focus on "building the right system"; design lays emphasis on "building the system right". Considering the vast scope that BI offers, gathering requirements for a BI system is a challenging proposition. A business intelligence system should deliver an environment that is business focused rather than an application that performs specific functions. BI should be embedded into the business process that can be used to improve business processes.

Gathering requirements is a three-step process which is commonly known as elicit, specify, and test (EST). Requirement eliciting is the practice of collecting requirements from the stakeholders through interviews, questionnaires, brain storming  and prototyping. The most important part of this phase is to identify what the users 'need' (that can be successfully built) by asking the right questions. 


The complete description of each requirement along with non-functional requirements of the system is captured in requirement specification. Each requirement in the specification describes a property or characteristic which the BI system should possess in order to add value to the business users. 


A requirement specification includes:

WHAT Describes the requirement along with features of individual attributes. The requirement traceability matrix is also included which enables tracking of requirements.
WHY Provides the rationale for the BI system, describing the value to be achieved
WHO Describes the source of requirement and the business users who will benefit from the system

No matter how well a BI system is developed, it will only do what the requirements ask it to do. Requirement testing is a very important step in the process. As part of requirement testing, each requirement must be validated to ensure that it is clear, unambiguous and feasible.

On a lighter note, a comic strip to summarize the importance of requirement gathering in a project life cycle:


More to follow in my next blog!

References:

http://www.miproconsulting.com/blog/2009/06/bi-requirements-gathering
http://www.kimballgroup.com/data-warehouse-business-intelligence-consulting/business-intelligence-requirements/
https://cours.etsmtl.ca/mti820/public_docs/lectures/TenMistakesToAvoidWhenGatheringBIRequirements.pdf
http://tdwi.org/research/2008/09/ten-mistakes-to-avoid-when-gathering-bi-requirements.aspx

Thursday, February 13, 2014

Business Intelligence on Cloud

In the last few years, cloud computing has grown exponentially and is expected to become a billion dollar industry by 2016. Business Intelligence is all about delivering the right and meaningful information to the right people at the right time. Cloud computing provides a platform for BI applications which is agile and lightweight. Combining business intelligence and cloud computing could unleash greater benefits to a firm.

There are a few key factors which work in favor of Cloud Business Intelligence:

  • Speed of Implementation and Deployment: Very easy to set up and deploy, since no additional hardware or software installations are required.
  • Elasticity: The immense computing power available on cloud can be leveraged, and it can be scaled up and scaled down based on changing needs.
  • Accessibility: These applications allow mobile and browser users to access and control everything on the cloud.
  • Cost Benefits: Data center and IT management costs are greatly reduced, cost-effective pricing models offer cost-effective options.
A cloud model is typically categorized into Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS):



- IaaS provides data storage, managed networking, virtual servers, load balancers and operating system environments.

- PaaS, which sits on the IaaS layer, includes managed middleware, such as databases, application servers, data integration services and BI platforms.

- SaaS extends this stack to include the application layer. Most of the vendors provide cloud business intelligence as an online service which can be accessed through web based interfaces.

Cloud BI Vendors - Currently, several startups such as Birst, GoodData and Indicee are the niche players in the cloud BI market whose principal delivery model is cloud. Big players such as Microstrategy, SAP and Oracle are also expanding their cloud capabilities and are steadily moving into the cloud space.


Though companies are still wary of the capabilities of cloud, many of the enterprises are migrating from their on-premise applications to cloud in order to experience the potential benefits - reduced costs, improved flexibility and faster deployment. Cloud solutions offer self-service BI and on-demand BI through private or public cloud or even a hybrid model based on business needs.

Cloud BI is here to stay!

References:

http://www.tcs.com/SiteCollectionDocuments/White%20Papers/HighTech_Whitepaper_Business_Intelligence_Cloud_0412-1.pdf
http://www.klipfolio.com/resources/articles/what-is-cloud-business-intelligence
http://www.informationweek.com/software/information-management/5-big-business-intelligence-trends-for-2014/d/d-id/1113468

Wednesday, February 5, 2014

Practical Steps for Dimensional Modeling


In my previous blog, I have highlighted some of the key differences between a star schema and an OLAP cube. Although OLAP cubes deliver superior performance due to aggregations and pre-calculations, it is recommended to first build a star schema which contains detailed, atomic information, which then serves as a base/foundation for optional OLAP cubes.

Some of the key reasons for this:
  • Each OLAP cube has to be re-built and processed (every time) to analyze real-time data. 
  • It is difficult to design an OLAP cube to answer most of the business questions.
  • An OLAP cube doesn’t have ad-hoc capabilities – calculations have to be designed into the cube.
  • Size of an OLAP cube usually exceeds several MB and therefore presents space constraints.
Now that the importance of a star schema (or a dimensional model in general) is recognized, how do we go about building it? 

Four-step process for Dimensional Modeling:


1. Select the Business Process - The first step where the business process (usually a verb) is identified by understanding the business requirements and the underlying data sources. In a retail store, typical business processes include purchasing, orders, shipments, inventory etc.

2. Declare the Grain - A critical step in dimensional modelling which specifies what exactly a single row in a fact table represents. Grain is usually the lowest level of detail in a warehouse. A highly granular model will be able to handle unexpected user requests to drill down to the lowest level of detail. Based on the business needs, product transaction details at a day level, order details at a month level (aggregated) are some of the examples of grains in a retail store.


3. Identify the Dimensions - Dimensions can be easily identified once the grain of the fact table has been established. These dimensions provide contextual descriptions to questions about the grain in a fact table. For example, dimensions such as date, product and store will be key to answering various questions on a particular transaction in a retail store.

4. Identify the Facts - The final step where the facts which appear in the fact table are decided. These facts can be additive, semi-additive or non-additive and should always be true to the grain which is defined in Step 2 of the dimensional modeling process. Sales amount, quantity etc. should be captured in the fact table at the grain defined previously (product and day level). If there are facts which are at a different grain, they should be part of a different fact table. 

Although the process seems to be fairly straight-forward in theory, a number of best practices need to be followed to design an effective dimensional model:

  • Good understanding of business requirements along with knowledge of underlying source systems improves the chances of addressing most of the user requests.
  • Data modeling should be a collective team effort. A data modeler should lead the team consisting of business users, analysts and BI developers. 
  • Once there is consensus on the high-level diagram, the team should delve deeper into the model by getting into discussions about each of the tables, columns and the data rules governing them. 
  • The last phase involves validation of data model with the team members, IT, business users and other concerned parties. Once the model is approved by the steering group, further steps to implement a data warehouse can be taken.

To summarize, a lot of factors have to be considered while designing a dimensional model irrespective of the nature of business and the complexity of requirements.

Are you now ready to design your first dimensional model? Well, I am! 

References:

The Data Warehouse Toolkit, Ralph Kimball and Margy Ross
http://www.seemoredata.com/en/entry.php?10-Differences-between-CUBES-and-Star-Schema
Wikipedia