NetAppTips.com

09 Mar, 2010

Microsoft Business Intelligence on NetApp Storage – Part 2

Posted by: John Parker In: Syndicated ()

John S Parker – NetApp Microsoft Alliance Engineering Team

It has been a while since I got back to the subject of Business Intelligence (BI). Last time I had introduced the topic of Microsoft’s BI stack. This post is going to go in more detail about how NetApp can help provide an environment that helps to manage even very large and cumbersome environments.  Let’s jump right in and take a look at NetApp’s BI efficiency tools.

The data warehouse is a key component of any Business Intelligence activity.  The management of the data warehouse becomes critical as the data volume grows within the data warehouse.  Larger database in them create problems that smaller OLTP database do not have.  Ensuring Recovery Time Objectives (RTO), query performance on large data sets, managing indexes to meet the query needs of reports, keeping ETL jobs working ensuring that the data maintains relevancy, and increasing backup times as the volume of data grows are just a few of the problems that come with managing a data warehouse. This technical report will demonstrate how NetApp helps to give administrators and stewards the tools they need to meet the SLA’s for their environment.

This section will cover the entire solution discussing key tools to helping ensure that the data warehouse can be developed and managed on a NetApp storage platform, Extract Transform and Load using FlexClones, Analysis Services guide to backup and recovery, cloning AS cubes and integrating with Analysis Services Scalable Shared Databases (SSDB).

NetApp brings many different tools to help manage a data warehouse.  Looking at each of the items described above will show solutions to each of these problems with managing a large data warehouse on Microsoft SQL Server.

Development and testing of indices

One challenge of large SQL Server databases is developing and managing indexes.  The creation of an index on a large table can at time be very intrusive in the database. This affect of creating large indexes requires more thought and development put into the index before deployment to production.  Large queries require effective indexes in order to perform correctly and return data in a reasonable time frame.  NetApp’s solution to helping ensure that you have quality indexes is through an improved development and test process.  The key to this is having an appropriate data set in the development or test environments that matches production. 

The traditional method of doing this is with a complete copy of the database.  The challenge there is now you suddenly have to spend more for storage.  NetApp’s FlexClones allow you to have full access to the data in a read/write snapshot.  This database can be on hosted on a separate SQL Server instance thus moving the processing over to another location. This method reduces the impact to you production environment and allows database developers access to a full set of data

The process for creating clones can be done through SnapManager for SQL Server (SMSQL) or through Power Shell scripts that create the clone and attach it to a specific SQL Server instance.  The clone can reside on either the host for the parent database or on a separate SQL Server instance.  For more information about creating clones with SMSQL please see the best practice guide for SMSQL 5.0 at http://media.netapp.com/documents/tr-3768.pdf.

Specialized reporting clones

Specialized reporting can create unique changes in the larger data warehouse environments. The queries required to support specialized reports at times require specialized table statistics or tailored query plans to be effective.  In a traditional data warehouse environment specialized queries are often compromised due to the requirements of higher priority queries.  The compromise can lead to a combination of duplicate tables and even databases to satisfy the needs of certain reports.  Year end and quarterly reports are prime examples of specialized reporting that can require different statistics for query performance.

The NetApp solution to this issue is to create FlexClones that allow the customization of a database to handle these specific requirements. The ability to make a read/write clone of the data warehouse allows the administrator to create specific queries for specialized reports, these specialized queries can have different table statistics to allow each report to run in the required time. 

The key to a successful business intelligence environment is the ability to get consist and fast reports from the reporting environment.  The reporting environment for this technical report consists of creating FlexClones of the data warehouse to enable specific statistics for specialized queries.  These specialized databases allow for deviation of statistics for specific tables and enables the use of the same code base for stored procedures execute different query plans to successfully build reports.

Extract Transform and Load

The processing of the Analysis Services cube and the loading of the data warehouse are broken into stages that break up the process into manageable components.  The cube is processed by a job that runs on a schedule managed from the Analysis Services host.  The cube is processed once a day.  After the cube has been processed for the day then it is cloned and the clones are attached to separate Analysis Services instances.  Cloning of the cube allows for query scaling. 

The data warehouse is loaded on a daily basis using an Integration Service job that gathers data from clones of the OLTP database.  This job takes place before the Analysis Services cube is processed.

ETL

Overview of Steps

  1. Create a FlexClone of the OLTP database. This can be done through the System Manager user interface or through the controller command line interface. 

  2. Attach the OLTP clone to the processing SQL Server Instance.

  3. Run the ETL Integration Services job extracting new data from the OLTP clone and populating the data warehouse.

  4. Process the Analysis Services cubes through a job on the AS host 1.

  5. Detach the Analysis Services cube from AS host 1.

  6. Create clone snapshot of the Analysis service directory.

  7. Reattach the Analysis Service cube on AS host 1.

  8. Attach clone to AS host 2-5.
  9. Backup up the data warehouse.

Backup and Recover for Analysis Services

The current method for backup and recovery of Analysis Services databases is a copy of the database and its components to a separate area. The copy of the database requires additional disk space and time to create the copy.  The NetApp approach to backup and recovery is to take a snapshot of the volume that the data is currently residing on.  This snapshot gives you a point in time that you can restore the database.  The key for this recovery is to ensure that the location of the data is in the volume that you are taking the snapshot. 

Snapshots allow for faster backup and restore times.   It is important to ensure that you have the backup take place after each change to the cube.  Scheduling the snapshot after the Analysis Service cube is processed will ensure reliability of your backup. 

The recovery process requires that you ensure that all Analysis Services processes are off so that the restore of the snapshot will be complete.  After restoring the snapshot then restart the Analysis Services process to resume database operations.

Scalable Shared Databases and Clones

Analysis Service 2008 introduced the ability to mount read-only databases on separate Analysis Service hosts making it easier to create a reporting farm.  This feature is Scalable Shared Databases (SSDB) that allows for the creation of multiple read-only databases to help ensure the reporting needs for heavy reporting environments. The method for creating these databases comes down to creating multiple copies of the database and moving them to the multiple hosts.  This copy can be time consuming and expensive due to the actual database space requirements. NetApp can help reduce both the amount of time required to copy the data but also by using FlexClones reduce the amount of storage required to create these SSDB’s in the reporting farm.

The cloning process can work either against previous snapshots of the LUN that has the Analysis Services cube.  The snapshot can then be transported to additional Analysis Services instances on separate hosts. Once the cloned LUN has been attached to the new host, attach the AS cube to the new instance.  The cloned cube can either operate in read-only mode or read/write.  The decision to use read-only or read/write is based on the needs of the reports taking place on the second AS instance.

ssdb


When mounting the new database with SQL Server Management Studio it is important to remember that having the correct mode if you need to have the database in read-only mode.  The FlexClone process can create a new version of the database that is either read-only or writable. The combination of using the SSDB and FlexClones allows fast deployment of reporting databases within the reporting farm.  There is also a reduction in the number of physical copies that require duplicate storage space within the environment.  This allows scaling quick and easily within the farm.

The Wrap-Up

The growing needs of today’s business intelligence systems and restrictions on storage budgets require an infrastructure that both flexible and efficient to meet the requirements of the business.  The ability to reduce the required amount of storage for the databases, while increasing the size and flexibility of the development and test environments helps increase the efficiency of the underling storage. NetApp clones allow database queries to be specialized without requiring multiple physical copies of the data.  This use of FlexClones for the database increases efficiency without increasing the amount of storage infrastructure required. 

The backup and recovery of an Analysis Services database and cubes with NetApp Snapshots is a fast and efficient process.  The backup of an Analysis Services database using NetApp Snapshots demonstrates the flexibility of NetApp in maintaining uptime and recovery times for the business intelligence environment.  Then integrating the FlexClone on the Analysis Services database creates a fast method for deploying Scalable Share Databases across multiple Analysis Services Servers. 

NetApp storage offers a solution to the expanding business intelligence environment through the use of SnapShots and FlexClones.  Snapshots help to provide backup and recovery for both Analysis Services and SQL Server data warehouses as well as providing the flexible development and test environments for both database platforms.

No Responses to "Microsoft Business Intelligence on NetApp Storage – Part 2"

Comments are closed.

Flickr PhotoStream

    Team NetApp at HQTeam NetApp at HQRyan Grant shaving Justin Tuck's head for St Baldrick'sTom Mendoza cheering the crowd onShavee, Organizer and Team Captain, Bill HoganNetApp employee making funny facesNetApp St. Baldrick's New York City 2011NetApp St. Baldrick's New York City 2011NetApp St. Baldrick's New York City 2011Justin Tuck shaving NetApp employeesJustin Tuck and Ryan Grant fans supporting St. Baldrick'sRyan Grant gets his head shaved for St. Baldrick'sGreen Bay Packers fans, Jodi Baumann, Senior Director of NetApp Corporate Communications, and her mother smile with Ryan GrantNetApp St. Baldrick's New York City 2011Justin Tuck greets children supporting the causeNFL players Justin Tuck & Ryan Grant join forces with NetApp and St. Baldrick's in NYC

Advertisements