Thursday, 17 November 2011

Data-tier Applications, DAC Framework and SQL Azure Integration with a Continuous Integration (CI) Process

:: Introduction

Leveraging the efficiency and cost savings of the "cloud" within an enterprise requires tools specific to the PaaS provider being used. When it comes to working with SQL Azure (Microsoft's cloud-based relational database engine), I have found that data-tier applications, along with some supporting deployment infrastructure (PowerShell scripts), work particularly well both to synchronize database changes across a development team and to deploy those changes to testing environments as they make their way to a production system.

:: Data-tier Applications

Data-tier applications are a project type within Visual Studio 2010 that allows development teams to manage database changes across ALL of their environments, whether in the cloud or on-premises. Developers work with individual SQL scripts within the project which define anything from table schemas to indexes to stored procedures. Once a change has been made to a data-tier application's scripts, the entire project can be compiled to produce a DAC Package (called a DACPAC from here on), which can then be deployed to a SQL Server 2008 R2 or SQL Azure database instance.

For details on how to create and set up a data-tier application for the first time, see this link

:: DACPAC Files

A DACPAC is simply a file with a ".dacpac" extension that contains all the schema, stored procedures, users and policies required to set up a new database or upgrade an existing one. Note that DACPACs should not be confused with BACPACs. BACPAC is a newer format put forward by Microsoft that can contain everything a DACPAC can, plus data.

There is also a version number (with a format of x.x.x.x) embedded within the DACPAC file which can be used to version the DACPAC. Unfortunately this version number is not updated automatically by the project when the data-tier application is modified, so a developer will need to increment it manually before deployment. To access and modify this version number, right-click on the data-tier application project and find it under the "Project Settings" tab.
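
Since the version has to be incremented by hand, the rule for doing so is worth pinning down. Below is a minimal sketch in Python of one possible rule, assuming only the x.x.x.x format described above. The function name and the labels for the four parts are hypothetical, and the sketch works on the version string only; it does not read or write the Visual Studio project.

```python
def bump_dac_version(version: str, part: str = "revision") -> str:
    """Return `version` (format x.x.x.x) with one part incremented.

    Parts to the right of the incremented one are reset to zero.
    """
    # Hypothetical labels for the four positions, used only in this sketch.
    names = ["major", "minor", "build", "revision"]
    if part not in names:
        raise ValueError(f"unknown version part: {part!r}")

    fields = version.split(".")
    if len(fields) != 4 or not all(f.isdecimal() for f in fields):
        raise ValueError(f"expected a version of the form x.x.x.x, got {version!r}")

    numbers = [int(f) for f in fields]
    index = names.index(part)
    numbers[index] += 1
    # Reset everything to the right of the part that was bumped.
    for i in range(index + 1, len(numbers)):
        numbers[i] = 0
    return ".".join(str(n) for n in numbers)
```

For example, bump_dac_version("1.2.3.4", "minor") returns "1.3.0.0". The new value would still need to be entered in the project settings by hand.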

:: Deploying Data-tier Applications

Before a data-tier application can be continuously upgraded with a newer version (the typical workflow of any development team), it must first be deployed. The deployment essentially creates the initial database, sets it up as a data-tier application within the database engine (either SQL Server 2008 R2 or SQL Azure) and then goes about deploying all the schema, stored procedures, users, etc. that are contained within the DACPAC.

The DACPAC can be easily deployed from within Visual Studio by right-clicking the data-tier application project and selecting "Deploy" (note that the connection string within the data-tier application project will need to be set up before doing this). Alternatively you can use SQL Server Management Studio (SSMS) or some PowerShell commands: How to: Deploy a Data-tier Application

Note that the above deployment is really just a one-time operation. Once the initial setup has taken place and the data-tier application is deployed, you'll only need to upgrade the data-tier application from then on (unless you delete the database and start again, which is handy for developers who might break things and need to start afresh).

:: Upgrading Data-tier Applications

There are a variety of mechanisms available to deploy a DACPAC file to a SQL Server 2008 R2 or SQL Azure database instance. Unfortunately the documentation for these mechanisms is fairly minimal, which makes it difficult to assess which mechanism to use for each deployment scenario you may have within your enterprise. Here are the ones I have come across, the way to invoke them and the scenario(s) I think they are applicable to:
  • Within Visual Studio
    • Invoke: Within Visual Studio right-click on the data-tier application project and select "Deploy". Note that Visual Studio figures out
    • Details: Depending on the current configuration (e.g. Debug/Release), the appropriately set connection string will be used when the project is deployed. Therefore the connection string can be changed to access a local SQL Server 2008 R2 database instance or a SQL Azure database instance.
    • Scenario: This deployment scenario is most applicable to developers working locally during a typical development workflow of repeatedly modifying code, compiling and deploying it and then debugging the result. I highly recommend that developers use a local instance of SQL Server 2008 R2 for two reasons:
      • No data transfer costs are accrued as ALL data transfer occurs on a developer's local machine and not back and forth between SQL Azure and their local machine.
      • DACPAC deployments are much faster and more reliable when performed locally, since deploying a DACPAC to SQL Azure can take minutes or even hours depending on the connection, and the database connection can be dropped mid-deployment (this won't cause an error but the deployment will need to be restarted).
    • This 20-minute video, by a program manager on the Microsoft SQL Server Developer Manageability team, explains a range of information about data-tier applications and their use within Visual Studio.
  • Within SQL Server Management Studio (SSMS)
    • Invoke: Within SSMS open the "Management" folder and then the "Data-tier Applications" node. Right-click the data-tier application you would like to upgrade and select "Upgrade Data-tier Application..."
    • Details: Just follow the on-screen instructions making sure to save the SQL Script that the DACPAC uses to apply the changes to the database instance. On some occasions the upgrade wizard has failed but because I saved the SQL Script I could execute it myself within SSMS.
    • Scenario: This is the easiest way to get started when deploying a DACPAC to SQL Azure since SSMS provides an easy-to-follow wizard. Note however that I have seen this entire process take 1-2 hours to upgrade a database (40 tables + 50 stored procedures). Being able to automate this step is very helpful, as the next section on PowerShell scripting will explain.
  • Using a DAC Framework enabled PowerShell Script
    • Invoke: You'll need to create your own PowerShell script to manage the upgrade process of one or more of your data-tier applications residing on SQL Server 2008 R2 or within SQL Azure. Note that if you are unfamiliar with PowerShell and the DAC Framework, creating a robust and reliable script will take significant development effort.
    • Details: I created my own parameterized PowerShell script that accepts a DACPAC file and some credentials to a SQL Azure database, subsequently connects to it and then upgrades the existing data-tier application. Two sources of information that I used when creating my PowerShell script were the following:
    • Scenario: This is the most appropriate method for building an automated process for managing data-tier applications that can be used by a continuous integration (CI) system.  
    • Note that side-by-side upgrades are the older way to upgrade a data-tier application and were used with the DAC Framework v1.0. A newer process, the in-place upgrade, is supported by the DAC Framework v1.1. I decided to use the in-place upgrade since it doesn't create copies of the "pre-upgrade" database. To perform an in-place upgrade using PowerShell, make sure to use the DacStore.IncrementalUpgrade method and not the DacStore.Upgrade method, which is now obsolete. Also note that the DacUpgradeOptions passed into the DacStore.IncrementalUpgrade method control things like whether to ignore data loss, ignore drift, skip policy validation and/or roll back on failure. Make sure to select these options properly for each environment that you're upgrading (e.g. you may not care about drift on your local or CI databases, but in your production database, drift should be a cause for concern).
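
The point about choosing upgrade options per environment can be made concrete with a small lookup table. The following is a language-neutral sketch in Python, not the PowerShell script described above. The environment names, the field names and every True/False value are assumptions chosen for illustration. The fields mirror the four behaviours listed above and are not the actual DacUpgradeOptions property names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpgradeOptions:
    """Illustrative stand-in for the four behaviours described in the text."""
    ignore_data_loss: bool
    ignore_drift: bool
    skip_policy_validation: bool
    rollback_on_failure: bool


# Hypothetical environments and settings: throwaway databases tolerate
# drift and data loss, while later environments are stricter.
OPTIONS_BY_ENVIRONMENT = {
    "local": UpgradeOptions(ignore_data_loss=True, ignore_drift=True,
                            skip_policy_validation=True, rollback_on_failure=True),
    "ci": UpgradeOptions(ignore_data_loss=True, ignore_drift=True,
                         skip_policy_validation=True, rollback_on_failure=True),
    "staging": UpgradeOptions(ignore_data_loss=False, ignore_drift=False,
                              skip_policy_validation=False, rollback_on_failure=True),
    "production": UpgradeOptions(ignore_data_loss=False, ignore_drift=False,
                                 skip_policy_validation=False, rollback_on_failure=True),
}


def options_for(environment: str) -> UpgradeOptions:
    """Look up the options for an environment, failing loudly on unknown names."""
    try:
        return OPTIONS_BY_ENVIRONMENT[environment.lower()]
    except KeyError:
        raise ValueError(f"no upgrade options defined for {environment!r}") from None
```

Failing on an unknown environment name, rather than falling back to a default, is a deliberate choice in this sketch: a typo should not silently upgrade a production database with the lenient CI settings.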

:: Data-tier Application Upgrade Architecture across Environments

The following diagram shows how I used a data-tier application's DACPAC output to upgrade various SQL Server 2008 R2 and SQL Azure database instances across various environments (from local development through to production).


Note that only the changes to the SQL scripts within the data-tier application in Visual Studio are committed (not the DACPAC output that is built locally). The DACPAC is rebuilt on CI using NAnt and MSBuild and is used to immediately upgrade the CI's SQL Azure database instance. The same DACPAC is then archived and used in subsequent deployments for Staging, UAT and Production. The reason behind this build architecture of "build once and deploy to all subsequent environments" is outlined in the book called Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation.

Also note that the "Upgrade" step in both the CI and Staging/UAT/Production environments is performed via the parameterized PowerShell script mentioned in a section above.
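
One way to protect the "build once and deploy to all subsequent environments" rule is to record a checksum when CI archives the DACPAC and to check it before each later upgrade. The sketch below, in Python, assumes that approach. It is not part of the build process described above, and the function names are hypothetical.

```python
import hashlib


def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archived_dacpac(path: str, expected_sha256: str) -> None:
    """Raise ValueError if the archived DACPAC differs from the one CI built."""
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"{path} does not match the archived build: "
            f"expected {expected_sha256.lower()}, got {actual}"
        )
```

CI would call sha256_of once after the build and store the result alongside the archived file. The Staging, UAT and Production deployments would then call verify_archived_dacpac before running the upgrade script.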