Easy Sitecore 9 Azure PaaS no-downtime deployments

Disclaimer: It’s not really easy, just easier than alternatives.

This is also known as Blue/Green deployments

Keeping your site up during deployments is a very common ask for any website. Sitecore comes with its own set of challenges; however, with a few simple tips and tricks in Azure you can get a very robust solution for minimal effort.

Attributes of this approach

  • There is downtime for authors in the CM environment.
  • There is no content authoring freeze. However, while a deployment is in progress there is a publishing freeze, which can be mitigated by an optional search index swap covered later.
  • Azure assets are created on demand so there is no offline environment hanging out doing nothing but costing money.
  • Orchestrated by PowerShell

The Issues

    Primary Issue

    A deploy is a two-step process. You need to publish new templates, renderings, or other developer-owned Sitecore items to the content delivery database, and you need to deploy the code that knows how to work with those new templates. No matter how hard you try, you can’t do these two things at exactly the same time, which leaves a window in which end users may see server errors.

    Secondary Issues

    There are two optional secondary issues, discussed later: the search index and XConnect. They are secondary because they lead to some potentially annoying results, but not likely a server error.

    Solving these problems

    I’m going to focus on solving the primary problem. Note that the diagram below includes steps for search index replication; to keep this post simple, I’ll cover blue/green without search index handling.

    [Figure: Sitecore 9 blue-green model]

    To accomplish many of these tasks we’ll rely heavily on Kudu, which is essentially a REST API suite for Azure App Services.
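
To make the Kudu idea concrete, here is a minimal sketch of how a request to its VFS (file) endpoint is put together: a Basic authorization header built from the publishing credentials, plus a URL under the app's scm host. The helper name New-KuduRequestInfo, the host name and the credentials are made up for illustration; the utility functions later in this post do the same thing using the real publishing credentials.

```powershell
# Hypothetical helper (not part of the original scripts): builds the Basic
# authorization header and the Kudu VFS URL for a file under site/wwwroot.
function New-KuduRequestInfo {
    param(
        [Parameter(Mandatory = $true)][string]$ScmUri,       # e.g. https://myapp.scm.azurewebsites.net
        [Parameter(Mandatory = $true)][string]$UserName,     # publishing user name
        [Parameter(Mandatory = $true)][string]$Password,     # publishing password
        [Parameter(Mandatory = $true)][string]$RelativePath  # path below site/wwwroot
    )
    $pair  = "{0}:{1}" -f $UserName, $Password
    $token = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes($pair))
    return @{
        Headers = @{ "Authorization" = "Basic $token"; "If-Match" = "*" }
        Url     = "{0}/api/vfs/site/wwwroot/{1}" -f $ScmUri.TrimEnd('/'), $RelativePath.TrimStart('/')
    }
}

# Placeholder values only; nothing is sent over the network here.
$info = New-KuduRequestInfo -ScmUri "https://myapp.scm.azurewebsites.net" `
    -UserName '$myapp' -Password 'not-a-real-password' `
    -RelativePath "App_Config/ConnectionStrings.config"

# With real credentials the file could then be fetched with:
# Invoke-RestMethod -Uri $info.Url -Headers $info.Headers -Method Get
```

Swapping /api/vfs/ for /api/zip/ in the URL addresses a whole folder as a zip archive, which is how the slot copy later in this post moves App_Config and App_Data around.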

    Process outline

    How will this process affect specific groups?

    I’m an author

    1. There will be brief downtime in the content management environment
    2. Content management will come back up with the new code and templates
    3. Content editing is allowed
    4. Publishing will send changes to the staging slot content delivery URL (NOTE: if you’re not duplicating a search index, this could impact your end users if your components are sourced by the search index)
    5. Once blue/green completes the changes that you published will be end user facing
    6. Business continues as usual

    I’m a dev-ops professional

    1. Authors have been warned about a brief downtime in CM
    2. Blue/green process is kicked off
    3. CM and CD deployments are done
    4. Alert testers and wait for testing to complete
    5. On a successful test, swap the staging slots to production. On failure, wait for a hotfix and deploy to the environments again. On catastrophic failure, initiate a rollback
    6. On success, run the finalize step to clean up unused offline environments
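
The dev-ops flow above can be sketched as a small driver that runs named steps in order and calls a failure handler the moment one of them throws. This is only an illustration under my own assumptions: Invoke-BlueGreenPipeline is a hypothetical function, and the steps are stubs standing in for the real step scripts shown later in this post.

```powershell
# Hypothetical orchestration sketch: each step is a script block, so the real
# step scripts (or stubs, as here) can be plugged in.
function Invoke-BlueGreenPipeline {
    param(
        [Parameter(Mandatory = $true)][System.Collections.Specialized.OrderedDictionary]$Steps,
        [Parameter(Mandatory = $true)][scriptblock]$OnFailure
    )
    $completed = @()
    foreach ($name in $Steps.Keys) {
        $step = $Steps[$name]
        try {
            Write-Host "Running step: $name"
            & $step | Out-Null
            $completed += $name
        }
        catch {
            # Stop at the first failing step and hand over to the failure handler.
            Write-Host "Step '$name' failed: $($_.Exception.Message)"
            & $OnFailure | Out-Null
            break
        }
    }
    return $completed
}

# Stubbed run: the third step fails, so the swap never happens.
$state = @{ RolledBack = $false }
$steps = [ordered]@{
    "Copy CM to staging slot" = { Write-Host "  (stub) copy CM" }
    "Copy CD databases"       = { Write-Host "  (stub) copy databases" }
    "Deploy and test"         = { throw "smoke test failed" }
    "Swap CD slots"           = { Write-Host "  (stub) swap" }
}
$done = @(Invoke-BlueGreenPipeline -Steps $steps -OnFailure { $state.RolledBack = $true })
```

In a real run the failure handler would decide between waiting for a hotfix and starting the rollback described near the end of this post.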

    I’m a tester

    1. Get alerted by dev-ops team that the deployment to the staging slot is complete
    2. Break it
    3. Alert development team an emergency hotfix is needed
    4. Wait for dev-ops team to report the hotfix has been deployed
    5. Test again, no breaking this time
    6. Report to dev-ops team all is well

    The Powershell

    NOTE: These PowerShell functions all require the PowerShell context to have an authenticated Azure connection to perform their tasks. I recommend using a service principal for this.

    Utility functions

    This set of utility functions is mostly for file I/O with the Azure App Services. Save them in a file called “Get-KuduUtility.ps1”.

    function Get-AzureRmWebAppPublishingCredentials($resourceGroupName, $webAppName, $slotName = $null){
    	if ([string]::IsNullOrWhiteSpace($slotName) -or $slotName.ToLower() -eq "production"){
    		$resourceType = "Microsoft.Web/sites/config"
    		$resourceName = "$webAppName/publishingcredentials"
    	}
    	else{
    		$resourceType = "Microsoft.Web/sites/slots/config"
    		$resourceName = "$webAppName/$slotName/publishingcredentials"
    	}
    	$publishingCredentials = Invoke-AzureRmResourceAction -ResourceGroupName $resourceGroupName -ResourceType $resourceType -ResourceName $resourceName -Action list -ApiVersion 2015-08-01 -Force
        	return $publishingCredentials
    }
    
    function Get-KuduApiAuthorisationHeaderValue($resourceGroupName, $webAppName, $slotName = $null){
        $publishingCredentials = Get-AzureRmWebAppPublishingCredentials $resourceGroupName $webAppName $slotName
        $ret = @{}
        $ret.header = ("Basic {0}" -f [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(("{0}:{1}" -f $publishingCredentials.Properties.PublishingUserName, $publishingCredentials.Properties.PublishingPassword))))
        $ret.url = $publishingCredentials.Properties.scmUri
        return $ret
    }
    
    function Get-FileFromWebApp($resourceGroupName, $webAppName, $slotName = [string]::Empty, $kuduPath){
        $KuduAuth = Get-KuduApiAuthorisationHeaderValue $resourceGroupName $webAppName $slotName
        $kuduApiAuthorisationToken = $KuduAuth.header
        $kuduApiUrl = $KuduAuth.url + "/api/vfs/site/wwwroot/$kuduPath"
    
        Write-Host " Downloading File from WebApp. Source: '$kuduApiUrl'." -ForegroundColor DarkGray
        $tmpPath = "$($env:TEMP)\$([guid]::NewGuid()).xml"
        $null = Invoke-RestMethod -Uri $kuduApiUrl `
                            -Headers @{"Authorization"=$kuduApiAuthorisationToken;"If-Match"="*"} `
                            -Method GET `
                            -ContentType "multipart/form-data" `
                            -OutFile $tmpPath
        $ret = Get-Content $tmpPath | Out-String
        Remove-Item $tmpPath -Force
        return $ret
    }
    
    function Write-FileToWebApp($resourceGroupName, $webAppName, $slotName = [string]::Empty, $fileContent, $kuduPath){
        $KuduAuth = Get-KuduApiAuthorisationHeaderValue $resourceGroupName $webAppName $slotName
        $kuduApiAuthorisationToken = $KuduAuth.header
        $kuduApiUrl = $KuduAuth.url + "/api/vfs/site/wwwroot/$kuduPath"
    
        Write-Host " Writing File to WebApp. Destination: '$kuduApiUrl'." -ForegroundColor DarkGray
    
        # PUT on the vfs endpoint creates or overwrites a single file
        Invoke-RestMethod -Uri $kuduApiUrl `
                            -Headers @{"Authorization"=$kuduApiAuthorisationToken;"If-Match"="*"} `
                            -Method Put `
                            -ContentType "multipart/form-data" `
                            -Body $fileContent
    }
    
    function Write-ZipToWebApp($resourceGroupName, $webAppName, $slotName = [string]::Empty, $zipFile, $kuduPath){
        $KuduAuth = Get-KuduApiAuthorisationHeaderValue $resourceGroupName $webAppName $slotName
        $kuduApiAuthorisationToken = $KuduAuth.header
        $kuduApiUrl = $KuduAuth.url + "/api/zip/site/wwwroot/$kuduPath"
    
        Write-Host " Writing Zip to WebApp. Destination: '$kuduApiUrl'." -ForegroundColor DarkGray
    
        # PUT on the zip endpoint uploads the archive and expands it into the target folder
        Invoke-RestMethod -Uri $kuduApiUrl `
                            -Headers @{"Authorization"=$kuduApiAuthorisationToken;"If-Match"="*"} `
                            -Method Put `
                            -ContentType "multipart/form-data" `
                            -InFile $zipFile
    }
    
    function Copy-AppServiceToStaging($resourceGroupName, $webAppName){
        $KuduAuth = Get-KuduApiAuthorisationHeaderValue $resourceGroupName $webAppName
        $kuduApiAuthorisationToken = $KuduAuth.header
        $KuduStagingAuth = Get-KuduApiAuthorisationHeaderValue $resourceGroupName $webAppName "Staging"
        $kuduStagingApiAuthorisationToken = $KuduStagingAuth.header
        # NOTE: you must copy all paths of the webroot that aren't involved in your deployment.
        # For example, if you also wanted to copy the Sitecore folder you could change this to:
        # @("App_Config", "App_Data", "Sitecore")
        @("App_Config", "App_Data") | ForEach-Object {
            $kuduConfigApiUrl = $KuduAuth.url + "/api/zip/site/wwwroot/$_/"
            $tmpPath = "$($env:TEMP)\$([guid]::NewGuid()).zip"
            try{
                $WebClient = New-Object System.Net.WebClient
                $WebClient.Headers.Add("Authorization", $kuduApiAuthorisationToken)
                $WebClient.Headers.Add("ContentType", "multipart/form-data")
    
                $WebClient.DownloadFile($kuduConfigApiUrl, $tmpPath)
    
                $kuduConfigApiUrl = $KuduStagingAuth.url + "/api/zip/site/wwwroot/$_/"
                $kuduApiFolderUrl = $KuduStagingAuth.url + "/api/vfs/site/wwwroot/$_/"
                Invoke-RestMethod -Uri $kuduApiFolderUrl `
                    -Headers @{"Authorization"=$kuduStagingApiAuthorisationToken;"If-Match"="*"} `
                    -Method PUT `
                    -ContentType "multipart/form-data"
                #need a sleep due to a race condition if this folder is utilized too quickly after creating
                Start-Sleep -Seconds 2
                Invoke-RestMethod -Uri $kuduConfigApiUrl `
                    -Headers @{"Authorization"=$kuduStagingApiAuthorisationToken;"If-Match"="*"} `
                    -Method PUT `
                    -ContentType "multipart/form-data" `
                    -InFile $tmpPath
            }finally{
                if (Test-Path $tmpPath){
                    Remove-Item $tmpPath
                }
            }
        }
    }
    function Get-DatabaseNames{
    	param(
    		[Parameter(Mandatory = $true)]
    		[string]$ResourceGroupName,
    		[Parameter(Mandatory = $true)]
    		[string]$AppServiceName,
    		[Parameter(Mandatory = $true)]
    		[string]$DatabaseNameRoot,
    		[string]$SlotName = [string]::Empty
    
    	)
    	$contents = (Get-FileFromWebApp -resourceGroupName $ResourceGroupName -webAppName $AppServiceName -slotName $SlotName -kuduPath "App_Config/ConnectionStrings.config") | Out-String
    	if ($contents.Contains("$DatabaseNameRoot-2")){
    		$ret = @{
    			InactiveDatabase = $DatabaseNameRoot
    			ActiveDatabase = $DatabaseNameRoot + '-2'
    		}
    	}elseif ($contents.Contains("$DatabaseNameRoot")){
    		$ret = @{
    			InactiveDatabase = $DatabaseNameRoot + '-2'
    			ActiveDatabase = $DatabaseNameRoot
    		}
    	}else{
            throw "unable to find $DatabaseNameRoot OR $DatabaseNameRoot-2"
        }
    	return $ret
    }
    
    

    Step 1

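    Copy the CM Production slot to a Staging slot. This copy keeps the connections to the old databases, which is what you swap back to if you need to roll back.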

    param(
        [Parameter(Mandatory=$true)]
        [string]$ResourceGroupName,
        [Parameter(Mandatory=$true)]
        [string]$AppServiceName
    )
    . "$PSScriptRoot\Get-KuduUtility.ps1"
    $existingSlot = Get-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -Slot "Staging" -ErrorAction SilentlyContinue
    if ($null -ne $existingSlot){
        write-host "Removing existing Staging slot"
        Remove-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -Slot "Staging" -Force
        Start-Sleep -s 10
    }
    $slot = Get-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -Slot "Production"
    New-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -Slot "Staging" -AppServicePlan $slot.ServerFarmId
    
    Copy-AppServiceToStaging -ResourceGroupName $ResourceGroupName -WebAppName $AppServiceName
    

    Step 2

    Make copies of all of your content delivery databases and wire your CM environment to the new databases

    param(
        [Parameter(Mandatory=$true)]
        [string]$ResourceGroupName,
        [Parameter(Mandatory=$true)]
        [string]$AppServiceName,
        [Parameter(Mandatory=$true)]
        [string]$CDAppServiceName,
        [string]$SlotName = [string]::Empty,
        [Parameter(Mandatory=$true)]
        [string]$DatabaseNameRoot,
        [Parameter(Mandatory=$true)]
        [string]$SqlServerName
    )
    
    . "$PSScriptRoot\Get-KuduUtility.ps1"
    $contents = (Get-FileFromWebApp -resourceGroupName $ResourceGroupName -webAppName $AppServiceName -slotName $SlotName -kuduPath "App_Config/ConnectionStrings.config") | Out-String
    $db = Get-DatabaseNames -ResourceGroupName $ResourceGroupName -AppServiceName $CDAppServiceName -DatabaseNameRoot $DatabaseNameRoot -SlotName $SlotName
    $contents = $contents.Replace("Catalog=$($db.ActiveDatabase);", "Catalog=$($db.InactiveDatabase);")
    
    
    $tst = Get-AzureRmSqlDatabase -DatabaseName $db.InactiveDatabase -ServerName $SqlServerName -ResourceGroupName $ResourceGroupName -ErrorAction SilentlyContinue
    if ($null -ne $tst){
        throw "Unable to copy database when the CM environment is referencing $($db.ActiveDatabase) and $($db.InactiveDatabase) already exists. Make sure that both the tenant CD AND the CM environment are using the same database before this operation, delete the unused database and try again."
    }
    $tst = Get-AzureRmSqlDatabase -DatabaseName $db.ActiveDatabase -ServerName $SqlServerName -ResourceGroupName $ResourceGroupName -ErrorAction SilentlyContinue
    write-host "Copying database $($db.ActiveDatabase) to $($db.InactiveDatabase)"
    $parameters = @{
        ResourceGroupName = $ResourceGroupName
        DatabaseName = $db.ActiveDatabase
        ServerName = $SqlServerName
        CopyResourceGroupName = $ResourceGroupName
        CopyServerName = $SqlServerName
        CopyDatabaseName = $db.InactiveDatabase
    }
    if (-not [string]::IsNullOrWhitespace($tst.ElasticPoolName)){
        $parameters["ElasticPoolName"] = $tst.ElasticPoolName
    }
    New-AzureRmSqlDatabaseCopy @parameters
    Write-FileToWebApp -resourceGroupName $ResourceGroupName -webAppName $AppServiceName -fileContent $contents -slotName $SlotName -kuduPath "App_Config/ConnectionStrings.config"
    
    

    NOTE: This accesses the CD app service to determine the offline database. It toggles between {name} and {name}-2
    NOTE: DatabaseNameRoot refers to the database name without the -2 on it.
    NOTE: This assumes the assets all share a resource group, if that’s not true, add some more parameters
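
To see the {name} / {name}-2 toggle without touching Azure, here is a small offline sketch of the same logic. Resolve-DatabaseToggle is a hypothetical stand-in for Get-DatabaseNames that takes the connection string text as a parameter instead of downloading it through Kudu, and the connection string itself is made up.

```powershell
# Hypothetical offline version of the toggle in Get-DatabaseNames: it works on
# connection-string text passed in, instead of downloading it through Kudu.
function Resolve-DatabaseToggle {
    param(
        [Parameter(Mandatory = $true)][string]$ConnectionStrings,
        [Parameter(Mandatory = $true)][string]$DatabaseNameRoot
    )
    # Check the "-2" name first: the root name is a prefix of it.
    if ($ConnectionStrings.Contains("${DatabaseNameRoot}-2")) {
        return @{ ActiveDatabase = "${DatabaseNameRoot}-2"; InactiveDatabase = $DatabaseNameRoot }
    }
    elseif ($ConnectionStrings.Contains($DatabaseNameRoot)) {
        return @{ ActiveDatabase = $DatabaseNameRoot; InactiveDatabase = "${DatabaseNameRoot}-2" }
    }
    throw "unable to find $DatabaseNameRoot OR ${DatabaseNameRoot}-2"
}

# Made-up connection string for illustration only.
$before = 'Data Source=example.database.windows.net;Initial Catalog=mysite-web;User Id=someone;'
$db = Resolve-DatabaseToggle -ConnectionStrings $before -DatabaseNameRoot "mysite-web"

# The same replacement Step 2 applies before writing the file back to CM.
$after = $before.Replace("Catalog=$($db.ActiveDatabase);", "Catalog=$($db.InactiveDatabase);")
$next  = Resolve-DatabaseToggle -ConnectionStrings $after -DatabaseNameRoot "mysite-web"
```

After the replacement the roles flip: the next deployment would copy mysite-web-2 back to mysite-web, which is why the old database has to be removed during clean up before the next run.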

    Step 3

    Copy all content delivery app services to a Staging slot.

    param(
        [Parameter(Mandatory=$true)]
        [string]$ResourceGroupName,
        [Parameter(Mandatory=$true)]
        [string]$AppServiceName
    )
    . "$PSScriptRoot\Get-KuduUtility.ps1"
    $existingSlot = Get-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -Slot "Staging" -ErrorAction SilentlyContinue
    if ($null -ne $existingSlot){
        write-host "Removing existing Staging slot"
        Remove-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -Slot "Staging" -Force
        Start-Sleep -s 10
    }
    $slot = Get-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -Slot "Production"
    New-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -Slot "Staging" -AppServicePlan $slot.ServerFarmId
    
    Copy-AppServiceToStaging -ResourceGroupName $ResourceGroupName -WebAppName $AppServiceName
    

    NOTE: This assumes that your deployment process deploys all of Sitecore except things in App_Data and environment specific App_Config config files like ConnectionStrings.config

    Step 4

    Execute a deploy as you normally would while targeting the production slot for CM and the staging slot for CD.

    Step 5

    Test your changes on CM and the CD Staging slots

    NOTE: these scripts assume that your deployment process handles all non-dynamically generated assets. So things like the Sitecore folder would be included in your deployment process whereas things like your license.xml or ConnectionStrings.config would not be. These things would be handled by the app service copy in Step 3
    NOTE: if you need to hotfix, you can repeat step 4
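
Before handing over to the testers, a quick automated smoke test of the CD staging slot can catch the obvious failures. The sketch below assumes the default App Service host naming ({app}-{slot}.azurewebsites.net) with no custom domain on the slot; Get-SlotBaseUrl and Test-SlotPages are hypothetical helpers and the app name and paths are placeholders.

```powershell
# Hypothetical helper: default host name of a slot, assuming no custom domain.
function Get-SlotBaseUrl {
    param(
        [Parameter(Mandatory = $true)][string]$AppServiceName,
        [string]$SlotName = "Staging"
    )
    if ([string]::IsNullOrWhiteSpace($SlotName) -or $SlotName -eq "production") {
        return "https://${AppServiceName}.azurewebsites.net".ToLower()
    }
    return "https://${AppServiceName}-${SlotName}.azurewebsites.net".ToLower()
}

# Hypothetical helper: requests each path and returns the URLs that did not
# answer with HTTP 200 (request errors count as failures too).
function Test-SlotPages {
    param(
        [Parameter(Mandatory = $true)][string]$BaseUrl,
        [string[]]$Paths = @("/")
    )
    $failures = @()
    foreach ($path in $Paths) {
        $url = $BaseUrl.TrimEnd('/') + $path
        try {
            $response = Invoke-WebRequest -Uri $url -UseBasicParsing -TimeoutSec 60
            if ($response.StatusCode -ne 200) { $failures += $url }
        }
        catch {
            $failures += $url
        }
    }
    return $failures
}

$stagingUrl = Get-SlotBaseUrl -AppServiceName "mysite-cd" -SlotName "Staging"

# Against a real slot you could then run, for example:
# $failed = @(Test-SlotPages -BaseUrl $stagingUrl -Paths @("/", "/search"))
# if ($failed.Count -gt 0) { throw "Smoke test failed for: $($failed -join ', ')" }
```

This does not replace the testers, it only tells you early that a hotfix and a repeat of step 4 are needed.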

    Step 6

    Swap production slot and staging slots in your CD app services

    param(
        [Parameter(Mandatory=$true)]
        [string]$ResourceGroupName,
        [Parameter(Mandatory=$true)]
        [string]$AppServiceName,
        [string]$SlotName = "Staging"
    )
    
    Switch-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -DestinationSlotName "Production" -SourceSlotName $SlotName
    

    NOTE: At this time your change is live and there has been no downtime

    Step 7

    Clean up.
    It’s important to note that you can delay this step to provide a rapid rollback if needed. Before completing this step, swapping the slots again will give us a rollback in seconds.
    Remove the old content delivery databases.

    
    param(
        [Parameter(Mandatory=$true)]
        [string]$ResourceGroupName,
        [Parameter(Mandatory=$true)]
        [string]$AppServiceName,
        [Parameter(Mandatory=$true)]
        [string]$SqlServerName,
        [Parameter(Mandatory=$true)]
        [string]$DatabaseNameRoot,
        [switch]$DeleteActive = $false,
        [string]$SlotName = ""
    )
    . "$PSScriptRoot\Get-KuduUtility.ps1"
    
    $db = Get-DatabaseNames -ResourceGroupName $ResourceGroupName -AppServiceName $AppServiceName -DatabaseNameRoot $DatabaseNameRoot -SlotName $SlotName
    
    if ($DeleteActive){
        Remove-AzureRmSqlDatabase -ResourceGroupName $ResourceGroupName -ServerName $SqlServerName -DatabaseName $db.ActiveDatabase -Force
    }else{
        Remove-AzureRmSqlDatabase -ResourceGroupName $ResourceGroupName -ServerName $SqlServerName -DatabaseName $db.InactiveDatabase -Force
    }
    

    Remove the staging slots for each environment, CD app services and CM app service

    param(
        [string]$ResourceGroupName,
        [string]$AppServiceName,
        [string]$SlotName
    )
    
    Remove-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -Slot $SlotName -Force
    

    That’s it, you’re done and there was no downtime. Feels good, doesn’t it?

    What if we need to roll back?

    If you’ve gotten to step 7 and find that the code is flawed and can’t be hotfixed, or you need an emergency content change, this is how you roll back.

    Step 1

    Swap the CM staging slot to production (this will contain the database connections to the old database)

    param(
        [Parameter(Mandatory=$true)]
        [string]$ResourceGroupName,
        [Parameter(Mandatory=$true)]
        [string]$AppServiceName,
        [string]$SlotName = "Staging"
    )
    
    Switch-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -DestinationSlotName "Production" -SourceSlotName $SlotName
    
    

    Step 2

    Then delete the NEW databases you created in step 2

    param(
        [Parameter(Mandatory=$true)]
        [string]$ResourceGroupName,
        [Parameter(Mandatory=$true)]
        [string]$AppServiceName,
        [Parameter(Mandatory=$true)]
        [string]$SqlServerName,
        [Parameter(Mandatory=$true)]
        [string]$DatabaseNameRoot,
        [switch]$DeleteActive = $false,
        [string]$SlotName = [string]::Empty
    )
    . "$PSScriptRoot\Get-KuduUtility.ps1"
    
    $db = Get-DatabaseNames -ResourceGroupName $ResourceGroupName -AppServiceName $AppServiceName -DatabaseNameRoot $DatabaseNameRoot -SlotName $SlotName
    
    if ($DeleteActive){
        Remove-AzureRmSqlDatabase -ResourceGroupName $ResourceGroupName -ServerName $SqlServerName -DatabaseName $db.ActiveDatabase -Force
    }else{
        Remove-AzureRmSqlDatabase -ResourceGroupName $ResourceGroupName -ServerName $SqlServerName -DatabaseName $db.InactiveDatabase -Force
    }
    

    Step 3

    Then remove the staging slots

    
    param(
        [string]$ResourceGroupName,
        [string]$AppServiceName,
        [string]$SlotName
    )
    
    Remove-AzureRmWebAppSlot -ResourceGroupName $ResourceGroupName -Name $AppServiceName -Slot $SlotName -Force
    

    Step 4

    Finally you’d run either a TDS sync or Unicorn sync to get your developer owned assets back to pre-deployment state.

    Solving secondary issues

    Search index

    For the search index you’ll want to treat it the same way we treated the databases. At the start of the blue/green process we’ll create a new index as a clone of the production-facing one and rewire the CM environment and CD staging slots to use the new search index.

    You would do this if you use the search index to source content for end users, most commonly in a site search. The issue is that if you add pages to your offline web database, the search index picks them up and may return links that are 404s for end users; or, if the code tries to get the Sitecore item, you may end up with null reference exceptions.

    This is optional because with proper governance you can mitigate the risk. Things like content freezes or publishing freezes would work fine.

    With this safely implemented you could potentially lift any content author freezes, as long as the authors know that during a blue/green deployment their changes are published to the offline environment until the swap is completed.

    XConnect

    You would want to have two parallel XConnect environments: one that’s always customer facing and one that’s never customer facing. During a blue/green deployment, point your CM environment and the CD staging slots at the offline (never customer-facing) XConnect environment. Then, immediately before the swap, rewire the CM and CD staging slots to point to the online (customer-facing) XConnect environment.

    You would do this if you wanted to be certain that there was no testing data in XConnect.

    This is optional because most people wouldn’t mind a bit of testing data in their analytics data.
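
Both secondary fixes come down to rewiring a connection string on the CM environment and the CD staging slots. A minimal sketch of that rewiring, assuming ConnectionStrings.config has the usual connectionStrings/add layout: Set-ConnectionStringValue is a hypothetical helper, and the connection string name and host names below are placeholders, so check them against your own config.

```powershell
# Hypothetical helper: returns the config XML with one named connection
# string replaced. It works on text, so it can sit between
# Get-FileFromWebApp and Write-FileToWebApp.
function Set-ConnectionStringValue {
    param(
        [Parameter(Mandatory = $true)][string]$ConfigXml,
        [Parameter(Mandatory = $true)][string]$Name,
        [Parameter(Mandatory = $true)][string]$Value
    )
    $doc = New-Object System.Xml.XmlDocument
    $doc.PreserveWhitespace = $true
    $doc.LoadXml($ConfigXml)
    $node = $doc.SelectSingleNode("/connectionStrings/add[@name='$Name']")
    if ($null -eq $node) {
        throw "Connection string '$Name' not found"
    }
    $node.SetAttribute("connectionString", $Value)
    return $doc.OuterXml
}

# Made-up config: the name and URLs are assumptions for illustration.
$config = '<connectionStrings>' +
    '<add name="web" connectionString="Initial Catalog=mysite-web;" />' +
    '<add name="xconnect.collection" connectionString="https://xconnect-offline.example.net" />' +
    '</connectionStrings>'

# Immediately before the swap: point the staging slot at the online XConnect.
$updated = Set-ConnectionStringValue -ConfigXml $config `
    -Name "xconnect.collection" -Value "https://xconnect-online.example.net"
```

The same call with the search connection string name and the cloned index would cover the search index case.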