Unicorn Configuration Validation

Unicorn is a great platform for tracking and propagating changes between environments. There is an incredible amount of flexibility in the configurations, which is a great thing; however, it's really easy to mess up in inconspicuous ways.

The most common ways I've seen configurations get messed up are:

  • Dependencies are inaccurately defined
  • Paths tracked aren’t rooted in stock Sitecore items
  • Gaps in the tracked items exist (e.g. a tracked item whose parent is untracked but whose grandparent is tracked)
  • Typos

Unicorn is incredibly forgiving as far as handling these errors goes. There is an internal process that attempts to apply dependencies implicitly, and it will most likely do the job for you. Often these errors will not show up until someone tries to apply the configurations to an empty Sitecore install. This is definitely not a situation you want to find yourself in while rebuilding a production environment in an emergency. It's in your best interest to make sure that you're building your tracked tree appropriately before it comes to that.

What can be done?

I propose a scripting solution. Sounds complicated? Yes, it certainly was a challenging problem. When looking at Unicorn configurations, there's a surprising amount of complexity that you need to take into account:

  • Abstract configurations
  • Includes
  • Excludes
  • Excepts
  • Wildcards

It doesn’t take long for the configurations to become incredibly complex to manage.

Enter PowerShell

I've created a two-step process to handle analyzing and reporting.

Step 1, building a model

The first step is to construct a trie-style data structure to simulate the desired state of a Unicorn configuration.
The script simply requires the root path which contains all your configuration files; it'll automatically find all the Unicorn configurations wherever they are.
$model = Get-UnicornModel.ps1 -Path "C:\Code\MyProject\ProjectRoot"

The gist for the model-building script can be found here

Below is a PowerShell object representation of /sitecore/layout/layouts.
From here you can see some basic data about the desired state of your Sitecore environment.

  • Config – A list of configurations that track this item (it needs to be a list because multiple Unicorn configs CAN track a single item)
  • Node – Name of the node
  • Path – Path of the node
  • Parent – Object representation of parent node in the same format as this one
  • Database – Database the item is located in
  • Next – List of objects that represent the tracked children of this node

It's important to note that this is in no way parsed from the .yml files, but rather implied by parsing the configurations. Think of it as the desired state of your Unicorn setup.
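
To make that shape concrete, here's a minimal sketch of what a single node might look like as a PowerShell object (the property names come from the list above; the values are purely illustrative):

# Hypothetical node for /sitecore/layout/layouts -- values are made up for illustration
$node = [PSCustomObject]@{
    Config   = @('Foundation.Serialization')   # every configuration tracking this item
    Node     = 'layouts'
    Path     = '/sitecore/layout/layouts'
    Parent   = $layoutParent                   # same shape, one level up the trie
    Database = 'master'
    Next     = @()                             # tracked children, each in this same shape
}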

Step 2, analyzing the model

Once you have all this information at hand, it's relatively easy to parse the model looking for oddities. As it sits, the script currently checks for three things (listed below).
Invoke-UnicornAssessment.ps1 -Trie $model -ErrorOnOrphans
Notice there are two switches that control the behavior of this script:

  • -ErrorOnOrphans : The script throws an error when orphans are detected
  • -ErrorOnDependencyMismatch : The script throws an error when dependencies aren't properly explicitly set

These switches let you control, from your DevOps pipeline, whether a finding is treated as a warning or as a full-stop error that halts progress.
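
For example, a minimal CI step might chain the two scripts like this (the path and the switch choices are illustrative):

# Hypothetical pipeline step: build the model, then fail the build on orphans or missing dependencies
$model = Get-UnicornModel.ps1 -Path "$env:BUILD_SOURCESDIRECTORY"
Invoke-UnicornAssessment.ps1 -Trie $model -ErrorOnOrphans -ErrorOnDependencyMismatch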

The gist for the analysis script can be found here

  • Does a configuration have an unspecified dependency on another configuration?
  • Are there orphans? (item tracked, parent NOT tracked, grandparent tracked)
  • Is a tracked root not part of a stock Sitecore install?

Using this model, the analysis can easily be expanded to track anything that you desire. For example, maybe you want to make sure that your developers aren't tracking items under the home node; a check like that is easy to implement on top of this model, as sketched below.
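
Here's a rough sketch of what such a custom rule could look like, assuming the node shape shown earlier (the function name and home-path pattern are hypothetical):

# Hypothetical custom rule: warn about any tracked node under a site's home item
function Test-TrackedUnderHome {
    param($Node)
    if ($Node.Path -like '/sitecore/content/*/home/*') {
        Write-Warning "Tracked under home: $($Node.Path) (configs: $($Node.Config -join ', '))"
    }
    foreach ($child in $Node.Next) { Test-TrackedUnderHome -Node $child }
}
Test-TrackedUnderHome -Node $model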

The end result

Now you can have confidence that your Unicorn configurations won't have chronic problems, as well as being able to finely control how Unicorn configurations are managed. Anyone, no matter their experience level with Unicorn, can make mistakes setting this stuff up; now you can rest a little more easily :).

Sitecore Package Autoloader

 

Clients I've worked with always ask for a method of associating content with a particular release. Generally speaking, we'd go with a Unicorn deploy-once configuration, Sitecore Sidekick (which, obviously, I'm quite partial to), or a manual package install as a post-build step. All of these methods have their strengths and drawbacks. The main drawback shared between them is controlling when items should be added. For example, if you want content to be added once and then never added again, it's a hassle for Unicorn because you'll need to remove the configuration tracking those items, and it's a hassle for Sidekick for the same reason: you'll need to remove the scripting kicking off the content transfer for subsequent releases. Additionally, both of these methods involve quite a bit of configuration scripting in both your solution and your build/release pipelines.

Automating packages


My proposal to fix this comes in the form of automating package installation, with a descriptor for when the package should be installed. It's based on a few techniques I've used in the past that have proven pretty effective at making this process as simple and seamless as possible.

This is facilitated through a NuGet package found here.

  1. No configuration needed (beyond the base configuration needed for the module).
  2. No additions to your build/release pipeline.
  3. Full control over when/if a package is installed.
  4. Full control over dependencies, if you have packages that require other packages to be installed first.
  5. Full control over the type of package install method being used.

Using packages from an embedded resource

This method minimizes the effort needed to facilitate an automatic package deployment. Note that it incurs a memory cost based on the size of the package, so be careful of your package sizes when using this method.

  1. The package is embedded in a DLL file (see the snippet after this list).
    1. This is nice because every build/release process ever devised handles DLLs easily
  2. A descriptor (a C# POCO: PackageAutoloaderDescriptor) controls when a package should be applied
  3. The package is installed as part of the initialize pipeline, so you're guaranteed the package content is installed before Sitecore is usable
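
Embedding the package itself is plain MSBuild; something like this in your project file should do it (the file name is taken from the demo below). Note that the manifest resource name, and therefore PackageNamespace, is the assembly's default namespace plus the file name:

<ItemGroup>
  <EmbeddedResource Include="demo.zip" />
</ItemGroup>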

A simple example

	public class DemoDescriptor : PackageAutoloaderDescriptor
	{
		// Manifest resource name of the embedded package:
		// the assembly's default namespace plus the file name.
		public override string PackageNamespace => "PackageAutoloaderDemo.demo.zip";

		// Requirements used to decide whether the package should be applied.
		public override List<DescriptorItemRequirements> Requirements => new List<DescriptorItemRequirements>()
		{
			new DescriptorItemRequirements()
			{
				Database = "master",
				ItemId = new ID("{76036F5E-CBCE-46D1-AF0A-4143F9B557AA}")
			}
		};
	}

Using packages from the filesystem

Thanks to Robin Hermanussen for the comment suggesting to add this feature.

This allows automatically installing packages in the same way as above, except instead of referencing a namespace for an embedded resource package, you can have it install a package from a file path. This saves on memory consumption and allows us to install much larger files without worry. Note that this method requires a build/release method of delivering the Sitecore package to your server.

  1. The package is delivered to the server
    1. This can be located anywhere, not simply the App_Data/packages folder that standard Sitecore uses
  2. A descriptor (a C# POCO: PackageFileLoaderDescriptor) controls when a package should be applied
  3. The package is installed as part of the initialize pipeline, so you're guaranteed the package content is installed before Sitecore is usable

A simple example

	public class DemoDescriptor2 : PackageFileLoaderDescriptor
	{
		public override IItemInstallerEvents ItemInstallerEvents => 
			new DefaultItemInstallerEvents(new BehaviourOptions(InstallMode.Overwrite, MergeMode.Undefined));

		public override List<DescriptorItemRequirements> Requirements => new List<DescriptorItemRequirements>()
		{
			new DescriptorItemRequirements()
			{
				Database = "master",
				ItemId = new ID("{190B1C84-F1BE-47ED-AA41-F42193D9C8FC}")
			}
		};

		public override string RelativeFilePath => "/PackageAutoloader/demo2.zip";
	}

Usage instructions

You can read the documentation on setting up Package Autoloader here.

SXA Advanced Dictionary

Download the Helix foundation project HERE

How do you handle basic content snippets?

People generally have strong opinions on where simple phrases or single words should be stored in order to properly localize them. These opinions normally fall into two camps.

Store simple content in standard values

One camp stores simple phrases or single words in the standard values of templates. This allows for more flexibility on a case-by-case basis, but makes it hard to change them wholesale.

Using stock Dictionaries

The second camp uses the dictionary, but that comes with its own problems, particularly for keeping Helix pure and having a component own its own Sitecore items. Additionally, in SXA you need to worry about utilizing components across different sites and tenants that own completely different dictionary locations.

Neither way is very good

Both of these options come with pretty serious problems that impose significant tech-debt-style nastiness on content authors in terms of flexibility.

Enter the AutoDictionary


  • Automatically creates your dictionary items if they don't exist.
  • Allows authors to optionally edit the dictionary items from the Experience Editor.
  • SXA site component sharing is automatically handled.

Traditionally a dictionary key will be pathed using periods, like so:

Carousel.Labels.Next

This would look for the dictionary definition with that key. Traditionally it would be located at the path:

Dictionary/Carousel/Labels/Next

Using this information, we know where the dictionary definition SHOULD be.
With the addition of a default text value, we can create these dictionary items automatically where a traditional dictionary would output nothing.

<span class="btn">@Html.AutoTranslate("Carousel.Labels.Next", "Next")</span>

The variant below would additionally be Experience Editor authorable:

<span class="btn">@Html.AutoTranslate("Carousel.Labels.Next", "Next", true)</span>
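
For a sense of the mechanics, here's a minimal sketch of how such a helper could work. This is hypothetical and heavily simplified (the real foundation project also resolves the correct SXA site/tenant dictionary and wires up Experience Editor support), and it assumes the stock dictionary root plus the stock Folder and Dictionary entry templates:

using System.Linq;
using System.Web.Mvc;
using Sitecore.Data;
using Sitecore.Data.Items;
using Sitecore.SecurityModel;

public static class AutoTranslateExtensions
{
	// Stock "Dictionary entry" and common "Folder" templates; a real
	// implementation would likely use a dedicated dictionary folder template.
	private static readonly TemplateID EntryTemplate =
		new TemplateID(new ID("{6D1CD897-1936-4A3A-A511-289A94C2A7B1}"));
	private static readonly TemplateID FolderTemplate =
		new TemplateID(new ID("{A87A00B1-E6DB-45AB-8B54-636FEC3B5523}"));

	public static string AutoTranslate(this HtmlHelper helper, string key, string defaultText)
	{
		Database db = Sitecore.Context.Database;
		// Assumed dictionary root; an SXA site would resolve its own instead.
		Item node = db.GetItem("/sitecore/system/Dictionary");
		string[] segments = key.Split('.');

		using (new SecurityDisabler())
		{
			// Walk the dotted key, creating folders (and finally the entry) as
			// needed: Carousel.Labels.Next -> Dictionary/Carousel/Labels/Next
			for (int i = 0; i < segments.Length; i++)
			{
				bool isLeaf = i == segments.Length - 1;
				Item child = node.Children.FirstOrDefault(c => c.Name == segments[i]);
				if (child == null)
				{
					child = node.Add(segments[i], isLeaf ? EntryTemplate : FolderTemplate);
					if (isLeaf)
					{
						child.Editing.BeginEdit();
						child["Key"] = key;
						child["Phrase"] = defaultText;
						child.Editing.EndEdit();
					}
				}
				node = child;
			}
		}

		string phrase = node["Phrase"];
		return string.IsNullOrEmpty(phrase) ? defaultText : phrase;
	}
}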

Sitecore Analytics Errors

ERROR [Experience Analytics]: System.Net.WebException: The remote name could not be resolved: 'reportingserviceurl'
   at System.Net.HttpWebRequest.GetRequestStream(TransportContext& context)
   at System.Net.HttpWebRequest.GetRequestStream()
   at Sitecore.Xdb.Reporting.Datasources.Remote.RemoteReportDataSourceProxy.GetData(ReportDataQuery query)
   at Sitecore.Xdb.Reporting.ReportDataProvider.ExecuteQueryWithCache(ReportDataQuery query, ReportDataSource dataSource, CachingPolicy cachingPolicy)
   at Sitecore.Xdb.Reporting.ReportDataProvider.GetData(String dataSourceName, ReportDataQuery query, CachingPolicy cachingPolicy)
   at Sitecore.ExperienceAnalytics.Core.Repositories.SiteRemoteReader.GetEntities(String sqlQuery)
   at Sitecore.ExperienceAnalytics.Core.Repositories.SiteRemoteReader.GetAll(NameValueCollection readingPreferences)
   at Sitecore.ExperienceAnalytics.Core.Repositories.CachedReaderDecorator`2.GetAll(NameValueCollection readingPreferences)
   at Sitecore.ExperienceAnalytics.Core.Repositories.SiteFilter.FilterReaderDecorator`2.GetAll(NameValueCollection readingPreferences)
   at Sitecore.ExperienceAnalytics.Client.RenderingHelper.GetSiteComboBoxItems()

If you're getting this error message, it's likely that your configurations are missing the URL to the reporting service.

On the CM server modify the configuration file at:
\wwwroot\App_Config\Sitecore\Azure\Sitecore.Xdb.Remote.Client.CM.config

Notice that there are two spots for URLs. If those locations contain dummy placeholder URLs, then something went awry with the original setup. Replace the placeholder URLs with your reporting (rep) and processing (prc) service URLs.

Azure Search Missing Target Dropdown

Missing options in the target dropdown of the general link's internal link form? The options are sourced from the search index, for some reason.

First, check whether a simple rebuild of your core index will do the trick.

If you've already tried that and still have no dice, you may be running into the same issue I did. After going to Sitecore Support, I got a few good pieces of information:

  1. In order to use Azure Search in Sitecore, you need to limit the fields indexed by Sitecore. This is typically done with <indexAllFields>false</indexAllFields>
  2. There are some fields required by SPEAK to make these forms work properly

The Solution

There are a few templates and fields that need to be available for this functionality to work properly. Make sure your solution has these standard configuration nodes set up.

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/" xmlns:search="http://www.sitecore.net/xmlconfig/search/">
  <sitecore role:require="ContentManagement or ContentDelivery" search:require="azure">
    <contentSearch>
      <indexConfigurations>
        <defaultCloudIndexConfiguration>
          <documentOptions>
            <include hint="list:AddIncludedTemplate">
              <StandardTemplate>{1930BBEB-7805-471A-A3BE-4858AC7CF696}</StandardTemplate>
              <CommonText>{76F63DF7-0235-4164-86AB-84B5EC48CB2A}</CommonText>
            </include>
            <include hint="list:AddIncludedField">
              <fieldId>{8CDC337E-A112-42FB-BBB4-4143751E123F}</fieldId>
              <hidden>{39C4902E-9960-4469-AEEF-E878E9C8218F}</hidden>
            </include>
          </documentOptions>
        </defaultCloudIndexConfiguration>
      </indexConfigurations>
    </contentSearch>
  </sitecore>
</configuration>

Azure Search replication

If you're trying to get a geo-replicated disaster recovery site set up and you're using Azure Search, you likely ran into the same issue that I did. Azure Search simply does not have the geo-replication tools or abilities that SQL does. This is made all the more frustrating by the fact that it's literally the only PaaS element in the Sitecore ecosystem that doesn't have this functionality. If you don't have the luxury of being able to re-index your data rapidly, you're stuck waiting for the data to index. In the context of Sitecore, this can take several hours on particularly large sites.

Additionally, this can be problematic when dealing with blue/green deployments, as customer-facing content could and should be included in your search index. That problem can be solved in a similar fashion; when added to a zero-downtime deployment method, it can give a more complete and safe deployment.

Using an Azure Search index as a source

Any data processing you needed to do to populate your primary index can be skipped if you simply utilize the main Azure Search index as the source for the second. I have this brokered through an Azure Function.

The code for the function is as follows:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;


namespace BendingSitecore.Function
{
    public static class AzureSearchReplicate
    {
        [FunctionName("AzureSearchReplicate")]
        public static async Task<IActionResult> Run(
            [HttpTrigger(AuthorizationLevel.Function, "get", "post", Route = null)] HttpRequest req,
            ILogger log)
        {
	        string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
            dynamic data = JsonConvert.DeserializeObject(requestBody);

			IEnumerable<string> indexes = Enumerable.Empty<string>();
			if (data.indexes != null){
				indexes = ((JArray)data.indexes).Select(x => (string)x);
			} 
	        try
	        {
		        Start(new SearchServiceClient(data.source.ToString(), new SearchCredentials(data.sourceKey.ToString()))
			        , new SearchServiceClient(data.destination.ToString(), new SearchCredentials(data.destinationKey.ToString())), false, log, indexes ?? Enumerable.Empty<string>());
			}
	        catch (Exception e)
	        {
				log.LogError(e, "An error occurred");
		        return new BadRequestObjectResult("Require a json object with source, destination and keys.");
			}

	        return  new OkObjectResult($"Azure Search replication is running, should be finished in about 10 minutes.");
        }
		public static void Start(SearchServiceClient source, SearchServiceClient destination, bool wait,
            ILogger log, IEnumerable<string> indexes)
		{
			List<Task> tasks = new List<Task>();
			ClearAllIndexes(destination, indexes);
			foreach (var index in source.Indexes.List().Indexes.Where(x => !indexes.Any() || indexes.Any(i => i.StartsWith(x.Name))))
			{
				tasks.Add(Task.Run(async () =>
				{
					try
					{
						destination.Indexes.Get(index.Name);
					}
					catch (Exception e)
					{
						log.LogInformation($"creating index {index.Name}", null);
						destination.Indexes.Create(index);
						await Task.Delay(5000);
					}
					await MigrateData(source.Indexes.GetClient(index.Name),
						destination.Indexes.GetClient(index.Name), log);
				}));
			}
			if (wait)
			{
				foreach (var task in tasks)
				{
					task.Wait();
				}
			}
		}

		public static void ClearAllIndexes(SearchServiceClient client, IEnumerable<string> indexes)
		{
			foreach (var index in client.Indexes.List().Indexes.Where(x => !indexes.Any() || indexes.Any(i => i.StartsWith(x.Name))))
			{
				client.Indexes.Delete(index.Name);
			}
		}

		public static async Task MigrateData(ISearchIndexClient source, ISearchIndexClient destination,
            ILogger log)
		{
			log.LogInformation($"Starting migration of data for {source.IndexName}", null);
			SearchContinuationToken token = null;
			var searchParameters = new SearchParameters { Top = int.MaxValue };
			int retryCount = 0;
			while (true)
			{
				DocumentSearchResult results;
				if (token == null)
				{
					results = await source.Documents.SearchAsync("*", searchParameters);
				}
				else
				{
					results = await source.Documents.ContinueSearchAsync(token);
				}
				try
				{
					await destination.Documents.IndexAsync(IndexBatch.New(GetAction(destination, results)));
				}
				catch (Exception e)
				{
					log.LogError(e, "Error occurred writing to destination", null);
					log.LogInformation("Retrying...", null);
					retryCount++;
					if (retryCount > 10){
						log.LogError("Giving up...", null);
						break;
					}
					continue;
				}
				if (results.ContinuationToken != null)
				{
					token = results.ContinuationToken;
					continue;
				}

				break;
			}
			log.LogInformation($"Finished migration data for {source.IndexName}", null);
		}

		public static IEnumerable<IndexAction> GetAction(ISearchIndexClient client, DocumentSearchResult documents)
		{
			return documents.Results.Select(doc => IndexAction.MergeOrUpload(doc.Document));
		}
    }
}

Additionally, make sure your Azure Function has these configuration settings:


        AzureWebJobDashboard                     = "DefaultEndpointsProtocol=https;AccountName=$storageName;AccountKey=$accountKey"
        AzureWebJobsStorage                      = "DefaultEndpointsProtocol=https;AccountName=$storageName;AccountKey=$accountKey"
        FUNCTIONS_EXTENSION_VERSION              = "~2"
        FUNCTIONS_WORKER_RUNTIME                 = "dotnet"
        WEBSITE_NODE_DEFAULT_VERSION             = "8.11.1"
        WEBSITE_RUN_FROM_PACKAGE                 = "1"
        WEBSITE_CONTENTAZUREFILECONNECTIONSTRING = "DefaultEndpointsProtocol=https;AccountName=$storageName;AccountKey=$accountKey"
        WEBSITE_CONTENTSHARE                     = "$storageName"
        AzureWebJobsSecretStorageType            = "Files"

Running your function

Execute the function using a raw JSON request body like so:

{
    "destination":  "[standby azure search name]",
    "destinationKey":  "[standby azure search key]",
    "source":  "[primary azure search name]",    
    "sourceKey":  "[primary azure search key]",
    "indexes":  null
}

Note: If you want to manage only particular indexes, you may pass in a JSON array of index names. The function will only clear/refresh the indexes specified; if null, it will clear/refresh all indexes.
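
For instance, a call from PowerShell might look like this (the function URL, function key, and service names are placeholders):

# Hypothetical invocation -- substitute your own function URL, function key, and search credentials
$body = @{
    destination    = 'standby-search-service'
    destinationKey = '<standby admin key>'
    source         = 'primary-search-service'
    sourceKey      = '<primary admin key>'
    indexes        = $null
} | ConvertTo-Json

Invoke-RestMethod -Method Post `
    -Uri 'https://myfunctionapp.azurewebsites.net/api/AzureSearchReplicate?code=<function key>' `
    -ContentType 'application/json' `
    -Body $body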

Automating

Through PowerShell, there are ways to create and execute the Azure Function given a valid Azure context and some desired names and resource groups. Expect to see a blog post on that in the near future.

Dude, Where’re my logs? (Azure)

If you're new to the world of Sitecore in Azure PaaS, then there's a good chance that you popped open Kudu, browsed to the App_Data/Logs folder, and said to yourself "oh yeah, it's in Application Insights or something…". Then, after going to Application Insights and pushing buttons haphazardly, you arrived at something that kind of looks like a log. It can be confusing and concerning to feel like you're unable to debug a problem. I'm going to go over the various ways of retrieving debug information for your Sitecore App Services.

Application Insights

This is where the vast majority of your logs are going to be. It's not a great format and leaves me wanting more from the tool, but here's how you use it:

  1. Navigate to your site's Application Insights Azure resource
  2. In the Overview tab, select the Analytics button
  3. Execute a query against the traces table, for example:
    traces
    | where customDimensions.Role == "CM" and severityLevel == 3
  4. The results will not be ordered properly; make sure you click the timestamp column header to order by date
  5. Application Insights has some handy auto-complete features to help you build a custom query to get exactly the data you're looking for

NOTE: While Application Insights provides a good way to track and query log data, there do seem to be particular cases where the application does not properly submit log data to Application Insights. This leads us to the next method.

Log Streaming

A more root-level logging solution is the log streaming option offered by the App Service. It can provide a more reliable but less pleasant source of logs, which is good if you have an easily reproducible scenario. It presents the data in a more traditional format that many Sitecore developers will be more comfortable with, and it can give you more accurate and complete logging. It is important to note, however, that the logs get placed on the filesystem, so this will affect your filesystem usage.

  1. Open the Diagnostics logs tab and turn on all the streaming log settings
  2. In the Log stream you will now see logs coming in, in real time

Application logs

Some IIS-level events and errors will find their way into the underlying filesystem; you can use Kudu to access them.

  1. First you need to access Kudu
  2. Using either the CMD or PowerShell debug console, navigate to D:\home\LogFiles and open eventlog.xml
  3. Here you will find IIS events and errors that may uncover more catastrophic errors that failed to be recorded in Application Insights

Azure App Service Logging

Sometimes, despite all other options, the problem persists. This is when we must look at Azure's health, as on occasion, and without notification, Azure events will impact our environments negatively.

  1. On the App Service, select the Diagnose and solve problems tab
  2. There are several reports in this interface that are definitely worth an in-depth look. I'll focus on the Web App Restarted report.
    If you find that your app pool seems to be recycling too often, this is probably where you need to look.
  3. This report will give you any reason that Azure had for restarting your App Service