Archive Unused Items

In a sufficiently old Sitecore site, there is bound to be old content that is just taking up space.  Often our content people don’t want to simply delete it, and archiving it manually is tedious, to say the least.  What I propose is simple: use the links database to identify content that is no longer being used.

I wanted to implement a block of generic business logic that would satisfy most, if not all, use cases.  This is what I came up with:

  1. Identify the root item whose children you need to archive.
  2. Use a depth-first traversal of the tree to identify the items to be archived.
    1. An item may have no links itself but have children that do, which makes it unsuitable for archiving.
  3. For each item to be archived, build a structure of common folders inside the archive root that mirrors where the item lives in the content tree.
  4. Move the items to archive under the archive root, preserving their IDs.
    1. A previous archive operation may have found an item unsuitable for archiving and created a common folder as a placeholder for it.  Once that item becomes suitable for archiving, it is moved to the archive folder, and all child items of the placeholder folder are moved under the archived content item.
    2. The archive placeholder folder is then removed in this case.
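
To make step 2 concrete, here is a minimal sketch of the depth-first selection on a plain in-memory tree.  It is not the Sitecore implementation: Node, HasReferrers, ArchiveSketch and the sample tree are hypothetical stand-ins, and HasReferrers simply plays the role of a links database lookup.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a content item; this is not a Sitecore type.
public class Node
{
    public string Name;
    public bool HasReferrers;   // assumption: stands in for a links database lookup
    public List<Node> Children = new List<Node>();

    public Node(string name, bool hasReferrers = false, params Node[] children)
    {
        Name = name;
        HasReferrers = hasReferrers;
        Children.AddRange(children);
    }
}

public static class ArchiveSketch
{
    // Returns the names of the topmost nodes below root whose whole subtree is unreferenced.
    public static List<string> FindArchivable(Node root)
    {
        var result = new List<string>();
        foreach (var child in root.Children)
            Visit(child, result);
        return result;
    }

    // Post-order walk: returns true when the node and all of its descendants are unreferenced.
    private static bool Visit(Node node, List<string> result)
    {
        var fromChildren = new List<string>();
        bool wholeSubtree = !node.HasReferrers;
        foreach (var child in node.Children)
        {
            // Every child is visited, even once we know this node cannot be archived.
            bool childOk = Visit(child, fromChildren);
            wholeSubtree = wholeSubtree && childOk;
        }
        if (wholeSubtree)
        {
            // Moving this node carries its children along, so only the node is reported.
            result.Add(node.Name);
            return true;
        }
        // The node has to stay, but unreferenced parts of its subtree can still go.
        result.AddRange(fromChildren);
        return false;
    }

    // Sample tree: "a" is entirely unreferenced; "d" must stay because "e" is linked.
    public static string Demo()
    {
        var root = new Node("root", false,
            new Node("a", false, new Node("b"), new Node("c")),
            new Node("d", false, new Node("e", true), new Node("f")));
        return string.Join(", ", FindArchivable(root));   // "a, f"
    }
}
```

In this sketch "a" is reported once for its whole subtree, while "d" stays in place because of "e" and only "f" is picked out from under it.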

Here is a simple example.  Suppose we have a content structure such as this one below.

[Image: Start]

There is a link reference to only one node “h”.  After running the Archiver we end up with this.

[Image: Finish]

The archivable leaf nodes have been identified and extracted.  Now what if we were to change the Linker to point to “c” instead of “h”?

[Image: Next]

After we run the archive operation we end up with this.

[Image: Final]

As you can see, the archive folders have been removed and the parent/child relationship has been restored.
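
The placeholder swap can be sketched on its own with a small in-memory model.  This is only an illustration of the idea under stated assumptions: ArchiveNode, IsPlaceholder and PlaceholderSwap are hypothetical names, and the real code tells placeholders apart by template ID rather than by a flag.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an item inside the archive; this is not a Sitecore type.
public class ArchiveNode
{
    public string Name;
    public bool IsPlaceholder;   // assumption: true for a folder created only to mirror a path
    public ArchiveNode Parent;
    public List<ArchiveNode> Children = new List<ArchiveNode>();

    public ArchiveNode(string name, bool isPlaceholder)
    {
        Name = name;
        IsPlaceholder = isPlaceholder;
    }

    // Moves child under this node, detaching it from its previous parent first.
    public ArchiveNode Add(ArchiveNode child)
    {
        child.Parent?.Children.Remove(child);
        child.Parent = this;
        Children.Add(child);
        return child;
    }
}

public static class PlaceholderSwap
{
    // For each name shared by a placeholder folder and a real content node under parent,
    // move the folder's children under the content node and drop the folder.
    public static void Merge(ArchiveNode parent)
    {
        // ToList() materializes the groups before the child list is modified.
        foreach (var group in parent.Children.GroupBy(c => c.Name).ToList())
        {
            var folder = group.FirstOrDefault(c => c.IsPlaceholder);
            var content = group.FirstOrDefault(c => !c.IsPlaceholder);
            if (folder == null || content == null)
                continue;
            foreach (var child in folder.Children.ToList())
                content.Add(child);
            parent.Children.Remove(folder);
        }
    }

    // Sample: the archive holds a placeholder "c" with archived child "d",
    // then the real "c" is archived and the two are merged.
    public static ArchiveNode Demo()
    {
        var archive = new ArchiveNode("Archive", true);
        var placeholder = archive.Add(new ArchiveNode("c", true));
        placeholder.Add(new ArchiveNode("d", false));
        archive.Add(new ArchiveNode("c", false));
        Merge(archive);
        return archive;
    }
}
```

After Demo() runs, the archive contains a single "c" that is real content, with "d" as its child again.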

Important notes!

This will only work if you have unique item names (at minimum among siblings under the same parent), since it relies on paths and names to form a connection between the archive folder and the actual content.

The code refers to an ArchiveFolderTemplateGuid.  In this example I have it pointing to a common folder; however, I HIGHLY recommend that you create a dedicated archive folder template to use, because the cleanup step tells placeholder folders apart from real content by template ID.  It simply needs to be a template that the content tree doesn’t utilize.

Since this utilizes the link database, it treats content as unused when no other item links to it.  If you retrieve items without linking to them directly, say by iterating over the children of an item, then this won’t work as-is and will need to be adjusted.

Show me the code

I’ve put this into an extension method on Item, so to run it you simply get the root content item whose children you’d like to archive and call item.ArchiveChildren().

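	using System.Collections.Generic;
	using System.Linq;
	using Sitecore;
	using Sitecore.Configuration;
	using Sitecore.Data;
	using Sitecore.Data.Items;

	public static class ItemArchiveExtensions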
	{
		private const string ArchiveFolderTemplateGuid = "{A87A00B1-E6DB-45AB-8B54-636FEC3B5523}";

		public static void ArchiveChildren(this Item archiveRoot)
		{
			Item itemRootToArchive = archiveRoot;
			var archiveFolder = archiveRoot.Database.GetItem(itemRootToArchive.Paths.FullPath + "/Archive");
			if (archiveFolder == null)
			{
				archiveFolder = itemRootToArchive.Add("Archive", new TemplateID(new ID(ArchiveFolderTemplateGuid)));
			}

			var archiveData = new ArchiveData()
			{
				Archive = archiveFolder,
				Root = itemRootToArchive,
				ArchiveableItems = GetItemsToArchive(itemRootToArchive, archiveFolder)
			};
			MoveToArchive(archiveData);
		}

		/// <summary>
		/// Gets the items under the given node that have no referrers in the link database
		/// </summary>
		/// <param name="root">The current node we're focusing on</param>
		/// <param name="archiveFolder">The folder that was identified to store our archived content</param>
		/// <param name="tracker">Used to quickly identify items that have already been added to the list to archive</param>
		/// <returns>The topmost items that can be archived; moving an item carries its children along</returns>
		private static List<Item> GetItemsToArchive(Item root, Item archiveFolder, HashSet<ID> tracker = null)
		{
			if (tracker == null)
				tracker = new HashSet<ID>();
			List<Item> ret = new List<Item>();
			bool addThis = true;
			if (root.Children.Any())
			{
				foreach (Item child in root.Children)
				{
					if (child.ID == archiveFolder.ID)
						continue;
					var archiveable = GetItemsToArchive(child, archiveFolder, tracker);
					if (!tracker.Contains(child.ID))
						addThis = false;
					ret = ret.Union(archiveable).ToList();
				}
			}
			else
			{
				if (!Globals.LinkDatabase.GetReferrers(root).Any())
				{
					ret.Add(root);
					tracker.Add(root.ID);
				}
				return ret;
			}
			if (addThis && !Globals.LinkDatabase.GetReferrers(root).Any())
			{
				ret = new List<Item>() { root };
				tracker.Add(root.ID);
			}
			return ret;
		}

		/// <summary>
		/// Moves an item to the archive folder
		/// </summary>
		/// <param name="data">The archive data that needs to move</param>
		private static void MoveToArchive(ArchiveData data)
		{
			foreach (var item in data.ArchiveableItems)
			{
				var toArchivePath = item.Paths.FullPath;
				var rootPath = data.Root.Paths.FullPath;
				// the sub-paths below start with '/', so they can be appended to the archive path directly
				var subPathFromRoot = toArchivePath.Substring(rootPath.Length, toArchivePath.LastIndexOf('/') - rootPath.Length);
				// if the item was previously used as a folder, we need to swap the folder with the archived item
				var archivedFolder =
					item.Database.GetItem(data.Archive.Paths.FullPath +
										  toArchivePath.Substring(rootPath.Length));
				var archiveParent = GetOrCreateItemAtPath(subPathFromRoot, data.Archive);
				item.MoveTo(archiveParent);
				// without a placeholder folder there is nothing to merge, and archivedFolder.Parent would throw
				if (archivedFolder != null)
					CleanUpArchive(archivedFolder.Parent);
			}
		}

		/// <summary>
		/// Cleans out the deprecated folders previously used by the archiver that are no longer needed
		/// </summary>
		/// <param name="archivedFolder">The item whose children we're cleaning</param>
		private static void CleanUpArchive(Item archivedFolder)
		{
			if (archivedFolder == null)
				return;
			Stack<Item> toClean = new Stack<Item>();
			toClean.Push(archivedFolder);
			while (toClean.Any())
			{
				var cur = toClean.Pop();
				Dictionary<string, Item> tracker = new Dictionary<string, Item>();
				foreach (Item child in cur.Children)
				{
					toClean.Push(child);
					if (tracker.ContainsKey(child.Name))
					{
						Item content;
						Item folder;
						if (child.TemplateID.ToString() == ArchiveFolderTemplateGuid)
						{
							folder = child;
							content = tracker[child.Name];
						}
						else
						{
							content = child;
							folder = tracker[child.Name];
						}
						foreach (Item archived in folder.Children)
							archived.MoveTo(content);
						folder.Delete();
					}
					else
					{
						tracker.Add(child.Name, child);
					}
				}

			}
		}

		/// <summary>
		/// Gets or creates the path for the archive
		/// </summary>
		/// <param name="subpathFromRoot">Sub-path to the item's parent, relative to the root being archived; starts with '/' or is empty</param>
		/// <param name="archive">The root archive item</param>
		/// <returns>The existing or newly created archive item at that sub-path</returns>
		private static Item GetOrCreateItemAtPath(string subpathFromRoot, Item archive)
		{
			// use the archive item's own database rather than assuming "master"
			var db = archive.Database;
			Item exists = db.GetItem(archive.Paths.FullPath + subpathFromRoot);
			if (exists != null) return exists;
			Item ret = archive;
			string[] parts = subpathFromRoot.Split('/');
			for (int i = 1; i < parts.Length; i++)
			{
				var cur = db.GetItem(ret.Paths.FullPath + '/' + parts[i]);
				if (cur == null)
					ret = ret.Add(parts[i], new TemplateID(new ID(ArchiveFolderTemplateGuid)));
				else
					ret = cur;
			}
			return ret;
		}

		private class ArchiveData
		{
			public Item Root { get; set; }
			public Item Archive { get; set; }
			public IEnumerable<Item> ArchiveableItems { get; set; }
		}
	}