Simplified Read Only Sitecore Data Provider

I’ve found that many clients don’t use Sitecore to source all of their content. Often we find ourselves needing to pull content into Sitecore from an external source. Sitecore offers an extensive suite of tools for this, which it calls Data Providers. Using these tools you can not only utilize external content, but also modify it at the source. While this is very cool, I find it unnecessary most of the time. Additionally, implementing a data provider is not always straightforward. For these two reasons, I’ve created a wrapper around the data provider that makes it easy to quickly stand up a read-only provider.

What you need to know about a data provider

Data providers use something that Sitecore calls the “IDTable”. This can be thought of as a mapping between a primary key that exists outside of Sitecore and a Sitecore GUID. For this to work, every record in your data needs a unique identifier. Sitecore also requires what it calls a “prefix”, which can be thought of as an overarching category for your primary keys. This can be something as simple as “productData” if your provider is for product data.
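To make the mapping idea concrete, here is a toy in-memory model of it. This is only a sketch and not Sitecore's implementation: the class name ToyIdTable and its GetOrAdd method are made up for illustration, and a plain Guid stands in for a Sitecore ID. The real IDTable is persisted, so the IDs it hands out survive restarts, which this toy version does not do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the IDTable concept: (prefix, external key) -> GUID.
public class ToyIdTable
{
    private readonly Dictionary<Tuple<string, string>, Guid> _map =
        new Dictionary<Tuple<string, string>, Guid>();

    // Returns the GUID already mapped to (prefix, key), or creates one on first use.
    public Guid GetOrAdd(string prefix, string key)
    {
        Tuple<string, string> lookup = Tuple.Create(prefix, key);
        Guid id;
        if (!_map.TryGetValue(lookup, out id))
        {
            id = Guid.NewGuid();
            _map[lookup] = id;
        }
        return id;
    }
}
```

Asking twice for the same prefix and key returns the same GUID, while the same key under a different prefix gets a GUID of its own. That is why the prefix works as a category: two external sources can reuse the same primary key values without colliding.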

Every data provider needs a root item that is sourced from Sitecore itself. Before you wire up the data provider, create an item to serve as the root and make sure it exists in each environment before you activate the provider.

How it works

It starts with an interface that corresponds to a bare bones Sitecore item:

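    using System.Collections.Generic;
    using Sitecore.Data;

    public interface IBasicData
    {
        // ID of the Sitecore template the item is built from
        ID TemplateID { get; }
        string DisplayName { get; }
        // Sitecore item name
        string Name { get; }
        // unique key of the record in the external source
        string Key { get; }
        // IDTable prefix (category) that the key is registered under
        string Prefix { get; }
        // field name => field value; names must match the template's fields
        Dictionary<string, string> Fields { get; }
        // Sitecore ID of the parent item
        ID ParentID { get; }
        // child records, if any
        IEnumerable<IBasicData> GetChildren { get; }
    }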

First, you create a concrete class that implements IBasicData. Below I have an example for a file system provider. This provider simply creates a Sitecore item for each file and folder in the file system under the site root.

Secondly, you create a Sitecore template that has all the fields you’ll need. For my simple example, I’ll just have three fields: “Name”, “Size”, and “Full Path”, all of which are single-line text fields.

Here’s what my file system data concrete class looks like:

    class FileData : IBasicData
    {
        private string _path;
        private ID _parent;
        private string _name;
        public FileData(string path, ID parent)
        {
            _path = path;
            _parent = parent;
            // last path segment, e.g. "web.config" or "bin"
            _name = Path.GetFileName(_path);
        }
        public Sitecore.Data.ID TemplateID
        {
            get { return new ID("{D67A9C76-1F6B-497D-8C0A-75D1003EFF73}"); }
        }

        public string DisplayName
        {
            get { return _name; }
        }

        public string Name
        {
            get { return _name; }
        }

        public string Key
        {
            get { return _path; }
        }

        public string Prefix
        {
            get { return "fileData"; }
        }

        public Dictionary<string, string> Fields
        {
            get
            {
                return new Dictionary<string, string>(){
                    {"Name",_name},
                    {"Size", File.Exists(_path) ? new FileInfo(_path).Length.ToString() : "0"},
                    {"Full Path", _path}
                };
            }
        }

        public Sitecore.Data.ID ParentID
        {
            get { return _parent; }
        }

        public IEnumerable<IBasicData> GetChildren
        {
            get
            {
                if (!Directory.Exists(_path))
                    yield break;
                // this item's own Sitecore ID becomes the parent of every child;
                // assumes the provider has already registered this item in the IDTable
                ID self = IDTable.GetID(Prefix, Key).ID;
                foreach (string path in Directory.GetFiles(_path))
                    yield return new FileData(path, self);
                foreach (string path in Directory.GetDirectories(_path))
                    yield return new FileData(path, self);
            }
        }
    }

Thirdly, I’ve created an abstract class that does the work of the data provider; you implement it for your own data type.
Here is the abstract class:

    // T must implement IBasicData: GetChildIDs casts each IBasicData child to T
    public abstract class BasicDataProvider<T> : Sitecore.Data.DataProviders.DataProvider
        where T : IBasicData
    {
        private ID RootId;
        //cache the items that are already processed
        private Dictionary<ID, T> ItemTracker = new Dictionary<ID, T>();
        public BasicDataProvider(ID RootId)
        {
            this.RootId = RootId;
        }
        public BasicDataProvider() { }
        /// <summary>
        /// gets the item or set of items that represent the root items of this data provider
        /// </summary>
        /// <returns>IEnumerable of generic items</returns>
        abstract public IEnumerable<IBasicData> GetRootItems();
        /// <summary>
        /// gets the external data unique key
        /// </summary>
        /// <param name="data">data object</param>
        /// <returns>unique key</returns>
        abstract public string GetKey(T data);
        /// <summary>
        /// gets the name for the data
        /// </summary>
        /// <param name="data">data object</param>
        /// <returns>string name</returns>
        abstract public string GetName(T data);
        /// <summary>
        /// gets the ID for the sitecore template to house the data
        /// </summary>
        /// <param name="data">data object</param>
        /// <returns>template id</returns>
        abstract public ID GetTemplateID(T data);
        /// <summary>
        /// gets the ID of the parent item above the current location
        /// </summary>
        /// <param name="data">data object</param>
        /// <returns>parent item id</returns>
        abstract public ID GetParent(T data);
        /// <summary>
        /// gets the sitecore display name of the data
        /// </summary>
        /// <param name="data">data object</param>
        /// <returns>string to identify the data in the sitecore content tree</returns>
        abstract public string GetDisplayName(T data);
        /// <summary>
        /// gets a list of children if any from the current data item
        /// </summary>
        /// <param name="data">data object</param>
        /// <returns>list of children</returns>
        abstract public IEnumerable<IBasicData> GetChildren(T data);
        /// <summary>
        /// gets all the fields as they pertain to the data item
        /// </summary>
        /// <param name="data">data object</param>
        /// <returns>Dictionary of fields to values</returns>
        abstract public IDictionary<string, string> GetFields(T data);
        /// <summary>
        /// gets the custom prefix for the data to be injected into the ID table
        /// </summary>
        /// <param name="data">data object</param>
        /// <returns>prefix string</returns>
        abstract public string GetPrefix(T data);
        public override Sitecore.Collections.IDList GetChildIDs(Sitecore.Data.ItemDefinition itemDefinition, CallContext context)
        {
            IEnumerable<IBasicData> children = null;
            if (itemDefinition.ID == RootId)
                children = GetRootItems();
            else if (IsTracking(itemDefinition.ID))
                children = GetChildren(ItemTracker[itemDefinition.ID]);
            if (children != null)
            {
                IDList ret = new IDList();
                foreach (T child in children)
                {
                    ID itemKey = GetOrAddID(GetKey(child), child);
                    if (!ItemTracker.ContainsKey(itemKey))
                        ItemTracker.Add(itemKey, child);
                    ret.Add(itemKey);
                }

                return ret;
            }
            return base.GetChildIDs(itemDefinition, context);
        }
        public override ItemDefinition GetItemDefinition(ID itemId, CallContext context)
        {
            if (IsTracking(itemId))
            {
                T item = ItemTracker[itemId];
                ItemDefinition ret = new ItemDefinition(itemId, GetName(item), GetTemplateID(item), ID.Null);
                return ret;
            }
            return base.GetItemDefinition(itemId, context);
        }
        public override ID GetParentID(ItemDefinition itemDefinition, CallContext context)
        {
            if (IsTracking(itemDefinition.ID))
            {
                T item = ItemTracker[itemDefinition.ID];
                return GetParent(item);
            }
            return base.GetParentID(itemDefinition, context);
        }
        public override FieldList GetItemFields(ItemDefinition itemDef, VersionUri version, CallContext context)
        {
            if (IsTracking(itemDef.ID))
            {
                T item = ItemTracker[itemDef.ID];
                var fields = GetFields(item);
                ID TemplateID = GetTemplateID(item);
                TemplateItem curTemplate = Database.GetDatabase("master").GetTemplate(TemplateID);

                FieldList ret = new FieldList();
                foreach (string key in fields.Keys)
                {
                    TemplateFieldItem fi = curTemplate.GetField(key);
                    if (fi != null)
                        ret.Add(fi.ID, fields[key]);
                }
                return ret;
            }
            return base.GetItemFields(itemDef, version, context);
        }

        private bool IsTracking(ID itemId)
        {
            return ItemTracker.ContainsKey(itemId);
        }
        public ID GetOrAddID(string key, T item)
        {
            // look the key up once; register it in the IDTable only if it is missing
            string prefix = GetPrefix(item);
            var entry = IDTable.GetID(prefix, key);
            if (entry != null)
                return entry.ID;
            return IDTable.GetNewID(prefix, key).ID;
        }
    }
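One thing to keep in mind about the ItemTracker cache in the class above: Sitecore can call a data provider from several request threads at once, and a plain Dictionary is not safe for concurrent writes. Below is a minimal sketch of a thread-safe tracker, assuming you are willing to swap the dictionary for a ConcurrentDictionary. The SafeItemTracker class and its method names are hypothetical, and Guid stands in for the Sitecore ID type so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical replacement for the provider's ItemTracker dictionary.
public class SafeItemTracker<T>
{
    private readonly ConcurrentDictionary<Guid, T> _items =
        new ConcurrentDictionary<Guid, T>();

    // Tracks the item unless the ID is already tracked; returns the tracked item.
    public T Track(Guid id, T item)
    {
        return _items.GetOrAdd(id, item);
    }

    public bool IsTracking(Guid id)
    {
        return _items.ContainsKey(id);
    }

    // Single lookup instead of ContainsKey followed by the indexer.
    public bool TryGet(Guid id, out T item)
    {
        return _items.TryGetValue(id, out item);
    }
}
```

With something like this in place, the ContainsKey and Add pair in GetChildIDs would collapse into a single Track call, and the other overrides could use TryGet instead of IsTracking followed by an indexer lookup.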

Below is an example implementation for the filesystem data provider. Note that the constructor passes the GUID {CB88F34E-6D3D-4D81-ADD5-DD36E22328F5} to the base class. This is the root item that was created for this example, and every Sitecore item the provider produces will appear under it.

    class FileProvider : BasicDataProvider<FileData>
    {
        // root item created in Sitecore for this example; all provider items hang off it
        private static readonly ID RootItemId = new ID("{CB88F34E-6D3D-4D81-ADD5-DD36E22328F5}");

        public FileProvider() : base(RootItemId)
        {}
        public override IEnumerable<IBasicData> GetRootItems()
        {
            // files and folders directly under the web application's root folder
            string siteRoot = System.Web.HttpRuntime.AppDomainAppPath;
            foreach (string path in Directory.GetFiles(siteRoot))
                yield return new FileData(path, RootItemId);
            foreach (string path in Directory.GetDirectories(siteRoot))
                yield return new FileData(path, RootItemId);
        }

        public override string GetKey(FileData data)
        {
            return data.Key;
        }

        public override string GetName(FileData data)
        {
            return data.Name;
        }

        public override Sitecore.Data.ID GetTemplateID(FileData data)
        {
            return data.TemplateID;
        }

        public override Sitecore.Data.ID GetParent(FileData data)
        {
            return data.ParentID;
        }

        public override string GetDisplayName(FileData data)
        {
            return data.DisplayName;
        }

        public override IEnumerable<IBasicData> GetChildren(FileData data)
        {
            return data.GetChildren;
        }

        public override IDictionary<string, string> GetFields(FileData data)
        {
            return data.Fields;
        }

        public override string GetPrefix(FileData data)
        {
            return data.Prefix;
        }
    }

There we have it, now let’s wire it up

Data providers need to be added to the web.config or patched in. Since everything we do should always be patched in, I did it that way for my filesystem data provider. Replace MYNAMESPACE and MYBINARY with your own namespace and assembly name.

    <?xml version="1.0"?>
    <configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
      <sitecore>
        <dataProviders>
          <FileProvider type="MYNAMESPACE.FileProvider, MYBINARY">
          </FileProvider>
        </dataProviders>
        <databases>
          <database id="master" singleInstance="true" type="Sitecore.Data.Database, Sitecore.Kernel">
            <dataProviders>
              <dataProvider patch:before="*[1]" ref="dataProviders/FileProvider"/>
            </dataProviders>
          </database>
        </databases>
      </sitecore>
    </configuration>

And there we have it. If everything went as planned, you should have a whole bunch of files and folders represented as Sitecore items under the item you targeted as your root.

Making it better

I like to view this as an excellent place to start. Things can always be improved, and I’d like to hear any concerns or thoughts. Please feel free to leave a comment.