Introduction

In SharePoint 2013 the Distributed Cache plays a very important role; it is a key component for performance and caching. An incorrectly configured or managed Distributed Cache will cause issues in your farm. I’ve even seen blogs recommending turning it off, most likely because the authors didn’t manage the cache properly and ended up in a situation where it caused even worse performance problems. One of the good things about the Distributed Cache is that it is not a SharePoint service; it is a standalone service called AppFabric 1.1 for Windows Server. Initially the guidance from Microsoft was that the Distributed Cache (DC) would be patched together with SharePoint and that we should keep our hands off it. That has changed over time, and we can now take advantage of the fixes and improvements that the AppFabric team makes to the service. So it is our responsibility to patch the Distributed Cache. But how do we do it?

[Update 2014-07-16] Here’s a link to the official statement that AppFabric/DC is independently updated from SharePoint: KB2843251

Proper patching instructions

First of all, there are currently five released Cumulative Updates (CU) for AppFabric 1.1, CU1 to CU5. An AppFabric Cumulative Update works just like a CU for SharePoint: it contains all previous CUs. So if you install CU5 you also get CU1 to CU4. The important consequence is that you, as an administrator, should not just read the latest CU Knowledge Base article but also the previous ones (you will see one reason in a minute).

Let’s assume that we have a SharePoint farm with more than one server running the Distributed Cache service instance. To patch these servers we need to download the AppFabric CU that we would like to use. I’d recommend the latest (CU5) right now; I’ve not yet seen any issues with it, only positive effects. If you are using Windows Server 2012 you should definitely go with at least CU4.

Here’s a link to the different CUs and their respective KB articles

In this case let’s apply CU5.

The first thing to do before patching, and it is really important, is to properly shut down the Distributed Cache service instance. The reason is that some items in the Distributed Cache live only in the cache and are not backed by persistent storage, such as some items in the SharePoint Newsfeed. If you do not properly shut down the service instance you will lose that data. Fortunately we have a great PowerShell cmdlet that does this job for us. You need to patch one machine at a time according to these steps:

  1. Shut down the service instance on one machine
  2. Patch AppFabric 1.1
  3. Post-patch operations
  4. Start the service instance
  5. Restart from 1 on the next machine

Do not patch servers in parallel unless you have a large number of servers and can handle that extra load/redundancy!

Step (1) is done in PowerShell using one of the built-in SharePoint cmdlets:

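# Load the SharePoint cmdlets (not needed in the SharePoint Management Shell)
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
# Gracefully stop the Distributed Cache service instance on the local server
Stop-SPDistributedCacheServiceInstance -Graceful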

This command will gracefully shut down the service instance on the local machine. A graceful shutdown means that all the cache items will be distributed to the other service instances in the cache cluster. Make sure that the other servers have enough spare capacity to take them. This is yet another example of why my “3 is the new 2” rule is important. When you patch one server you don’t want just one extra machine! Once the service instance is stopped, I normally wait a couple of extra minutes to make sure that all the cached items have properly propagated to the other servers.
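
To see where the cluster stands before moving on to the patch, a minimal sketch like the following can help. It assumes the AppFabric administration cmdlets (Use-CacheCluster and Get-CacheHost) are available on the server, which is normally the case in the SharePoint 2013 Management Shell; in a plain PowerShell session they may first need to be loaded with Import-Module, as the first line attempts.

```powershell
# Assumption: the AppFabric admin module is installed on this cache host.
Import-Module DistributedCacheAdministration -ErrorAction SilentlyContinue

# Connect to the cache cluster this server belongs to
Use-CacheCluster

# List every cache host in the cluster together with its service status
Get-CacheHost
```

The other hosts should report as up before you continue with the patch on the local machine.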

Then it is time to apply the actual AppFabric patch, step (2). Run the patch executable and follow the instructions. It’s basically a next, next, finish procedure.
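
One way to confirm that the patch actually changed something is to compare the file version of the service executable before and after running it. The sketch below assumes the default AppFabric install folder, so adjust the path if your installation differs. Which version to expect is not listed here; check it against the KB article for the CU you applied.

```powershell
# Assumption: AppFabric is installed in its default folder.
$exe = Join-Path $env:ProgramFiles 'AppFabric 1.1 for Windows Server\DistributedCacheService.exe'

# Show the file version; run this before and after patching and compare
(Get-Item $exe).VersionInfo.FileVersion
```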

Step (3). When the patch is applied you should have read through the KB articles, and if you are applying CU3 or later you should have seen that in order to improve performance you need to modify the Distributed Cache configuration file. The CU3 KB article mentions a new feature added to the AppFabric service that takes advantage of the non-blocking background garbage collection feature in .NET 4.5, which is important for machines with large amounts of RAM – and that is exactly the description of a SharePoint server. So modify the DistributedCacheService.exe.config file as below to enable the background garbage collection:

<configuration>
  ...
  <appSettings>
    <add key="backgroundGC" value="true"/>
  </appSettings>
</configuration>
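
If you prefer to script this change rather than edit the file by hand, the following is a minimal sketch. It assumes the default AppFabric install folder and a configuration file whose root element has no XML namespace; both are assumptions, so check them on your servers first. It takes a backup copy and only creates the appSettings section and the key if they are missing.

```powershell
# Assumption: default install path; adjust if AppFabric lives elsewhere.
$path = Join-Path $env:ProgramFiles 'AppFabric 1.1 for Windows Server\DistributedCacheService.exe.config'

# Keep a backup copy next to the original before touching it
Copy-Item $path "$path.bak"

$xml = New-Object System.Xml.XmlDocument
$xml.PreserveWhitespace = $true
$xml.Load($path)

# Find <appSettings>, or create it at the end of <configuration>
$appSettings = $xml.SelectSingleNode('/configuration/appSettings')
if ($appSettings -eq $null) {
    $appSettings = $xml.CreateElement('appSettings')
    [void]$xml.DocumentElement.AppendChild($appSettings)
}

# Find the backgroundGC key, or create it, then set it to true
$setting = $appSettings.SelectSingleNode("add[@key='backgroundGC']")
if ($setting -eq $null) {
    $setting = $xml.CreateElement('add')
    $setting.SetAttribute('key', 'backgroundGC')
    [void]$appSettings.AppendChild($setting)
}
$setting.SetAttribute('value', 'true')

$xml.Save($path)
```

Run it while the service instance is stopped, that is, after the patch has been applied and before the service instance is started again.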

The final thing we need to do on the machine is to start the service instance again, step (4). The AppFabric Windows Service will be disabled when it is shut down, and you should NOT try to start it manually. You must use the following PowerShell (or the corresponding action in Central Administration if you’re a n00b).

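# Find the Distributed Cache service instance on the local server
$instance = Get-SPServiceInstance | Where-Object { $_.TypeName -eq "Distributed Cache" -and $_.Server.Name -eq $env:COMPUTERNAME }
# Provision it; this also starts the AppFabric Windows Service
$instance.Provision()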

This PowerShell snippet gets the Distributed Cache service instance on the machine where it is executed, provisions it, and starts the AppFabric Windows Service.
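
A quick check that the instance really came back could look like the sketch below. It only uses Get-SPServiceInstance and reads the Status property, which should show Online once provisioning has finished.

```powershell
# Re-read the local Distributed Cache service instance and show its status
Get-SPServiceInstance |
    Where-Object { $_.TypeName -eq "Distributed Cache" -and $_.Server.Name -eq $env:COMPUTERNAME } |
    Select-Object TypeName, Status, @{ Name = 'Server'; Expression = { $_.Server.Name } }
```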

Once this is done and you are ready to move on to the next machine, step (5), give it a couple of extra minutes so that the newly patched (and empty) cache service instance has time to catch up.

Not patching properly…

A very common mistake is to apply the patch by just running the patch executable. Two things will happen when you do this. First of all, the service will be shut down, but not gracefully, and you will lose data. Secondly, it will not start the service instance properly, or at all. The patch contains a script that waits for the service to come online, but since this is not a normal AppFabric cache (it is controlled by SharePoint), the script will wait forever for the service to come up. If this happens, all you have to do is kill the script window and start the service properly as shown above (warning: this is not a recommendation and there might be side effects).

Summary

I hope this short post cleared up some confusion on how to patch the Distributed Cache service in SharePoint 2013 and gave you an example of how to do this in a production environment without losing any data. Of course, you should always test the procedure in a test/staging environment. Cache on!