In this article I would like to tell you about a very important timer job in SharePoint 2010 - the Application Addresses Refresh Job. If you do not understand what it is used for, you might see some strange (to you) error messages when configuring SharePoint. Even if you’re familiar with it, it might be a good idea to keep reading.

Purpose of the Application Addresses Refresh Job

The Application Addresses Refresh Job has one specific job to do - keep track of all available and online instances of all service application endpoints. This means that whenever a proxy needs to talk to a service application, it asks the Topology Service (the Application Discovery and Load Balancer Service) for an endpoint. The Topology Service keeps a list of the endpoints that have been discovered by the Application Addresses Refresh Job and, using its load balancing algorithm, passes one of them on to the proxy, which then uses that endpoint to talk to the service application. So far so good…
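To make the flow a bit more concrete, here is a minimal Python sketch of the idea. It is a toy model and not SharePoint code: the class name `TopologyService`, its methods and the endpoint names `app1` and `app2` are all made up for illustration, and plain round robin stands in for the load balancing algorithm.

```python
class TopologyService:
    """Toy stand-in for the Topology Service (hypothetical, not a SharePoint API)."""

    def __init__(self):
        self.known_endpoints = []  # endpoints found by the last refresh
        self._next = 0             # round robin position

    def refresh(self, online_endpoints):
        """Models one run of the Application Addresses Refresh Job."""
        self.known_endpoints = list(online_endpoints)
        self._next = 0

    def get_endpoint(self):
        """Hands out the next known endpoint, round robin."""
        if not self.known_endpoints:
            raise LookupError("no endpoint discovered yet")
        endpoint = self.known_endpoints[self._next % len(self.known_endpoints)]
        self._next += 1
        return endpoint


topology = TopologyService()
online = ["app1", "app2"]  # hypothetical servers running the service instance

# The instances are online, but the job has not run yet, so the lookup fails.
try:
    topology.get_endpoint()
    failed_before_refresh = False
except LookupError:
    failed_before_refresh = True

# After one run of the job the proxy gets endpoints in turn.
topology.refresh(online)
picks = [topology.get_endpoint() for _ in range(4)]
print(failed_before_refresh, picks)
```

The point of the sketch is only that the proxy sees what the last refresh saw, not what is actually online right now.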

So, what could go wrong here…

The problem is that this job only runs (by default) every 15 minutes. And unless you follow the first rule of Spence - “Step away from the keyboard” - you will experience some interesting side effects.

Service Application configuration

One of the first times you’ll experience this 15-minute delay is when creating Service Applications in SharePoint 2010. Let’s take the Secure Store as an example. You create the Secure Store Service Application and, trigger-happy as you are, you click on it to configure the Secure Store key. Most of the time you will see an error like this:

Cannot complete this action as the Secure Store Shared Service is not responding. Please contact your administrator.

You hit the reload button a couple of times and start to fiddle with permissions, but nothing happens. Finally you realize - ahh, I haven’t started the Service Instance of the Secure Store - so you start that and head back to the Secure Store Service App to continue configuring it. But you still receive the same error message. You do some more fiddling with permissions etc. until you’re totally lost in your configuration madness. You do some Binging on the Interwebs and suddenly it just works…

What really happened was that the Application Addresses Refresh Job ran while you were furiously blaming the product group for a crappy product, and it found a valid and working endpoint for the Secure Store Service App. Now the Topology Service is aware of the endpoint and can pass it on to the proxy.

What you really should have done is this: first start the Service Instance, then create the Service Application. And if you still get the error message, manually kicking off the timer job will do the trick.

Farm maintenance

Another common scenario where similar results may be seen is when you reconfigure the farm: adding, removing or rebooting servers, or moving Service Application Instances from one server to another (stop on one server and start on another). You can do this while your farm is hot and running, but make sure to start the timer job whenever you make a change (start/stop an instance or add/remove a server). In the worst case, your end users will be unable to use the Service Application for up to 15 minutes. One scenario where I’ve seen it happen is when you take a server out of the load balancer rotation to do Windows patching and then need to reboot that server - the service application will be unavailable for that time on that machine (duh!). So if you have, for instance, three servers running this service instance, every third (Round Robin) request will fail. Running the timer job immediately after starting the reboot sequence will mitigate any errors.
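The "every third request" arithmetic is easy to check with a few lines of Python. Again, this is just a sketch under assumptions: the server names `app1` to `app3` and the helper `failed_requests` are hypothetical, and strict round robin is assumed for the load balancing.

```python
from itertools import cycle, islice


def failed_requests(known, online, requests):
    """Count requests routed round robin to endpoints that are not online."""
    routed = islice(cycle(known), requests)
    return sum(1 for endpoint in routed if endpoint not in online)


# Hypothetical farm: three servers run the service instance, app2 is rebooting.
known_before_refresh = ["app1", "app2", "app3"]  # stale list, still contains app2
online = {"app1", "app3"}

# With the stale list, one out of every three requests lands on the dead server.
stale_failures = failed_requests(known_before_refresh, online, 9)

# Once the job has run, the list matches reality and nothing fails.
known_after_refresh = sorted(online)
fresh_failures = failed_requests(known_after_refresh, online, 9)

print(stale_failures, fresh_failures)
```

In this model the failures continue until the next refresh, which is why kicking off the job right after the change matters.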

Should I change the Timer Job schedule?

Well, this is totally up to you. From what I’ve seen it’s not a “heavy” job, so you could lower the interval. Under normal circumstances 15 minutes should do the trick, but when doing maintenance, as discussed above, lowering the interval might be a good idea.

Summary

A short, and pretty intuitive, post about a very, very important Timer Job in SharePoint 2010 - the Application Addresses Refresh Job. Make sure that this job is running and behaving - otherwise your end-users (and proxies) will not be able to talk to the service application instances.

[Update 2012-05-20] If you are interested in more details on the topology web service and the service application load balancing I recommend that you read the following post by Josh Gavant: How I learned to Stop Worrying and Love the SharePoint Topology Service.