Reputation: 44663
My application is using AppFabric for our distributed caching model in a production web farm of windows web 5 servers. The application is a .net4 c# web application. We are encountering some problems with AppFabric and have some questions regarding the setup of such. The main issue we have is if one of the web 5 servers is restarted, the site on the other servers will also go down for a short period of time with appfabric exceptions like the following appearing in our event logs:
We have a cache provider wrapper class that creates the datacachefactory object etc and is used as the intermediatory between the web application and appfabric. This is a singleton class so only one instance of the datacachefactory object is created on the Init of the class.
The second error above I believe I have found the reason for, in our code the region was being created on the Init ie at the very start, but if a node comes out of the cluster that contains the region in its memorary, then the above error is a result. To resolve this issue, the region should be attempted to be created on every request appfabric - but only creating it if it does not exist - does this sound correct?
Regarding the other error, I believe it may be down to the configruation. This is the cluster config xml file:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<configSections>
<section name="dataCache" type="Microsoft.ApplicationServer.Caching.DataCacheSection, Microsoft.ApplicationServer.Caching.Core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" />
</configSections>
<dataCache size="Small">
<caches>
<cache consistency="StrongConsistency" name="App1Cache"
secondaries="1">
<policy>
<eviction type="Lru" />
<expiration defaultTTL="10" isExpirable="true" />
</policy>
</cache>
<cache consistency="StrongConsistency" name="App2Cache"
secondaries="1">
<policy>
<eviction type="Lru" />
<expiration defaultTTL="10" isExpirable="true" />
</policy>
</cache>
<cache consistency="StrongConsistency" name="App3Cache"
secondaries="1">
<policy>
<eviction type="Lru" />
<expiration defaultTTL="10" isExpirable="true" />
</policy>
</cache>
<cache consistency="StrongConsistency" name="default">
<policy>
<eviction type="Lru" />
<expiration defaultTTL="10" isExpirable="true" />
</policy>
</cache>
</caches>
<hosts>
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
hostId="724664608" size="1228" leadHost="true" account="SERVER1\user"
cacheHostName="AppFabricCachingService" name="SERVER1"
cachePort="22233" />
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
hostId="598646137" size="1228" leadHost="true" account="SERVER2\user"
cacheHostName="AppFabricCachingService" name="SERVER2"
cachePort="22233" />
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
hostId="358039700" size="1228" leadHost="true" account="SERVER3\user"
cacheHostName="AppFabricCachingService" name="SERVER3"
cachePort="22233" />
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
hostId="929915039" size="1228" leadHost="false" account="SERVER4\user"
cacheHostName="AppFabricCachingService" name="SERVER4"
cachePort="22233" />
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
hostId="1752630351" size="1228" leadHost="false" account="SERVER5\user"
cacheHostName="AppFabricCachingService" name="SERVER5"
cachePort="22233" />
</hosts>
<advancedProperties>
<securityProperties>
<authorization>
<allow users="everyone" />
</authorization>
</securityProperties>
</advancedProperties>
</dataCache>
</configuration>
Note: we have multiple we caches set up as we have multiple applications using appfabric, and seeing same issues with them all.
And this is the web.config entry in the application on each of the servers:
<dataCacheClient requestTimeout="15000" channelOpenTimeout="3000" maxConnectionsToServer="1">
<localCache isEnabled="true" sync="TimeoutBased" ttlValue="300" objectCount="10000" />
<clientNotification pollInterval="300" maxQueueLength="10000" />
<hosts>
<host name="SERVER1" cachePort="22233" />
<host name="SERVER2" cachePort="22233" />
<host name="SERVER3" cachePort="22233" />
<host name="SERVER4" cachePort="22233" />
<host name="SERVER5" cachePort="22233" />
</hosts>
<transportProperties connectionBufferSize="131072" maxBufferPoolSize="268435456" maxBufferSize="8388608" maxOutputDelay="2" channelInitializationTimeout="60000" receiveTimeout="600000" /></dataCacheClient>
Anyone see a problem with the above? As you can see we have 3 lead hosts and 2 secondaries.
Some questions I have following on from this are:
When we are restarting the servers, we attempt to never restart the lead hosts at the same time.
Thanks for any feedback on this!
Upvotes: 1
Views: 1363
Reputation: 171
We make extensive use of AppFabric caching. You are going to see the
Message: ErrorCode:SubStatus:There is a temporary failure. Please retry later.
fairly often. It's probably best to write yourself a wrapper around AppFabric that automates retries when this error is thrown. You really want to use exponential backoff, but failing that randomizing the retry period may be enough.
The cache configuration in the Web.config file is only used to create the cache factory. It will contact one of the hosts and obtain the cluster configuration from that. The only benefit to listing all hosts in your Web.config is so that if a host is down it can contact another host. Even if you only listed a single host, provided that was present your caching would work fine.
Using a local cache is likely to improve performance if you read objects more frequently than you write them. You're going to have to tune the size of that by experimentation.
Upvotes: 1