7.2 Caching within cmsWorks: Overview
This chapter gives an overview of the caching methods and how to configure them to enhance the overall performance and memory usage of the system.
In cmsWorks, several caching methods are implemented on different calling layers to speed up content generation. Caching here means that information is read from the database and kept in the application server as plain old Java objects (POJOs).
This minimizes database accesses, which improves the throughput of both the database and cmsWorks.
LRU-cache for dynamic caches
Caching is mostly performed using an LRU (least-recently-used) method. When the cache quota is exceeded, the least recently used entry of the cache is dumped to make room for a new entry that is not yet cached. (Re)calling an already cached entry moves that entry back to the first place in the cache rather than dumping it.
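The LRU behavior described above can be illustrated with a minimal sketch built on the standard `java.util.LinkedHashMap`. This is not the cmsWorks implementation; the class name `LruCacheSketch` and the `maxEntries` parameter are hypothetical and serve only to show the eviction and re-queuing rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of cmsWorks.
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries; // assumed name for the cache quota

    LruCacheSketch(int maxEntries) {
        // accessOrder = true: each get() or put() moves the entry
        // to the "most recently used" end of the internal list.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked after every insert; returning true evicts the
        // least recently used entry once the quota is exceeded.
        return size() > maxEntries;
    }
}
```

With a quota of 2, putting keys 1 and 2, reading key 1 and then putting key 3 evicts key 2, because key 1 was re-queued by the read.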
List of cached data
The following table lists the data that is cached in the server rather than re-read from the database on every access.
Name | Pre-filled | Description |
Resource Type | Yes | Holding the document types of the system. |
Mime Type | Yes | Complete list of mime types the system can handle. |
Resource Name | No | Simple cache to map the name of a resource to its ID for faster access. |
Additional Param | Yes | Keeping additional parameters in memory. |
Folder | Yes | Holding all (content) folders, which describe the content structure that contains the resources (documents). |
Published resource | No | (Mostly) Small cache for published resources. |
Resources of Folder | No | Keeping a list of IDs of resources (documents) of one folder. |
User Group | Yes | Caching the user groups. |
Blob | No | Caching media-data (binary large objects). |
Resource | No | The cache consisting of already (re)called resources (documents). |
User | Yes | Caching the user data. |
Pre-filled caches and Resource caches
The caches in cmsWorks roughly divide into two types of caches: "Pre-filled caches" and "Resource caches".
Pre-filled caches
Pre-filled caches are internal caches that need no administration. They are filled by the system at startup and maintained by the system itself. They are normally not related to content stored in the system (with the exception of the Folder cache, which always holds all folders of the content structure).
Resource caches
Resource caches store (meta) information about content or even the content itself. The following table describes what each cache stores and how it works in detail.
Name | Description |
Resource Name | To speed up resource name lookups, this cache returns the name of a resource for a given ID. The size of the "Resource Name" cache should be at least twice the size of the "Resource" cache or three times the size of the "Folder" cache. This cache has a low memory footprint per entry. |
Resources of Folder | When all resources stored in a folder are looked up, this cache is used to bypass costly database queries. It stores the IDs of all resources of a given folder. This cache has a low memory footprint per entry: the entry for one folder consists of the resource IDs stored as int values in an int array, so its size is bounded by the overall count of resources and folders. This cache is not limited in size. |
Blob | The Blob cache stores only the contents of the media document field. Handling blobs is expensive in two ways: reading them from the database is costly, and caching them needs much memory because they are normally large. Separating the blob field from the other resource fields held in the Resource cache makes caching more granular (i.e. fetching a resource to read certain fields does not trigger a read of the blob field unless it is needed). This cache has a high memory footprint per entry. It can therefore be limited in two ways: by quantity and/or by memory size. When one of these limits is reached and a blob that is not yet cached is requested, the oldest cache entry is dumped in favor of the new entry. |
Resource | This cache stores the resources (metadata and the associated fields, without the blob fields). The Resource cache stores only the latest version of a resource, regardless of whether it is published or unpublished (i.e. the resource is "work in progress"). Blobs are stored separately in the Blob cache. When a blob is requested from a resource, the Resource cache reads it (transparently) from the Blob cache and returns it. This cache has a medium memory footprint per entry, depending on the quantity and type of the fields the resources hold. |
Published resource | The Published resource cache works like the Resource cache with one exception: if the latest version of a resource in the Resource cache is not in state "published", an expensive database lookup would be needed to get the last published version. The Published resource cache was created to hold the last published version and avoid that lookup. This cache has a medium memory footprint per entry, but because of its nature (it is called only if a requested resource is not in state "published" in the Resource cache) it can be kept relatively small. A size of 5000 entries will suffice in most cases. |
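The interplay between the Resource cache and the Published resource cache can be sketched as follows. This is only an illustration of the lookup order described in the table, under stated assumptions: all class, field and method names are hypothetical, plain `HashMap` instances stand in for the real LRU caches, and the two `Function` parameters stand in for the database queries.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; none of these names are cmsWorks API.
class PublishedLookupSketch {
    static final class Resource {
        final int id;
        final int version;
        final boolean published;

        Resource(int id, int version, boolean published) {
            this.id = id;
            this.version = version;
            this.published = published;
        }
    }

    // Stand-ins for the "Resource" and "Published resource" caches.
    private final Map<Integer, Resource> resourceCache = new HashMap<>();
    private final Map<Integer, Resource> publishedCache = new HashMap<>();

    // Stand-ins for the database lookups (assumed, for illustration only).
    private final Function<Integer, Resource> loadLatest;
    private final Function<Integer, Resource> loadLastPublished;

    PublishedLookupSketch(Function<Integer, Resource> loadLatest,
                          Function<Integer, Resource> loadLastPublished) {
        this.loadLatest = loadLatest;
        this.loadLastPublished = loadLastPublished;
    }

    Resource getPublished(int id) {
        // 1. Ask the Resource cache for the latest version (loads it on a miss).
        Resource latest = resourceCache.computeIfAbsent(id, loadLatest);
        if (latest != null && latest.published) {
            return latest; // latest version is already the published one
        }
        // 2. Latest version is work in progress: use the Published resource
        //    cache so the expensive lookup runs at most once per resource.
        return publishedCache.computeIfAbsent(id, loadLastPublished);
    }
}
```

Because the second cache is consulted only for resources whose latest version is unpublished, it sees far fewer distinct entries than the Resource cache, which matches the recommendation to keep it small.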
Configuring cache sizes
Using the telnet server, the cache parameters can be changed in a live environment. Be aware, however, that when the server is restarted, these values are overwritten by the parameter values stored in the cms.properties file.
The cms.properties file contains the initial values (applied at startup) for the different caches. The caching parameters belong to the CMSCore service, so their property prefix is "/com/itechworks/develop/topas/services/cms/server/CMSCore". The parameters are described in the following table:
Parameter | Description |
ResourceCacheSize | The maximum size of the Resource cache. This is a number starting at "1". |
ResourceCacheTypeLimits | Every resource type can be given a limit that determines how many resources of that type are stored in the cache at most. The syntax of the value string is "5:10000;6:2000", which means that the quota for resources of resource type "5" is 10000 and the quota for resource type "6" is 2000. |
PublishedResourcesCacheSize | Similar to the parameter "ResourceCacheSize". |
PublishedResourceCacheTypeLimits | Works like "ResourceCacheTypeLimits" but should not be used, since this cache is mostly managed through requests from the Resource cache. |
ResourceNameCacheSize | The maximum size of the Resource name cache. This is a number starting at "1". |
BlobCacheSize | The maximum size of the Blob cache. This is a number starting at "1". |
BlobCacheMemory | The maximum memory used by the Blob cache. This is a number starting at "1", given in megabytes (so a value of "100" lets the Blob cache use at most 100 megabytes). |
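To make the "ResourceCacheTypeLimits" value syntax concrete, the following sketch parses a string such as "5:10000;6:2000" into a map from resource type to quota. It is not the parser used by cmsWorks. The class name `TypeLimitsSketch` is hypothetical, and the handling of whitespace, empty values and malformed pairs is an assumption, as is the treatment of resource types as integers (the source only shows the numeric examples "5" and "6").

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; shows the documented value syntax only.
class TypeLimitsSketch {
    // Parses e.g. "5:10000;6:2000" into {5=10000, 6=2000}.
    static Map<Integer, Integer> parse(String value) {
        Map<Integer, Integer> limits = new LinkedHashMap<>();
        if (value == null || value.trim().isEmpty()) {
            return limits; // assumption: an empty value means "no type limits"
        }
        for (String pair : value.split(";")) {
            if (pair.trim().isEmpty()) {
                continue; // assumption: tolerate a trailing ';'
            }
            String[] parts = pair.split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected type:quota but got: " + pair);
            }
            int resourceType = Integer.parseInt(parts[0].trim());
            int quota = Integer.parseInt(parts[1].trim());
            limits.put(resourceType, quota);
        }
        return limits;
    }
}
```

For the example value from the table, `TypeLimitsSketch.parse("5:10000;6:2000")` yields a quota of 10000 for resource type 5 and 2000 for resource type 6.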
Monitoring and optimizing caches
The pre-filled caches can only report their size, because they are always loaded completely into memory. Using the telnet server, most Resource caches can be monitored by showing the configured size and the fill level of each cache.
In addition, the Resource cache and the Published resource cache can list the number of cached elements per resource type, a caching ratio, and a histogram (a simple time line).
The Blob cache shows its quantity limit and/or memory-size limit, depending on which of them is configured. Additionally, a caching ratio can be reviewed.
To start monitoring the caches, log in to the cmsWorks telnet server and enter the command "cmscaches CMSCore".
Resource cache / Published resource cache
The resource caches are important for the optimization of runtime-behavior of cmsWorks. A resource itself consists of the resource data and the resource fields. Thus, each resource triggers several database queries (one for the resource, the others for the fields of the resource).
This way, a request for a resource that provokes a cache-miss and calls the database is thousands of times more expensive than requesting a resource already cached. So optimizing the resource cache will greatly improve the overall performance of cmsWorks.
Both resource caches ("Resource cache" and "Published resource cache") implemented in cmsWorks are LRU-caches of similar type in administration. They store further meta data about how the caches perform. These meta data are:
- The maximum size and the fill level of the caches
- The distinct number of resources in the cache grouped by their resource type
- The cache ratio including hits/misses grouped by their resource type (including the overall cache ratio)
- A histogram showing the ages of resources in the LRU-cache grouped by time intervals
You can access these meta data via the telnet server. The command to do so is called "cmscaches" and accepts various parameters.
With these data it is possible to optimize the size of the caches and the restrictions on individual resource types. It is possible to
- Increase or reduce the maximum quota (total size) of the resource cache
- Limit every resource type with a maximum quota of allowed resources in the cache
Increasing the size of the resource cache helps, of course, but it also uses more memory, which is not always an option (see "Optimizing JVM memory" and "Selecting the right Garbage Collector"). A larger cache consumes more memory and, at the least, lengthens the time a stop-the-world garbage collection takes. This may result in poor responsiveness of the content management system.
Blob cache
The Blob cache is also an LRU cache; it stores the media data of all types in cmsWorks. Blobs can consist of any type of stored media, be it pictures, PDFs or Word documents. Even raw text can be stored in blobs. Hence it is most probably the cache that needs the most memory per cache item.
In addition, reading binary large objects (blobs) from a database is normally very costly for both the application and the database.
Because of this impact on database performance and on the overall memory of the cmsWorks server, the Blob cache can be configured in three ways:
- Limiting the quantity of cached blobs by assigning a maximum count to the cache
- Limiting the memory-size of cached blobs by assigning a maximum memory size (in megabytes) to the cache
- Assigning both the quantity limit and the memory-size limit to the cache
In the last case, whichever limit is reached first restricts the cache. If a cache is configured to hold 10,000 blobs (quantity) and its memory size is limited to 100m (100 megabytes), the cache is full as soon as either 10,000 blobs are cached or the total size of the blobs exceeds 100 megabytes.
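A cache with both a quantity limit and a memory-size limit can be sketched as shown below. This is a simplified illustration, not the cmsWorks Blob cache: the class `BlobCacheSketch`, its method names, and the conversion of one megabyte to 1024 * 1024 bytes are assumptions. Entries are evicted in least-recently-used order until both limits are satisfied again.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an LRU cache bounded by entry count and by bytes.
class BlobCacheSketch {
    private final int maxCount;   // quantity limit
    private final long maxBytes;  // memory-size limit in bytes
    private long currentBytes;
    // accessOrder = true keeps the least recently used entry first.
    private final LinkedHashMap<Integer, byte[]> blobs =
            new LinkedHashMap<>(16, 0.75f, true);

    BlobCacheSketch(int maxCount, long maxBytes) {
        this.maxCount = maxCount;
        this.maxBytes = maxBytes;
    }

    // Assumption: one megabyte is counted as 1024 * 1024 bytes.
    static long megabytes(int mb) {
        return mb * 1024L * 1024L;
    }

    byte[] get(int id) {
        return blobs.get(id); // a hit re-queues the entry as most recently used
    }

    void put(int id, byte[] data) {
        byte[] previous = blobs.put(id, data);
        if (previous != null) {
            currentBytes -= previous.length;
        }
        currentBytes += data.length;
        // Evict from the least recently used end while either limit is exceeded.
        // Note: a single blob larger than maxBytes ends up evicting itself.
        Iterator<Map.Entry<Integer, byte[]>> it = blobs.entrySet().iterator();
        while ((blobs.size() > maxCount || currentBytes > maxBytes) && it.hasNext()) {
            Map.Entry<Integer, byte[]> eldest = it.next();
            currentBytes -= eldest.getValue().length;
            it.remove();
        }
    }

    int size() {
        return blobs.size();
    }

    long bytes() {
        return currentBytes;
    }
}
```

The example from the text would correspond to `new BlobCacheSketch(10000, BlobCacheSketch.megabytes(100))`: whichever of the two limits is exceeded first triggers eviction.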
