Apache Jackrabbit : CacheManager

The CacheManager manages the size of the caches used in Jackrabbit (MLRUItemStateCache objects). The combined size of all caches must be limited to avoid out-of-memory problems. Without CacheManager, Jackrabbit can run out of memory because the combined size of the various caches is not managed. This mechanism is not the definitive solution; it may be desirable to internally use only one cache.

The CacheManager does not control the memory used by unsaved data (data in the transient space). In the current Jackrabbit implementation, unsaved changes are kept in memory. If you get out-of-memory exceptions, check that your application calls Node.save() or (even better) Session.save() from time to time.

The maximum size for all caches in CacheManager is 16 megabytes by default, but it can be changed like this:

Session session = new TransientRepository()
                .login(new SimpleCredentials("", new char[0]));
// getCacheManager() is defined on RepositoryImpl, not on the JCR Repository interface
RepositoryImpl repository = (RepositoryImpl) session.getRepository();
CacheManager manager = repository.getCacheManager();
manager.setMaxMemory(8 * 1024 * 1024); // default is 16 * 1024 * 1024
manager.setMaxMemoryPerCache(1024 * 1024); // default is 4 * 1024 * 1024
manager.setMinMemoryPerCache(64 * 1024); // default is 128 * 1024

The cache resizing algorithm works like this:

The available memory is dynamically redistributed across the caches at most once per second. CacheManager tries to calculate the best cache sizes by comparing the access counts and the used memory of each cache. The idea is that the more a cache is accessed, the more memory it should get, while no cache should shrink too quickly. A minimum and maximum size per cache is defined as well. After distributing the memory in this way, there might be some unused memory (if one or more caches did not use all of their allocated memory). This unused memory is distributed evenly across the full caches.
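
The distribution step described above can be illustrated with a small, simplified sketch. This is not the Jackrabbit implementation: the class CacheResizeSketch, the method distribute and its parameters are hypothetical names chosen for illustration, the "do not shrink too quickly" rule and the once-per-second throttling are left out, and a cache is assumed to be "full" when its used memory has reached its computed limit.

```java
/** Hypothetical, simplified model of the resizing step; not Jackrabbit code. */
final class CacheResizeSketch {

    /**
     * Returns a new memory limit per cache.
     * accessCounts[i] and usedMemory[i] describe cache i (assumed inputs).
     */
    static long[] distribute(long[] accessCounts, long[] usedMemory,
                             long maxMemory, long minPerCache, long maxPerCache) {
        int n = accessCounts.length;
        long[] limits = new long[n];
        if (n == 0) {
            return limits;
        }

        long totalAccess = 0;
        for (long count : accessCounts) {
            totalAccess += count;
        }

        // Step 1: give each cache a share proportional to its access count,
        // clamped to the per-cache minimum and maximum.
        for (int i = 0; i < n; i++) {
            long share = totalAccess == 0
                    ? maxMemory / n
                    : maxMemory * accessCounts[i] / totalAccess;
            limits[i] = Math.max(minPerCache, Math.min(maxPerCache, share));
        }

        // Step 2: add up the memory that was allocated but is not in use,
        // and count the caches that have reached their limit ("full" caches).
        long unused = 0;
        int fullCaches = 0;
        for (int i = 0; i < n; i++) {
            if (usedMemory[i] < limits[i]) {
                unused += limits[i] - usedMemory[i];
            } else {
                fullCaches++;
            }
        }

        // Step 3: spread the unused memory evenly across the full caches,
        // still respecting the per-cache maximum.
        if (fullCaches > 0) {
            long extra = unused / fullCaches;
            for (int i = 0; i < n; i++) {
                if (usedMemory[i] >= limits[i]) {
                    limits[i] = Math.min(maxPerCache, limits[i] + extra);
                }
            }
        }
        return limits;
    }
}
```

For example, with access counts {3, 1, 0}, used memory {600, 50, 100}, a total of 1000, a minimum of 100 and a maximum of 600 per cache, the proportional shares 750, 250 and 0 are first clamped to 600, 250 and 100. The second cache leaves 200 unused, which is split between the two full caches; the first is already at the maximum, so the resulting limits are {600, 250, 200}.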

See also: http://issues.apache.org/jira/browse/JCR-619