Rack::Cache runs within each of your backend application processes and does not rely on a single intermediary process like most types of proxy cache implementations. Because of this, the storage subsystem affects not only where cache data is stored but also whether the cache is properly distributed between multiple backend processes. It is highly recommended that you read and understand the following before choosing a storage implementation.
Rack::Cache stores cache entries in two separate configurable storage areas: a MetaStore and an EntityStore.
The MetaStore keeps high level information about each cache entry, including the request/response headers and other status information. When a request is received, the core caching logic uses this meta information to determine whether a fresh cache entry exists that can satisfy the request.
The EntityStore is where the actual response body content is stored. When a response is entered into the cache, a SHA1 digest of the response body content is calculated and used as a key. The entries stored in the MetaStore reference their response bodies using this SHA1 key.
Separating request/response meta-data from response content has a few important advantages:
Different storage types can be used for meta and entity storage. For example, it may be desirable to use memcached to store meta information while using the filesystem for entity storage.
Cache entry meta-data may be retrieved quickly without also retrieving response bodies. This avoids significant overhead when the cache misses or only requires validation.
Multiple different responses may include the same exact response body. In these cases, the actual body content is stored once and referenced from each of the meta store entries.
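The split between the two stores can be sketched with plain Ruby hashes. This is only an illustration of the idea described above, not Rack::Cache's actual implementation: the cache_response helper, the use of the URL as the meta key, and the shape of the stored entries are all assumptions made for the example.

```ruby
require 'digest/sha1'

# Toy stand-ins for the two storage areas (illustrative only).
entity_store = {} # SHA1 digest of body => response body
meta_store   = {} # request URL         => headers plus the body's digest

# Hypothetical helper: stores the body under its SHA1 digest and records
# a meta entry that references the body by that digest.
def cache_response(meta_store, entity_store, url, headers, body)
  digest = Digest::SHA1.hexdigest(body)
  entity_store[digest] ||= body # an identical body is stored only once
  meta_store[url] = { :headers => headers, :body_digest => digest }
end

headers = { 'Content-Type' => 'text/plain' }
cache_response(meta_store, entity_store, '/a', headers, 'same body')
cache_response(meta_store, entity_store, '/b', headers, 'same body')

meta_store.size   # => 2 (one meta entry per response)
entity_store.size # => 1 (both entries share a single stored body)
```

Checking whether an entry exists for '/a' only touches meta_store; the body is fetched from entity_store by digest when it is actually needed.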
You should consider how the meta and entity stores differ when choosing a storage implementation. The MetaStore does not require nearly as much memory as the EntityStore and is accessed much more frequently. The EntityStore can grow quite large and raw performance is less of a concern. Using a memory based storage implementation (heap or memcached) for the MetaStore is strongly advised, while a disk based storage implementation (file) is often satisfactory for the EntityStore and uses much less memory.
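As a rough sketch of that recommendation, the options below pair a memcached metastore with a file system entitystore. The host, port, namespace, and directory are placeholders borrowed from the other examples in this document, and the cache_options variable name is purely illustrative.

```ruby
# Illustrative option set: memory based storage for the frequently
# accessed MetaStore, disk based storage for the larger EntityStore.
cache_options = {
  :metastore   => 'memcached://localhost:11211/meta',
  :entitystore => 'file:/var/cache/rack/body'
}

# In a rackup file this hash would be handed to the middleware:
#   use Rack::Cache, cache_options
```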
The MetaStore and EntityStore used for a particular request are determined by inspecting the rack-cache.metastore and rack-cache.entitystore Rack env variables. The value of each variable is a URI that identifies the storage type and location (URI formats are documented in the following section).
The heap:/ storage is assumed if either storage type is not explicitly provided. This storage type has significant drawbacks for most types of deployments so explicit configuration is advised.
The default metastore and entitystore values can be specified when the Rack::Cache object is added to the Rack middleware pipeline as follows:
use Rack::Cache,
:metastore => 'file:/var/cache/rack/meta',
:entitystore => 'file:/var/cache/rack/body'
Alternatively, the rack-cache.metastore and rack-cache.entitystore variables may be set in the Rack environment by an upstream component.
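One way an upstream component might set these variables is a small middleware placed ahead of Rack::Cache in the pipeline. The sketch below is an assumption about how such a component could look; the CacheStorageSelector class name and its option handling are hypothetical and not part of Rack::Cache.

```ruby
# Hypothetical upstream middleware: fills in the storage URIs in the Rack
# env before the request reaches Rack::Cache further down the pipeline.
class CacheStorageSelector
  def initialize(app, options = {})
    @app         = app
    @metastore   = options[:metastore]
    @entitystore = options[:entitystore]
  end

  def call(env)
    # Leave any value that an earlier component has already set.
    env['rack-cache.metastore']   ||= @metastore
    env['rack-cache.entitystore'] ||= @entitystore
    @app.call(env)
  end
end

# Stand-in downstream app that echoes the metastore URI it was given.
app = lambda { |env| [200, { 'content-type' => 'text/plain' }, [env['rack-cache.metastore']]] }

selector = CacheStorageSelector.new(app,
  :metastore   => 'file:/var/cache/rack/meta',
  :entitystore => 'file:/var/cache/rack/body')

status, headers, body = selector.call({})
```

In a real pipeline the selector would be added with use before Rack::Cache, so that the variables are present by the time the cache inspects the env.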
Rack::Cache includes meta and entity storage implementations backed by local process memory ("heap storage"), the file system ("disk storage"), and memcached. This section includes information on configuring Rack::Cache to use a specific storage implementation as well as pros and cons of each.
Uses local process memory to store cached entries.
use Rack::Cache,
:metastore => 'heap:/',
:entitystore => 'heap:/'
The heap storage backend is simple, fast, and mostly useless. All cache information is stored in each backend application's local process memory (using a normal Hash, in fact), which means that data cached under one backend is invisible to all other backends. This leads to low cache hit rates and excessive memory use, the magnitude of which is a function of the number of backends in use. Further, the heap storage provides no mechanism for purging unused entries so memory use is guaranteed to exceed that available, given enough time and utilization.
Use of heap storage is recommended only for testing purposes or for very simple/single-backend deployment scenarios where the number of resources served is small and well understood.
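The isolation problem can be demonstrated with a plain Hash and a forked process. This is a simplified sketch, assuming a platform where Kernel#fork is available; the cache variable is only a stand-in for heap storage and does not use Rack::Cache itself.

```ruby
# Stand-in for heap storage: an ordinary Hash in process memory.
cache = {}

# Simulate a second backend process writing an entry to its own copy.
pid = fork do
  cache['/page'] = 'cached body'
  # Report through the exit status whether the child sees its own entry.
  exit!(cache.key?('/page') ? 0 : 1)
end

_, child_status = Process.wait2(pid)

child_saw_entry   = child_status.success? # the child cached the page
parent_sees_entry = cache.key?('/page')   # the parent's Hash is untouched
```

The child process finds its entry, but the parent's Hash stays empty, which is why each backend ends up building (and paying the memory cost for) its own private cache.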
Stores cached entries on the filesystem.
use Rack::Cache,
:metastore => 'file:/var/cache/rack/meta',
:entitystore => 'file:/var/cache/rack/body'
The URI may specify an absolute, relative, or home-rooted path:
file:/storage/path - absolute path to storage directory.
file:storage/path - relative path to storage directory, rooted at the process's current working directory (Dir.pwd).
file:~user/storage/path - path to storage directory, rooted at the specified user's home directory.
file:~/storage/path - path to storage directory, rooted at the current user's home directory.
File system storage is simple, requires no special daemons or libraries, has a tiny memory footprint, and allows multiple backends to share a single cache; it is one of the slower storage implementations, however. Its use is recommended in cases where memory is limited or in environments where more complex storage backends (e.g., memcached) are not available. In many cases, it may be acceptable (and even optimal) to use file system storage for the entitystore and a more performant storage implementation (e.g., memcached) for the metastore.
NOTE: When both the metastore and entitystore are configured to use file system storage, they should be set to different paths to prevent any chance of collision.
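The path forms accepted by file: URIs line up with the behavior of Ruby's File.expand_path. The sketch below assumes the path portion is expanded that way; the storage_dir helper is hypothetical and only strips the scheme before expanding.

```ruby
# Hypothetical helper: drop the "file:" scheme and expand what remains.
def storage_dir(uri)
  File.expand_path(uri.sub(/\Afile:/, ''))
end

absolute = storage_dir('file:/storage/path') # already absolute, unchanged
relative = storage_dir('file:storage/path')  # rooted at Dir.pwd

# File.expand_path also expands a leading ~ or ~user to the matching home
# directory, which corresponds to the two home-rooted forms.
```

Because the relative form depends on the working directory at the time the path is resolved, an absolute path is the more predictable choice when a process may be started from different locations.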
Stores cached entries in a remote memcached instance.
use Rack::Cache,
:metastore => 'memcached://localhost:11211/meta',
:entitystore => 'memcached://localhost:11211/body'
The URI must specify the host and port of a remote memcached daemon. The path portion is an optional (but recommended) namespace that is prepended to each cache key.
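The parts of such a URI can be pulled apart with Ruby's standard URI library. This is only a sketch of how the pieces map to a server address and a namespace; how Rack::Cache hands them to the client library is not shown, and the variable names are illustrative.

```ruby
require 'uri'

uri = URI.parse('memcached://localhost:11211/meta')

# Host and port identify the memcached daemon to connect to.
server = "#{uri.host}:#{uri.port}"

# The path, minus its leading slash, serves as the key namespace.
namespace = uri.path.sub(%r{\A/}, '')
```

Giving the metastore and entitystore different namespaces, as in the configuration above, keeps their keys apart even when both point at the same memcached instance.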
The memcached storage backend requires either the dalli or the memcached library. By default, the dalli library is used; require the memcached library explicitly to use it instead.
gem install dalli
Memcached storage is reasonably fast and allows multiple backends to share a single cache. It is also the only storage implementation that allows the cache to reside somewhere other than the local machine. The memcached daemon stores all data in local process memory so using it for the entitystore can result in heavy memory usage. It is by far the best option for the metastore in deployments with multiple backend application processes since it allows the cache to be properly distributed and provides fast access to the meta-information required to perform cache logic. Memcached is considerably more complex than the other storage implementations, requiring a separate daemon process and extra libraries. Still, its use is recommended in all cases where you can get away with it.