### Get DiskCache Source Code

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Provides commands to clone the DiskCache repository from GitHub or download the source code as a tarball or zipball. Once obtained, the package can be installed using setup.py.

```bash
$ git clone https://github.com/grantjenks/python-diskcache.git
```

```bash
$ curl -OL https://github.com/grantjenks/python-diskcache/tarball/master
```

```bash
$ curl -OL https://github.com/grantjenks/python-diskcache/zipball/master
```

```bash
$ python setup.py install
```

--------------------------------

### Install DiskCache

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Installs the latest version of DiskCache from PyPI using pip. It's recommended to pin at least the major version for production environments.

```bash
$ pip install --upgrade diskcache
```

--------------------------------

### DjangoCache Configuration

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Provides an example of configuring DjangoCache in Django's settings file, specifying backend, location, timeout, shards, and other options for a sharded cache setup.

```python
CACHES = {
    'default': {
        'BACKEND': 'diskcache.DjangoCache',
        'LOCATION': '/path/to/cache/directory',
        'TIMEOUT': 300,
        'SHARDS': 8,
        'DATABASE_TIMEOUT': 0.010,  # 10 milliseconds
        'OPTIONS': {
            'size_limit': 2 ** 30  # 1 gigabyte
        },
    },
}
```

--------------------------------

### DiskCache Installation and Basic Usage

Source: https://github.com/grantjenks/python-diskcache/blob/master/README.rst

This section provides instructions on how to install DiskCache using pip and demonstrates basic usage by importing the Cache, FanoutCache, and DjangoCache classes. It also shows how to access documentation for these classes.
```bash
$ pip install diskcache
```

```python
import diskcache
help(diskcache)
```

```python
from diskcache import Cache, FanoutCache, DjangoCache
help(Cache)
help(FanoutCache)
help(DjangoCache)
```

```python
from diskcache import Deque, Index
help(Deque)
help(Index)
```

--------------------------------

### Cache Initialization with Eviction Policy

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Demonstrates how to initialize a Cache object and set its eviction policy. It prints the default policy, creates a cache with the 'least-frequently-used' policy, and then resets the policy to 'least-recently-used'.

```python
from diskcache import Cache

cache = Cache()
print(cache.eviction_policy)

cache = Cache(eviction_policy='least-frequently-used')
print(cache.eviction_policy)
print(cache.reset('eviction_policy', 'least-recently-used'))
```

--------------------------------

### Asynchronous Operation Example

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Illustrates how to run DiskCache methods asynchronously using Python's asyncio and a thread-pool executor. This example shows a basic structure for an async function that interacts with the cache.

```python
import asyncio

async def set_async(key, val):
    loop = asyncio.get_running_loop()
    # Run the blocking cache.set call in the default thread-pool executor.
    # Assumes `cache` is an existing diskcache.Cache instance.
    future = loop.run_in_executor(None, cache.set, key, val)
    return await future
```

--------------------------------

### Install Development Dependencies

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/development.rst

Installs all necessary development dependencies for DiskCache using pip.
```bash
$ pip install -r requirements.txt
```

--------------------------------

### DiskCache Cache Class Initialization and Usage

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Demonstrates the basic usage of the diskcache.Cache class, including initialization, setting and getting items using dictionary-like syntax, checking for key existence, and deleting keys. It also shows how to use the Cache object within a 'with' statement for proper closing.

```python
from diskcache import Cache

# Initialize cache (uses a temporary directory if none specified)
cache = Cache()

# Using 'with' statement ensures cache.close() is called
with Cache(cache.directory) as reference:
    reference.set('key', 'value')

# Cache operations are atomic and thread/process-safe
cache['key'] = 'value'
print(cache['key'])
print('key' in cache)
del cache['key']

# Cache objects can be left open as operations are atomic
# Closing and then accessing will automatically reopen (slower)
cache.close()
print(cache.get('key'))
```

--------------------------------

### DiskCache Cache Set and Get with Advanced Options

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Illustrates advanced usage of the Cache.set() and Cache.get() methods, including setting expiration times, reading values as file-like objects, and associating tags with cache entries. The get() method can also retrieve expiration time and tag metadata.
```python
from io import BytesIO

from diskcache import Cache

cache = Cache()

# Set item with expiration, read flag, and tag
cache.set('key', BytesIO(b'value'), expire=5, read=True, tag='data')

# Get item with read, expire_time, and tag flags
result = cache.get('key', read=True, expire_time=True, tag=True)
reader, timestamp, tag = result
print(reader.read().decode())
print(type(timestamp).__name__)
print(tag)
```

--------------------------------

### Cache Volume and Statistics

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Shows how to get the estimated disk usage of the cache using `volume()` and how to enable, retrieve, and reset cache hit/miss statistics using `stats()`. Statistics are useful for evaluating eviction policies but incur overhead.

```python
>>> cache.volume() < int(1e5)
True
>>> cache.stats(enable=True)
(0, 0)
>>> for num in range(100):
...     _ = cache.set(num, num)
>>> for num in range(150):
...     _ = cache.get(num)
>>> hits, misses = cache.stats(enable=False, reset=True)
>>> (hits, misses)
(100, 50)
```

--------------------------------

### Closing and Deleting Cache Directory

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Provides an example of how to properly close a cache and then manually delete its directory on disk using `shutil.rmtree`. This is necessary because caches are persistent and do not auto-delete.

```python
>>> cache.close()
>>> import shutil
>>> try:
...     shutil.rmtree(cache.directory)
... except OSError:  # Windows wonkiness
...     pass
```

--------------------------------

### Asyncio Cache Set Example

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Demonstrates how to use the diskcache library with asyncio to set a key-value pair in the cache asynchronously. It utilizes `run_in_executor` to run the synchronous `cache.set` operation in a separate thread, allowing the asyncio event loop to continue processing other tasks.
```python
import asyncio

import diskcache

cache = diskcache.Cache("my_cache_dir")


async def set_async(key, val):
    # Run the blocking cache.set call in the default thread-pool executor.
    loop = asyncio.get_running_loop()
    future = loop.run_in_executor(None, cache.set, key, val)
    result = await future
    return result


asyncio.run(set_async('test-key', 'test-value'))
```

--------------------------------

### pylibmc Client Performance Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_core_p8.txt

Performance metrics for the pylibmc client. This section details the timings for get, set, and delete operations, including counts, misses, and latency distributions.

```benchmark
Timings for pylibmc.Client
-------------------------------------------------------------------------------
   Action     Count      Miss    Median       P90       P99       Max     Total
========= ========= ========= ========= ========= ========= ========= =========
      get    712612     70517  95.844us 113.010us 131.130us 604.153us   69.024s
      set     71464         0  97.036us 114.918us 136.137us 608.921us    7.024s
   delete      7916       817  94.891us 112.057us 132.084us 604.153us 760.844ms
    Total    791992                                                     76.809s
```

--------------------------------

### DiskCache Performance Benchmarks

Source: https://github.com/grantjenks/python-diskcache/blob/master/README.rst

Rough measurements comparing the performance of DiskCache against other libraries like dbm, shelve, sqlitedict, and pickleDB for get, set, and delete operations. These figures are rough, not rigorous.

```APIDOC
DiskCache Performance Benchmarks:

================ ============= ========= ========= ============ ============
Project          diskcache     dbm       shelve    sqlitedict   pickleDB
================ ============= ========= ========= ============ ============
get              25 µs         36 µs     41 µs     513 µs       92 µs
set              198 µs        900 µs    928 µs    697 µs       1,020 µs
delete           248 µs        740 µs    702 µs    1,717 µs     1,020 µs
================ ============= ========= ========= ============ ============

Note: These are rough measurements. See `DiskCache Cache Benchmarks`_ for more rigorous data.
```

--------------------------------

### Custom JSONDisk Implementation

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Provides a custom implementation of diskcache.Disk that uses compressed JSON for serialization and deserialization. This class overrides put, get, store, and fetch methods to handle JSON encoding/decoding and zlib compression.

```python
import json
import zlib

import diskcache
from diskcache import UNKNOWN, Cache


class JSONDisk(diskcache.Disk):
    def __init__(self, directory, compress_level=1, **kwargs):
        self.compress_level = compress_level
        super().__init__(directory, **kwargs)

    def put(self, key):
        json_bytes = json.dumps(key).encode('utf-8')
        data = zlib.compress(json_bytes, self.compress_level)
        return super().put(data)

    def get(self, key, raw):
        data = super().get(key, raw)
        return json.loads(zlib.decompress(data).decode('utf-8'))

    def store(self, value, read, key=UNKNOWN):
        if not read:
            json_bytes = json.dumps(value).encode('utf-8')
            value = zlib.compress(json_bytes, self.compress_level)
        return super().store(value, read, key=key)

    def fetch(self, mode, filename, value, read):
        data = super().fetch(mode, filename, value, read)
        if not read:
            data = json.loads(zlib.decompress(data).decode('utf-8'))
        return data


with Cache(disk=JSONDisk, disk_compress_level=6) as cache:
    pass
```

--------------------------------

### redis.StrictRedis Performance Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_core_p8.txt

Performance metrics for the redis.StrictRedis client. This table outlines the timings for get, set, and delete operations, showing counts, misses, and various latency percentiles.
```benchmark
Timings for redis.StrictRedis
-------------------------------------------------------------------------------
   Action     Count      Miss    Median       P90       P99       Max     Total
========= ========= ========= ========= ========= ========= ========= =========
      get    712612     70540 187.874us 244.141us 305.891us   1.416ms  138.516s
      set     71464         0 192.881us 249.147us 311.136us   1.363ms   14.246s
   delete      7916       825 185.966us 242.949us 305.176us 519.276us    1.525s
    Total    791992                                                    154.287s
```

--------------------------------

### DiskCache Transactions

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Provides examples of using transactions with diskcache objects (Cache, Deque, Index) for atomic operations. It shows how to increment values within a transaction and calculate a running average. It also demonstrates grouping multiple write operations within a single transaction for performance improvement and discusses nested transactions.

```python
with cache.transact():
    total = cache.incr('total', 123.45)
    count = cache.incr('count')

with cache.transact():
    total = cache.get('total')
    count = cache.get('count')
average = None if count == 0 else total / count


def set_many(cache, mapping):
    with cache.transact():
        for key, value in mapping.items():
            cache[key] = value
```

--------------------------------

### DiskCache Settings and Initialization

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Demonstrates how to initialize a DiskCache with custom settings like size_limit and how to reset specific cache parameters. It also shows basic cache operations like setting a key-value pair and reading the stored item count.

```python
from diskcache import Cache

cache = Cache(size_limit=int(4e9))
print(cache.size_limit)
print(cache.disk_min_file_size)
cache.reset('cull_limit', 0)  # Disable automatic evictions.
print(cache.reset('cull_limit'))
cache.set(b'key', 1.234)
print(cache.count)  # Stale attribute.
cache.reset('count')  # Prefer: len(cache)
print(len(cache))
```

--------------------------------

### Run Tests with setup.py

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/development.rst

Executes the DiskCache tests using the setup.py script, which downloads a minimal testing infrastructure.

```bash
$ python setup.py test
```

--------------------------------

### DiskCache Configuration Options

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Lists and describes various configuration options for DiskCache, including disk-related settings like minimum file size and pickle protocol, as well as SQLite pragmas for performance tuning.

```APIDOC
Settings:
  size_limit: The maximum on-disk size of the cache (default: one gigabyte).
  cull_limit: The maximum number of keys to cull when adding a new item (default: ten). Set to zero to disable automatic culling.
  statistics: Collect cache statistics (default: False).
  tag_index: Create a database tag index for evict (default: False).
  eviction_policy: Determines the eviction policy (default: "least-recently-stored").
  disk_min_file_size: The minimum size to store a value in a file (default: 32 kilobytes).
  disk_pickle_protocol: The Pickle protocol to use for data types not natively supported (default: highest Pickle protocol).
  sqlite_auto_vacuum: SQLite auto_vacuum setting (default: 1, "FULL").
  sqlite_cache_size: SQLite cache_size setting (default: 8,192 pages).
  sqlite_journal_mode: SQLite journal_mode setting (default: "wal").
  sqlite_mmap_size: SQLite mmap_size setting (default: 64 megabytes).
  sqlite_synchronous: SQLite synchronous setting (default: 1, "NORMAL").
```

--------------------------------

### DiskCache Core Performance Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/cache-benchmarks.rst

This table shows the performance metrics for the core diskcache.Cache operations: get, set, and delete.
It includes operation counts, miss counts, median, 90th-percentile, 99th-percentile, and maximum latencies, as well as total time. In the run shown, about 11% of gets are misses (9705 of 88966), primarily from gets after deletes, and no items expire.

```text
========= ========= ========= ========= ========= ========= ========= =========
Timings for diskcache.Cache
-------------------------------------------------------------------------------
   Action     Count      Miss    Median       P90       P99       Max     Total
========= ========= ========= ========= ========= ========= ========= =========
      get     88966      9705  12.159us  17.166us  28.849us 174.999us    1.206s
      set      9021         0  68.903us  93.937us 188.112us  10.297ms 875.907ms
   delete      1012       104  47.207us  66.042us 128.031us   7.160ms  89.599ms
    Total     98999                                                      2.171s
========= ========= ========= ========= ========= ========= ========= =========
```

--------------------------------

### DiskCache Performance Metrics

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/cache-benchmarks.rst

Explains the columns used in the DiskCache timing tables for get, set, and delete operations: counts, misses, and latency percentiles.

```APIDOC
DiskCache Performance:
  Action: Operation type (get, set, delete, Total)
  Count: Number of operations performed
  Miss: Number of cache misses
  Median: Median latency of the operation
  P90: 90th percentile latency
  P99: 99th percentile latency
  Max: Maximum latency observed
  Total: Total time spent on the operation
```

--------------------------------

### DiskCache Tag Index Management

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Demonstrates how to manage the tag index for accelerating tag-based eviction. Includes creating, dropping, and checking the status of the tag index.
```python
>>> cache.drop_tag_index()
>>> cache.tag_index
0
>>> cache.create_tag_index()
>>> cache.tag_index
1
```

--------------------------------

### pylibmc.Client Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_core_p1.txt

Performance metrics for the pylibmc.Client, a Python interface for libmemcached. Shows operation counts, misses, and latency for get, set, and delete operations.

```APIDOC
pylibmc.Client:
  get:    Count: 89115  Miss: 8714  Median: 42.915us  P90: 62.227us  P99: 79.155us  Max: 166.178us  Total: 3.826s
  set:    Count: 8941   Miss: 0     Median: 44.107us  P90: 63.896us  P99: 82.254us  Max: 121.832us  Total: 396.247ms
  delete: Count: 943    Miss: 111   Median: 41.962us  P90: 60.797us  P99: 75.817us  Max: 92.983us   Total: 39.570ms
  Total Operations: 98999
  Total Time: 4.262s
```

--------------------------------

### Basic Usage Comparison: pylibmc vs DiskCache

Source: https://github.com/grantjenks/python-diskcache/blob/master/README.rst

This snippet compares pylibmc and DiskCache on a basic key-value lookup. Each block stores one key and then times reading it back with `%timeit`, so the two libraries can be compared on the same operation.

```python
# %timeit is an IPython magic; run these blocks in IPython or Jupyter.
import pylibmc
client = pylibmc.Client(['127.0.0.1'], binary=True)
client[b'key'] = b'value'
%timeit client[b'key']
```

```python
# %timeit is an IPython magic; run these blocks in IPython or Jupyter.
import diskcache as dc
cache = dc.Cache('tmp')
cache[b'key'] = b'value'
%timeit cache[b'key']
```

--------------------------------

### diskcache.Cache Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_core_p1.txt

Performance metrics for the standard diskcache.Cache implementation. Shows operation counts, misses, and latency percentiles for get, set, and delete operations.
```APIDOC
diskcache.Cache:
  get:    Count: 89115  Miss: 8714  Median: 19.073us   P90: 25.749us   P99: 32.902us   Max: 115.395us  Total: 1.800s
  set:    Count: 8941   Miss: 0     Median: 114.918us  P90: 137.091us  P99: 241.041us  Max: 4.946ms    Total: 1.242s
  delete: Count: 943    Miss: 111   Median: 87.976us   P90: 149.202us  P99: 219.824us  Max: 4.795ms    Total: 120.738ms
  Total Operations: 98999
  Total Time: 3.163s
```

--------------------------------

### DiskCache Deque Usage

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Demonstrates the usage of the diskcache.Deque class, a double-ended queue compatible with collections.deque. It shows initialization, pop, popleft, appendleft, length checking, directory access, and creating a new deque from an existing directory. It also illustrates using maxlen for a bounded deque.

```python
from diskcache import Deque

deque = Deque(range(5, 10))
deque.pop()
deque.popleft()
deque.appendleft('foo')
len(deque)
type(deque.directory).__name__
other = Deque(directory=deque.directory)
len(other)
other.popleft()
thing = Deque('abcde', maxlen=3)
list(thing)
```

--------------------------------

### DiskCache Clearing and Eviction

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Illustrates methods for clearing the cache, resetting limits, and expiring or evicting items. Covers both immediate expiration and tag-based eviction.

```python
>>> cache.clear()
3
>>> cache.reset('cull_limit', 0)  # Disable automatic evictions.
0
>>> for num in range(10):
...     _ = cache.set(num, num, expire=1e-9)  # Expire immediately.
>>> len(cache)
10
>>> list(cache)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> import time
>>> time.sleep(1)
>>> cache.expire()
10
>>> # Tag half of 100 items 'even' so the eviction below has entries to remove.
>>> for num in range(100):
...     _ = cache.set(num, num, tag='odd' if num % 2 else 'even')
>>> cache.evict('even')
50
>>> cache.clear()
50
>>> for num in range(100):
...     _ = cache.set(num, num, tag=(num % 2))
>>> cache.evict(0)
50
>>> cache.clear()
50
>>> cache.clear() > 0
True
```

--------------------------------

### redis.StrictRedis Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_core_p1.txt

Performance metrics for the redis.StrictRedis client, a Python interface for Redis. Details operation counts, misses, and latency for get, set, and delete operations.

```APIDOC
redis.StrictRedis:
  get:    Count: 89115  Miss: 8714  Median: 86.069us  P90: 101.089us  P99: 144.005us  Max: 805.140us  Total: 7.722s
  set:    Count: 8941   Miss: 0     Median: 89.169us  P90: 104.189us  P99: 146.866us  Max: 408.173us  Total: 800.963ms
  delete: Count: 943    Miss: 111   Median: 86.069us  P90: 99.182us   P99: 149.012us  Max: 327.826us  Total: 80.976ms
  Total Operations: 98999
  Total Time: 8.604s
```

--------------------------------

### diskcache.Cache Performance Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_core_p8.txt

Performance metrics for the basic diskcache.Cache implementation. It includes counts, misses, and latency percentiles for get, set, and delete operations.

```benchmark
Timings for diskcache.Cache
-------------------------------------------------------------------------------
   Action     Count      Miss    Median       P90       P99       Max     Total
========= ========= ========= ========= ========= ========= ========= =========
      get    712612     69147  20.027us  28.133us  45.061us   2.792ms   15.838s
      set     71464         0 129.700us   1.388ms  35.831ms    1.342s  160.708s
   delete      7916       769  97.036us   1.340ms  21.605ms 837.003ms   13.551s
    Total    791992                                                    194.943s
```

--------------------------------

### diskcache Performance Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_djangocache.txt

Performance metrics for the diskcache backend. Includes operation counts, cache misses, and latency percentiles for get, set, and delete operations.
```text
Timings for diskcache
-------------------------------------------------------------------------------
   Action     Count      Miss    Median       P90       P99       Max     Total
========= ========= ========= ========= ========= ========= ========= =========
      get    712770     70909  55.075us  82.016us 106.096us  36.816ms   44.088s
      set     71249         0 303.984us   1.489ms   6.499ms  39.687ms   49.088s
   delete      7973         0 228.882us   1.409ms   5.769ms  24.750ms    4.755s
    Total    791992                                                     98.465s
========= ========= ========= ========= ========= ========= ========= =========
```

--------------------------------

### redis Performance Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_djangocache.txt

Performance metrics for the redis cache backend. Details operation counts, cache misses, and latency percentiles for get, set, and delete operations.

```text
Timings for redis
-------------------------------------------------------------------------------
   Action     Count      Miss    Median       P90       P99       Max     Total
========= ========= ========= ========= ========= ========= ========= =========
      get    712770     71694 214.100us 267.982us 358.820us   1.556ms  155.709s
      set     71249         0 230.789us 284.195us 377.178us   1.462ms   16.764s
   delete      7973       790 195.742us 251.770us 345.945us   1.105ms    1.596s
    Total    791992                                                    174.069s
========= ========= ========= ========= ========= ========= ========= =========
```

--------------------------------

### DjangoCache Media Handling

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Shows how to use DjangoCache with Django's cache API to serve media files efficiently using X-Accel-Redirect. This works because values set with read=True are stored as files on disk, so the reader exposes a file name.

```python
from django.core.cache import cache
from django.http import HttpResponse


def media(request, path):
    try:
        with cache.read(path) as reader:
            response = HttpResponse()
            response['X-Accel-Redirect'] = reader.name
            return response
    except KeyError:
        # Handle cache miss. Returning 404 is an assumed placeholder.
        return HttpResponse(status=404)
```

--------------------------------

### Queue-like Operations with Push, Pull, and Peek

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Illustrates using push, pull, and peek methods to manage a queue-like data structure within the cache. These methods automatically assign keys and support accessing either the 'front' or 'back' of the cache, with an option to use a prefix for multiple queues.

```python
>>> key = cache.push('first')
>>> print(key)
500000000000000
>>> cache[key]
'first'
>>> _ = cache.push('second')
>>> _ = cache.push('zeroth', side='front')
>>> _, value = cache.peek()
>>> value
'zeroth'
>>> key, value = cache.pull()
>>> print(key)
499999999999999
>>> value
'zeroth'
```

--------------------------------

### filebased Performance Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_djangocache.txt

Performance metrics for the filebased cache backend. Details operation counts, cache misses, and latency percentiles for get, set, and delete operations.

```text
Timings for filebased
-------------------------------------------------------------------------------
   Action     Count      Miss    Median       P90       P99       Max     Total
========= ========= ========= ========= ========= ========= ========= =========
      get    712792    112290 114.918us 161.171us 444.889us  61.068ms   94.438s
      set     71268         0  11.289ms  13.278ms  16.653ms 108.282ms  809.448s
   delete      7977         0 432.014us 675.917us   5.785ms  55.249ms    3.652s
    Total    791992                                                    907.537s
========= ========= ========= ========= ========= ========= ========= =========
```

--------------------------------

### memcached Performance Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_djangocache.txt

Performance metrics for the memcached cache backend. Includes operation counts, cache misses, and latency percentiles for get, set, and delete operations.
```text
Timings for memcached
-------------------------------------------------------------------------------
   Action     Count      Miss    Median       P90       P99       Max     Total
========= ========= ========= ========= ========= ========= ========= =========
      get    712770     71873 102.043us 118.017us 182.867us   2.054ms   73.453s
      set     71249         0 104.904us 123.978us 182.152us 836.849us    7.592s
   delete      7973         0  98.944us 114.918us 176.191us 473.261us 795.398ms
    Total    791992                                                     81.841s
========= ========= ========= ========= ========= ========= ========= =========
```

--------------------------------

### DiskCache Index Usage

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Illustrates the use of the diskcache.Index class, which provides a mutable mapping and ordered dictionary interface. It shows initialization with key-value pairs, checking for key existence, accessing values, deleting items, and creating a new index from an existing directory. It also demonstrates popitem for removing and returning an item.

```python
from diskcache import Index

index = Index([('a', 1), ('b', 2), ('c', 3)])
'b' in index
index['c']
del index['a']
len(index)
other = Index(index.directory)
len(other)
other.popitem(last=False)
```

--------------------------------

### Running and Plotting Benchmarks

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/development.rst

Instructions on how to execute benchmark scripts and plot the results. Benchmark output is saved to text files, which are then used as input for the `plot.py` script.

```bash
python tests/benchmark_core.py
python tests/plot.py timings_core.txt
```

--------------------------------

### FanoutCache Initialization and Behavior

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Explains `FanoutCache`, which shards the underlying database so that concurrent writers block each other less often.
It discusses the `shards` parameter (default 8) and the `timeout` parameter, noting that `FanoutCache` silently fails on timeouts by default, unlike `Cache`, which may raise exceptions.

```python
# FanoutCache initialization example (conceptual)
from diskcache import FanoutCache

fanout_cache = FanoutCache(directory='my_fanout_cache', shards=4, timeout=10.0)
```

--------------------------------

### DiskCache Project Links

Source: https://github.com/grantjenks/python-diskcache/blob/master/README.rst

Provides essential links for accessing the DiskCache project, including its official documentation, PyPI page, GitHub repository, and issue tracker.

```apidoc
DiskCache Project Links:
- DiskCache Documentation: http://www.grantjenks.com/docs/diskcache/
- DiskCache at PyPI: https://pypi.python.org/pypi/diskcache/
- DiskCache at GitHub: https://github.com/grantjenks/python-diskcache/
- DiskCache Issue Tracker: https://github.com/grantjenks/python-diskcache/issues/
```

--------------------------------

### diskcache.FanoutCache(shards=4, timeout=1.0) Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_core_p1.txt

Performance metrics for diskcache.FanoutCache configured with 4 shards and a 1.0-second timeout. Details operation counts, misses, and latency for get, set, and delete.
```APIDOC
diskcache.FanoutCache(shards=4, timeout=1.0):
  get:    Count: 89115  Miss: 8714  Median: 21.935us   P90: 27.180us   P99: 36.001us   Max: 129.938us  Total: 2.028s
  set:    Count: 8941   Miss: 0     Median: 118.017us  P90: 170.946us  P99: 270.844us  Max: 5.129ms    Total: 1.307s
  delete: Count: 943    Miss: 111   Median: 91.791us   P90: 153.780us  P99: 231.981us  Max: 4.883ms    Total: 119.732ms
  Total Operations: 98999
  Total Time: 3.455s
```

--------------------------------

### diskcache.FanoutCache(shards=8, timeout=0.010) Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_core_p1.txt

Performance metrics for diskcache.FanoutCache configured with 8 shards and a 0.010-second timeout. Includes operation counts, misses, and latency for get, set, and delete.

```APIDOC
diskcache.FanoutCache(shards=8, timeout=0.010):
  get:    Count: 89115  Miss: 8714  Median: 20.981us   P90: 27.180us   P99: 35.286us   Max: 128.031us  Total: 2.023s
  set:    Count: 8941   Miss: 0     Median: 116.825us  P90: 175.953us  P99: 269.175us  Max: 5.248ms    Total: 1.367s
  delete: Count: 943    Miss: 111   Median: 91.791us   P90: 158.787us  P99: 235.345us  Max: 4.634ms    Total: 106.991ms
  Total Operations: 98999
  Total Time: 3.496s
```

--------------------------------

### DiskCache Benchmark Core Script Arguments

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/development.rst

Describes the command-line arguments of the `benchmark_core.py` script, which benchmarks the core functionality of diskcache. The arguments configure the number of processes, the number of operations, the key range, and the warmup iterations.

```bash
python tests/benchmark_core.py --help
```

```bash
python tests/benchmark_core.py [-h] [-p PROCESSES] [-n OPERATIONS] [-r RANGE] [-w WARMUP]
```

--------------------------------

### locmem Performance Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_djangocache.txt

Performance metrics for the locmem (in-memory) cache backend.
Shows operation counts, cache misses, and latency percentiles for get, set, and delete operations.

```text
Timings for locmem
-------------------------------------------------------------------------------
   Action     Count      Miss    Median       P90       P99       Max     Total
========= ========= ========= ========= ========= ========= ========= =========
      get    712770    141094  34.809us  47.922us  55.075us  15.140ms   26.159s
      set     71249         0  38.862us  41.008us  59.843us   8.094ms    2.725s
   delete      7973         0  32.902us  35.048us  51.260us   2.963ms 257.951ms
    Total    791992                                                     29.142s
========= ========= ========= ========= ========= ========= ========= =========
```

--------------------------------

### Database Support Overview

Source: https://github.com/grantjenks/python-diskcache/blob/master/README.rst

This section outlines the SQL and NoSQL databases that the DiskCache README references as related projects or points of comparison. It includes links to relevant database documentation and Python connectors.

```apidoc
SQL Databases:
- SQLite: Lightweight, disk-based database, part of Python's standard library.
- MySQL: Popular open-source database for web applications, with Python connectors.
- PostgreSQL: Powerful open-source object-relational database with Python adapters (e.g., Psycopg).
- Oracle DB: Enterprise-grade relational database management system from Oracle Corporation.
- Microsoft SQL Server: Relational database management system developed by Microsoft.

Other Databases:
- Memcached: High-performance, distributed memory object caching system.
- Redis: In-memory data structure store used as a database, cache, and message broker.
- MongoDB: Cross-platform document-oriented NoSQL database (uses JSON-like documents).
- LMDB: Fast, memory-mapped database with high read performance.
- BerkeleyDB: High-performance embedded database for key/value data.
- LevelDB: Fast key-value storage library from Google, storing data sorted by key.
```

--------------------------------

### DiskCache FanoutCache Shards

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Shows how a FanoutCache instance can hand out named Cache, Deque, and Index objects for cross-process and cross-thread communication. Each object returned by `cache()`, `deque()`, and `index()` is stored in a subdirectory of the fanout cache's directory.

```python
from diskcache import FanoutCache

fanout_cache = FanoutCache()
tutorial_cache = fanout_cache.cache('tutorial')
username_queue = fanout_cache.deque('usernames')
url_to_response = fanout_cache.index('responses')
```

--------------------------------

### Python DiskCache Pickle Protocol

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/sf-python-2017-meetup-talk.rst

Notes from the talk on using higher pickle protocols (up to 4 at the time) for faster serialization in DiskCache, with the caveat that protocol 2 is the highest that stays portable between Python 2 and 3.

```python
# Pickle can actually be fast if you use a higher protocol. Default 0. Up to 4 now.
# Don't choose higher than 2 if you want to be portable between Python 2 and 3.
```

--------------------------------

### Querying Persistent Results

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/case-study-web-crawler.rst

Demonstrates how to access and query the persistent results stored by the DiskCache-based web crawler. It reopens the Index stored at `data/results` and gets the total number of results it holds.

```python
from diskcache import Index

# Reopen the persistent index written by the crawler.
results = Index('data/results')
len(results)
```

--------------------------------

### DiskCache Item Operations

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Demonstrates adding, retrieving, incrementing, decrementing, and popping items from the diskcache.Cache. Covers atomic operations and handling of missing keys.
```python
>>> cache.add(b'test', 123)
True
>>> cache[b'test']
123
>>> cache.add(b'test', 456)
False
>>> cache[b'test']
123
>>> cache.incr(b'test')
124
>>> cache.decr(b'test', 24)
100
>>> cache.incr('alice')
1
>>> cache.decr('bob', default=-9)
-10
>>> cache.incr('carol', default=None)
Traceback (most recent call last):
    ...
KeyError: 'carol'
>>> cache.pop('alice')
1
>>> cache.pop('dave', default='does not exist')
'does not exist'
>>> cache.set('dave', 0, expire=None, tag='admin')
True
>>> result = cache.pop('dave', expire_time=True, tag=True)
>>> value, timestamp, tag = result
>>> value
0
>>> print(timestamp)
None
>>> print(tag)
admin
```

--------------------------------

### diskcache.FanoutCache (shards=4, timeout=1.0) Performance Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/tests/timings_core_p8.txt

Performance metrics for diskcache.FanoutCache with 4 shards and a 1.0-second timeout. This configuration shows latency and hit/miss rates for get, set, and delete operations.

```benchmark
Timings for diskcache.FanoutCache(shards=4, timeout=1.0)
-------------------------------------------------------------------------------
   Action     Count      Miss    Median       P90       P99       Max     Total
========= ========= ========= ========= ========= ========= ========= =========
      get    712612     70432  27.895us  48.876us  77.963us  12.945ms   25.443s
      set     71464         0 176.907us   1.416ms   9.385ms 183.997ms   65.606s
   delete      7916       747 132.084us   1.354ms   9.272ms  86.189ms    6.576s
    Total    791992                                                     98.248s
```

--------------------------------

### Early Probabilistic Recomputation (Beta=0.5)

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/case-study-landing-page-caching.rst

Demonstrates tuning the eagerness of recomputation by setting the `beta` parameter to 0.5 in `@dc.memoize_stampede`. This reduces the likelihood of simulated cache misses, lowering worker load while maintaining optimal latency.
```python
import time
import diskcache as dc

cache = dc.Cache()

@dc.memoize_stampede(cache, expire=1, beta=0.5)
def generate_landing_page():
    time.sleep(0.2)
```

--------------------------------

### Filebased Cache Timings

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/djangocache-benchmarks.rst

Performance metrics for a file-based cache, showing a higher cache miss rate due to random culling. Get and set operations are significantly slower compared to DjangoCache.

```text
========= ========= ========= ========= ========= ========= ========= =========
Timings for filebased
-------------------------------------------------------------------------------
   Action     Count      Miss    Median       P90       P99       Max     Total
========= ========= ========= ========= ========= ========= ========= =========
      get    712749    103843 112.772us 193.119us 423.908us  18.428ms   92.428s
      set     71431         0   8.893ms  11.742ms  14.790ms  44.201ms  646.879s
   delete      7812         0 223.875us 389.099us 679.016us  15.058ms    1.940s
    Total    791992                                                    741.247s
========= ========= ========= ========= ========= ========= ========= =========
```

Notice the higher cache miss rate. That's a result of the cache's random culling strategy. Get and set operations also take three to twenty times longer in aggregate as compared with `DjangoCache`.

--------------------------------

### FanoutCache Usage

Source: https://github.com/grantjenks/python-diskcache/blob/master/docs/tutorial.rst

Demonstrates the creation and basic usage of FanoutCache with specified shards and timeout. FanoutCache never raises Timeout exceptions itself: it catches them and aborts the operation, so `set` and `delete` may silently fail unless `retry=True` is passed. The mapping operators (`cache[key]`, `cache[key] = value`, `del cache[key]`) retry automatically.

```python
from diskcache import FanoutCache

cache = FanoutCache(shards=4, timeout=1)
```
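To make the `shards` and `timeout` parameters more concrete, here is a conceptual sketch, using only the standard library, of the general idea behind a sharded store: each key is hashed to one of several SQLite files, and a write that stays locked past the timeout is reported as a failure rather than raised. This is not DiskCache's implementation. `ToyFanout`, the `kv` table, the shard file names, and the CRC32 hash are all assumptions made for illustration.

```python
import os
import shutil
import sqlite3
import tempfile
import zlib


class ToyFanout:
    """Toy key-value store that spreads keys across several SQLite files.

    Hypothetical illustration only; not part of the diskcache API.
    """

    def __init__(self, directory, shards=4, timeout=1.0):
        self._conns = []
        for index in range(shards):
            path = os.path.join(directory, f"shard-{index:03d}.db")
            # `timeout` is how long SQLite waits on a locked database.
            conn = sqlite3.connect(path, timeout=timeout)
            conn.execute(
                "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)"
            )
            conn.commit()
            self._conns.append(conn)

    def shard_index(self, key):
        # A stable hash keeps a key on the same shard across processes.
        return zlib.crc32(key.encode("utf-8")) % len(self._conns)

    def set(self, key, value):
        conn = self._conns[self.shard_index(key)]
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value)
                )
        except sqlite3.OperationalError:
            # For example, the shard stayed locked past the timeout:
            # report failure to the caller instead of raising.
            return False
        return True

    def get(self, key, default=None):
        conn = self._conns[self.shard_index(key)]
        row = conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return default if row is None else row[0]

    def close(self):
        for conn in self._conns:
            conn.close()


# Usage: four shard files in a temporary directory.
tmp = tempfile.mkdtemp()
store = ToyFanout(tmp, shards=4, timeout=1.0)
ok = store.set("alice", "1")
value = store.get("alice")
missing = store.get("bob", default="n/a")
shard_files = sorted(name for name in os.listdir(tmp) if name.endswith(".db"))
store.close()
shutil.rmtree(tmp)
```

Because writers to different shards do not contend for the same database lock, adding shards tends to reduce blocking among concurrent writers. Returning `False` from `set` mirrors the idea that a timed-out write can fail quietly, so callers that need the write to land should check the return value.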