org.hd.d.pg2k.svrCore.datasource
Class ExhibitDataSimpleCache

java.lang.Object
  extended by org.hd.d.pg2k.svrCore.datasource.ExhibitDataSimpleCache
All Implemented Interfaces:
SimpleExhibitPipelineIF, BasicVarMgrInterface, SimpleVariablePipelineIF

public final class ExhibitDataSimpleCache
extends java.lang.Object
implements SimpleExhibitPipelineIF

Exhibit pipeline cache stage. This performs transparent persistent cacheing of exhibit data and variables.

The presence of an instance of this stage upstream of a tunnel or other potentially slow/expensive/unreliable data source should, for normal data access (eg sequential download of exhibits), significantly reduce upstream bandwidth requirements and reduce downstream latency by answering requests from local cache.

This also attempts to shelter its downstream callers/users from I/O errors upstream, by fulfilling requests from cache, or, when synchronous calls upstream have to be made, transforming some requests and replies into async forms where possible.

When caching exhibit data this class only does so as a continuous prefix from offset zero; other (random) accesses may have to read-through the cache.

This cache is also able to precache data likely to be valuable, such as thumbnails and the initial portions of exhibits, though this will only be attempted if the cache appears to be in use. The bandwidth and other resources consumed by precacheing are regulated. This cache regards thumbnails and meta-data as precious, and tries not to let them go once collected and cached, because reasonable application performance will often depend on fast access to these data.

This attempts to cache data well enough that, especially if aggressive (pre)cacheing is available and the cache area is large enough, almost no reference should be needed to the backend server in response to a data request on the pipeline except in response to the tail end of very long exhibits; all requests are answered from the local cache where possible. Ideally, it should be possible for the back-end server to go down altogether and have the front end still provide a high degree of functionality. The front-end and back-end are almost completely decoupled in this cache design.

This includes limited in-memory cacheing, in many cases using soft references to allow peaceful coexistence with other (potentially heavy) users of memory.
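As a rough illustration of the soft-reference approach, the sketch below holds values via java.lang.ref.SoftReference so that the garbage collector may reclaim them under memory pressure. The class name SoftValueCache is hypothetical; this is not the MemoryTools.SoftReferenceMap used by this class, whose behaviour may differ.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; not the project's MemoryTools.SoftReferenceMap. */
final class SoftValueCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    /** Store a value; the GC may clear it when memory runs short. */
    void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    /** Return the cached value, or null if absent or already reclaimed. */
    V get(K key) {
        final SoftReference<V> ref = map.get(key);
        if (ref == null) { return null; }
        final V value = ref.get();
        if (value == null) { map.remove(key, ref); } // Drop the cleared entry.
        return value;
    }
}
```

A caller must always be prepared for get() to return null and fall back to the on-disc cache or upstream.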

This class relies mainly on the asynchronous calls to poll() to fetch meta-data updates such as GenProps and AllExhibitImmutableData. These happen under a private lock and do not block cache access much or at all.

The full lock ordering (where multiple locks need to be taken at once) is:

  1. ExhibitDataSimpleCache.rwl
  2. ExhibitDataSimpleCache.metaData
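The ordering above might be honoured as in the following minimal sketch. It assumes, purely for illustration, that metaData is guarded by its own monitor (synchronized); the class and field names here are stand-ins, not the real implementation.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of the documented lock order: rwl first, then metaData. */
final class LockOrderSketch {
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    private final Object metaData = new Object(); // Stand-in for the real MetaData.
    private int entryCount;                       // Guarded by metaData.

    /** Add an entry while holding both locks, taken in the documented order. */
    int addEntry() {
        rwl.writeLock().lock();           // 1. Cache-wide lock first.
        try {
            synchronized (metaData) {     // 2. Then the metaData lock.
                return ++entryCount;
            }
        } finally {
            rwl.writeLock().unlock();     // Always release, even on error.
        }
    }
}
```

Taking the locks in the same order on every code path is what avoids deadlock between threads that need both.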

This cache may serialise access to raw exhibit data (and may serialise accesses to back-end resources too). No two live instances of this class should refer to the same cache directory at once else madness and corruption will almost certainly break out.

On disc, the files are some prefix of the full exhibit, retrieved if possible in MAX_TRANSFER_CHUNK_SIZE chunks. They are touched each time they are accessed or updated, and the timestamps can therefore be used as the basis of an LRU cache. We expect almost all access to be sequential, starting at the beginning.
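To show how last-touched timestamps can drive LRU eviction, here is a small sketch that orders cache entries oldest-first. The map of names to timestamps stands in for file modification times; the class and method names are hypothetical and not part of this class.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: derive an LRU eviction order from last-touched times. */
final class LruByTimestampSketch {
    /**
     * @param lastTouchedMs map from entry name to last access/update time (ms)
     * @return entry names, least recently touched first
     */
    static List<String> evictionOrder(final Map<String, Long> lastTouchedMs) {
        final List<String> names = new ArrayList<>(lastTouchedMs.keySet());
        names.sort(Comparator.comparingLong((String n) -> lastTouchedMs.get(n)));
        return names;
    }
}
```

Evicting from the front of the returned list removes the entries that have gone longest without being read or extended.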

We satisfy requests wholly within already-cached data immediately, and will extend (up to the limit) by up to one chunk each time by downloading from the server to satisfy requests just beyond the current end. Requests starting well beyond the current end of cache are punted directly to the server, which is rather ugly and slow, but there we go.
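The three cases above can be sketched as a simple decision function. This is an interpretation, not the actual code: it assumes that "just beyond the current end" means the request starts at or before the cached end and finishes within one chunk of it. All names are hypothetical.

```java
/** Hypothetical sketch of the prefix-cache request policy described above. */
final class PrefixCachePolicySketch {
    enum Action { SERVE_FROM_CACHE, EXTEND_CACHE, READ_THROUGH }

    /**
     * @param cachedLen    bytes cached as a continuous prefix from offset zero
     * @param reqStart     start offset of the request
     * @param reqLen       length of the request (bytes)
     * @param chunkSize    maximum extension per request (cf MAX_TRANSFER_CHUNK_SIZE)
     * @param maxCacheable limit on bytes cached for one exhibit
     */
    static Action decide(long cachedLen, long reqStart, int reqLen,
                         int chunkSize, long maxCacheable) {
        final long reqEnd = reqStart + reqLen;
        if (reqEnd <= cachedLen) { return Action.SERVE_FROM_CACHE; }
        final boolean contiguous = reqStart <= cachedLen;          // No hole would be left.
        final boolean withinOneChunk = reqEnd <= cachedLen + chunkSize;
        final boolean withinLimit = reqEnd <= maxCacheable;
        if (contiguous && withinOneChunk && withinLimit) { return Action.EXTEND_CACHE; }
        return Action.READ_THROUGH;                                // Punt to the server.
    }
}
```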

This cache only considers general precacheing until the low-water mark is reached, and by default only deletes existing entries if it has to in order to satisfy an incoming cacheable request. This means that stale entries for deleted/renamed exhibits may persist for a while, but this is mainly harmless.
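A minimal sketch of the two rules in this paragraph follows: precache only while below the low-water mark, and evict only as much as an incoming cacheable request needs. The method names and the arithmetic are assumptions for illustration; the real class may compute these differently.

```java
/** Hypothetical sketch of the low-water-mark and evict-on-demand rules. */
final class LowWaterMarkSketch {
    /** True while the cache holds fewer bytes than the low-water mark. */
    static boolean mayPrecache(long bytesInCache, long maxCacheBytes,
                               float lowWaterFraction) {
        final long lowWaterBytes = (long) (maxCacheBytes * (double) lowWaterFraction);
        return bytesInCache < lowWaterBytes;
    }

    /** Bytes that must be evicted to fit an incoming request; zero if it already fits. */
    static long bytesToEvict(long bytesInCache, long maxCacheBytes, long incomingBytes) {
        return Math.max(0L, bytesInCache + incomingBytes - maxCacheBytes);
    }
}
```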

This maintains a bidirectional cache of variable values and updates, and also some running parameters of the cache may be read as variables.

This cache supports a limited amount of peer-to-peer (P2P) data transfers to reduce load on the master. The general policy is that any synchronous (and thus presumably time-sensitive) data request from an end user that cannot be satisfied from local cache is satisfied upstream from the master. Asynchronous data fetches, such as read-ahead and precache activity, can be fetched P2P. Also if the master fails or is unavailable then it may be acceptable to use P2P if loops/cycles can be avoided.
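The general policy might be expressed as in the sketch below. It is a simplification under stated assumptions: loop/cycle avoidance is left to the caller, and asynchronous fetches always choose a peer when one is available, whereas the text only says they "can". The names are hypothetical.

```java
/** Hypothetical sketch of the master-versus-peer source policy. */
final class FetchSourceSketch {
    enum Source { MASTER, PEER }

    /**
     * @param synchronousUserRequest true for time-sensitive end-user requests
     * @param masterAvailable        false if the master has failed or is unreachable
     * @param peersAvailable         true if at least one usable peer is known
     */
    static Source choose(boolean synchronousUserRequest,
                         boolean masterAvailable, boolean peersAvailable) {
        if (!peersAvailable) { return Source.MASTER; }  // Nothing else to try.
        if (!masterAvailable) { return Source.PEER; }   // Caller must avoid loops/cycles.
        // Synchronous requests go upstream to the master;
        // read-ahead and precache fetches may go P2P.
        return synchronousUserRequest ? Source.MASTER : Source.PEER;
    }
}
```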

TODO: re-analyse/reduce locking and possibly avoid locks on the metaData object.


Nested Class Summary
private static class ExhibitDataSimpleCache.CachedFile
          Object representing one (partially) cached file on disc.
private static class ExhibitDataSimpleCache.MetaData
          Cache meta-data class.
 
Nested classes/interfaces inherited from interface org.hd.d.pg2k.svrCore.datasource.SimpleExhibitPipelineIF
SimpleExhibitPipelineIF.PropsKey
 
Field Summary
private  AllExhibitProperties _AEP
          Cached AllExhibitProperties; never null.
private  boolean _aggressive
          Flag to adjust the aggressiveness of the cache; by default not aggressive.
private  java.util.Set<Name.ExhibitFull> _bestExhibits
          Contains the full exhibit names of the "best" few exhibits for enhanced precacheing; never null.
private  long _checkMetaData_notBefore
          Time before which next _checkMetaData() call should not be initiated.
private  java.util.concurrent.locks.ReentrantLock _cMD_lock_
          Private lock for _checkMetaData() to avoid starting more than one thread; non-null.
private  java.util.concurrent.locks.ReentrantLock _gAEP_lock
          Private lock for _getAllExhibitProperties()/constructor to prevent re-entry and multiple concurrent AEP fetches.
private  java.util.concurrent.locks.ReentrantLock _gTfirstTNBuildLock
          A lock to allow only one of concurrent thumbnail builds to assume unlimited resources.
private  java.util.concurrent.locks.ReentrantLock _gTFP_lock
          Private lock for _getThumbnailsFromPeer() to prevent concurrent P2P thumbnail fetches; never null.
private static int _gXL_offset
          Stack offset for _getXXXXLock() to find caller's stack frame.
private  long _handleSysVars_evSave
          Last time we saved (any) event histories, private to _handleSysVars(), initially zero.
private  java.lang.Long _handleSysVars_lastFetch
          Last time we flushed/fetched variables, private to _handleSysVars().
private  java.util.Iterator<Name.ExhibitFull> _iCMEE_iterator
          Private iterator over all cached full exhibit names for _incrCheckMRUExhibitEntries().
private  long _lastPollAEP
          Last time we polled for AllExhibitImmutableData; initially 'now' to postpone first poll.
private  long _lastPollGp
          Last time we polled for genProps.
private  long _lastPollGSp
          Last time we polled for genSecProps.
private  java.util.concurrent.locks.ReentrantLock _metadataSave_lock_
          Lock to prevent concurrent attempts to save metadata; non-null.
private  long _noMorePrecacheUntil
          Time before which we will not do more precacheing.
private  java.lang.Long _precacheExhibitHash
          Indicator for which image set we are working on.
private  java.util.Iterator<Name.ExhibitFull> _precacheIterator
          An iterator over a snapshot of all exhibit names.
private  java.util.concurrent.locks.ReentrantLock _preCacheLock
          Precache lock to prevent multi-threaded precache runs.
private  long _saveMetaData_notBefore
          Time before which not to save metaData again; private to _cleanAndSaveMetaData().
private  MemoryTools.SoftReferenceMap<Name.ExhibitFull,java.lang.Object> _thumbnailsInMemory
          Private in-memory cache of deserialised thumbnails; never null.
private  MemoryTools.RecurrentEmergencyFreeHandle _timREFH
          If we run very low on space then discard the thumbnails and just keep the do-not-retry Long timestamps.
private  boolean _userRequestedDataFromCache
          Set true when a user requests data from the cache.
private static boolean ALLOW_DATA_FETCH_FROM_PEERS
          If true, allow us to try fetching exhibit data from peers rather than master.
private static boolean ALLOW_SYNC_TN_FETCH
          If true then allow missing thumbnails to be fetched synchronously at the risk of blocking for extended periods.
private static boolean ALLOW_TN_FETCH_FROM_PEERS
          If true, allow us to try fetching thumbnails from peers rather than only the master.
private  java.util.Map<java.lang.String,java.lang.Long> altDataSourceRating
          Thread-safe Map from mirror ID to strictly-positive rating with "" for master; never null.
private static boolean ASSUME_LOADED_METADATA_OK
          If true, assume that newly-loaded meta-data is OK at start-up until proven otherwise.
private static java.lang.String CACHE_BASE_DIR
          Base dir within cache dir for all our exhibit data.
private static java.lang.String CACHE_EXAUX_PREFIX
          The prefix for all aux files associated with an exhibit file.
private static java.lang.String CACHE_EXAUX_TIMESTAMP_KW
          The keyword for the file containing the (decimal) exhibit timestamp.
private static java.lang.String CACHE_EXAUX_TN_KW
          The keyword for the file containing the serialised thumbnails object.
private static java.lang.String CACHE_EXDATA_DIR
          Base dir within cache dir for all our raw exhibit content data.
private static java.lang.String CACHE_EXPROPS_FILENAME
          Name of file in which to persist immutable exhibit data.
private static java.lang.String CACHE_METADATA_FILENAME
          Name of file in which to persist cache meta data.
private  java.io.File cacheDir
          The cache dir, else null.
private static boolean CALC_MISSING_EPCM_DURING_PRECACHE
          If true then try to at least partially compute EPCM while precacheing.
private  long consTime
          Time of construction.
private  boolean destroyed
          Set true once destroy() is called; never set false again.
private static int DISC_RECHECK_INTERVAL_MS
          Approximate minimum interval between rechecks of on-disc cache.
private  java.util.concurrent.ThreadPoolExecutor discardableReadAheadTaskThreadPool
          Shared thread pool for I/O-bound activities (for thumbnail fetching).
private static java.lang.String EVENT_HISTORY_DIR
          Base dir within cache dir for all our event history data.
private  AllExhibitProperties.ExhibitDataSource exhibitDataSource
          An AllExhibitProperties.ExhibitDataSource wrapping ourselves; never null.
private static int FALLBACK_MIN_CACHE_SIZE
          Minimum cache size to allow if GenProps is not set (bytes).
private static boolean FORCE_IMMEDIATE_SAVE_ON_EXPANDED_METADATA
          If true, synchronously force a save of meta-data each time we add a new entry at least.
private  GenProps genProps
          Our record of the current GenProps; never null.
private  java.util.Properties genSecProps
          Our record of the current genSecProps (security properties); never null.
private static java.lang.String KEY_debugFlag_P2P_BLOCKXFER
          Key in generic props of P2P-profiling flag.
private  SimpleLoggerIF logger
          Our local logger; never null.
private static float LOW_WATER_FRACTION
          Fraction of max cache size that is the low-water mark.
private static java.lang.String MASTER_FAKE_TAG
          Fake tag we use to indicate a fetch from the master/upstream via the pipeline.
private static int MAX_BEST_EX_PRECACHED
          Maximum number of "best" exhibits to get enhanced precaching; non-negative.
private  int MAX_dPC_BACKOFF_TIME
          Max time _doPreCache() has to sleep for (ms).
private  int MAX_dPC_SPIN_TIME_MS
          Maximum time that _doPreCache() can spend in one go (ms).
private static int MAX_EXTD_TRANSFER_CHUNK_SIZE
          Maximum extended exhibit data transfer chunk size (bytes); strictly positive.
private static int MAX_LOCK_ATTEMPTS
          Maximum number of consecutive attempts to obtain lock for read or write; strictly positive.
private static int MAX_QUEUED_TN_FETCHES
          Maximum number of async thumbnail fetches to queue; strictly positive.
private static int MAX_REMOTE_FETCH_TO_MAKE_THUMBNAIL
          The maximum number of bytes that we will force to be transferred in order to generate a thumbnail immediately.
private static int MAX_THREADS_aTWQ
          Maximum number of threads that may run in _asyncTNFetch() and other local discardable data read-ahead tasks; strictly positive.
private static int MAX_TRANSFER_CHUNK_SIZE
          Maximum (normal) exhibit data transfer chunk size (bytes); strictly positive.
private static long MAX_WAIT_BETWEEN_THUMBNAIL_REPEAT_FETCHES_MS
          Maximum wait time between attempts to fetch or generate thumbnails (ms) by long-running cache; strictly positive.
private  ExhibitDataSimpleCache.MetaData metaData
          In-memory copy of whole-cache meta-data; never null.
private static int METADATA_MIN_SAVE_INTERVAL_MS
          Approximate minimum interval between saves of the metadata; strictly positive.
private static int MIN_AEP_POLL_TIME_UNTIL_LOADED_MS
          Minimum time before attempting to poll again for AEP while we don't have a real one loaded (ms).
private static int NEARLY_TOP_FACTOR
          Factor/multiplier of peers worse than top that will be considered for fetches routinely; strictly positive.
private static long NORMAL_WAIT_BETWEEN_THUMBNAIL_REPEAT_FETCHES_MS
          Normal wait time between attempts to fetch or generate thumbnails (ms) by long-running cache; strictly positive.
private static boolean ORPHANED_EXHIBIT_EXPIRY_ALLOWED
          If true then we may purge cached data for exhibits that appear to have been deleted (or renamed).
private static long ORPHANED_EXHIBIT_MIN_UNUSED_TIME_MS
          The minimum time before we will preemptively purge orphaned cache entries (ms); strictly positive.
private  int P2P_NEXT_BEST_FRAC
          Fraction of the time to choose a 2nd-tier peer rather than the best peer; strictly positive.
private  int P2P_RND_FRAC
          Fraction of the time to pick a peer completely at random; strictly positive.
private static boolean PEER_SELECTION_CAUTIOUS
          If true then use a cautious strategy to select a peer to talk to.
private static int PEER_STATS_TC
          Time-constant for updating peer fetch time value; strictly positive.
private static int PEER_STATS_UNKNOWN_MS
          Default rating/time (ms) for "unknown" data source/mirror/peer; strictly positive.
private static boolean PREFER_PEERS_TO_MASTER_WHERE_POSSIBLE
          If true, avoid use of master where peers are available.
private  java.util.concurrent.locks.ReentrantReadWriteLock rwl
          The read/write lock for the whole cache except system variables; never null.
static java.lang.String SCGNAME_CACHE_VALIDATION
          General stats event name: an exhibit in the cache was fully validated against checksums, etc.
static java.lang.String SCGNAME_CACHE_VALIDATION_PART
          General stats event name: an exhibit in the cache was partially validated against checksums, etc.
static java.lang.String SCGNAME_CACHEADD
          General stats event name: an exhibit was added to the cache.
static java.lang.String SCGNAME_CACHEEVICTLRU
          General stats event name: an exhibit was evicted from the cache in LRU order.
static java.lang.String SCGNAME_CACHERAWDATAHIT
          General stats event name: cache raw data read hit.
static java.lang.String SCGNAME_CACHERAWDATAMISS
          General stats event name: cache raw data read miss.
static java.lang.String SCGNAME_CACHEREM
          General stats event name: an exhibit was removed from the cache.
static java.lang.String SCGNAME_CACHEREM_CORRUPT
          General stats event name: a corrupt exhibit was removed from the cache.
static java.lang.String SCGNAME_CACHETNHIT
          General stats event name: on-disc cache hit for thumbnail.
static java.lang.String SCGNAME_CACHETNMEMHIT
          General stats event name: in-memory cache hit for thumbnail.
static java.lang.String SCGNAME_CACHETNMISS
          General stats event name: cache miss for thumbnail.
static java.lang.String SCGNAME_DATAFETCHFROMPEER_PREFIX
          General stats event name: fetched a data block from a peer.
static java.lang.String SCGNAME_EXDATAREQIN
          General stats event name: incoming request for exhibit data.
static java.lang.String SCGNAME_EXDATAREQINDC
          General stats event name: incoming request for exhibit data with "dontCache" flag set.
static java.lang.String SCGNAME_EXTHUCREATED
          General stats event name: created thumbnails locally from cached data.
static java.lang.String SCGNAME_EXTHUREQIN
          General stats event name: incoming request for exhibit thumbnails.
static java.lang.String SCGNAME_EXTHUREQINDC
          General stats event name: incoming request for exhibit thumbnails with "dontCreate" flag set.
static java.lang.String SCGNAME_MDSAVE
          General stats event name: the cache metadata was saved.
static java.lang.String SCGNAME_PRECACHEERROR
          General stats event name: errors encountered during precaching.
static java.lang.String SCGNAME_PRECACHEEXAMINED
          General stats event name: exhibits examined for precaching.
static java.lang.String SCGNAME_PRECACHEEXDATABLOCK
          General stats event name: exhibit data block precached.
static java.lang.String SCGNAME_PRECACHERESTART
          General stats event name: restarted scanning all exhibits for precaching.
static java.lang.String SCGPREF_PRECACHEEXDATABLOCKFETCHTIME
          General stats event name prefix: exhibit data block precache (successful) fetch time (log2 ms).
static java.lang.String SCGPREF_PRECACHEEXDATABLOCKSRC
          General stats event name prefix: exhibit data block precache source (if not from master/upstream).
static java.lang.String SCGPREF_PRECACHEEXDATABLOCKSRCERR
          General stats event name prefix: exhibit data block precache source for error (if not from master/upstream).
private  SimpleExhibitPipelineIF source
          The upstream data source; never null.
private  StatsLogger.StatsConfig statsIDSCGEN
          The stats set to which we log general cache behaviour.
private static boolean STORE_EXPROPS_GZIPED
          If true, store exprops data (and cache metadata) GZIPed to possibly save space and I/O time.
private static boolean THUMBNAIL_ACCESS_UPDATES_ACCESS_TIMESTAMP
          If true then accessing a thumbnail marks its exhibit as accessed.
private static boolean TRACE_P2P_ACTIVITY
          If true then trace/log P2P activity; defaults to false/off.
private static boolean TRACE_THUMBNAIL_ACTIVITY
          If true then trace/log interesting/unusual thumbnail activity; defaults to false/off.
private static boolean TRACE_THUMBNAIL_ACTIVITY_ALL
          If true then trace/log all thumbnail activity; defaults to false/off.
private  Stratum upstreamStratum
          Our stratum cached; never null though may be UNKNOWN.
private static int VAR_CACHE_HOLD_TIME_MS
          Variable flush/retrieval interval (ms); strictly positive.
private  PipelineVarMgr varMgr
          Manages our local cache of variables, etc; never null.
 
Fields inherited from interface org.hd.d.pg2k.svrCore.datasource.SimpleExhibitPipelineIF
MAX_USER_READ_SIZE
 
Constructor Summary
private ExhibitDataSimpleCache(SimpleExhibitPipelineIF dataSource, java.io.File cacheDir, SimpleLoggerIF logger)
          Wrap a new cache instance around a data source.
 
Method Summary
private  void _asyncTNFetch(Name.ExhibitFull exhibitName)
          Attempt to asynchronously fetch/create thumbnail that we have failed to return to the user.
private  void _checkMetaData()
          Initiates, in the background, a check of the in-memory cache meta data against disc.
private  void _cleanAndSaveMetaData(boolean force)
          Saves the cache metadata if needed.
private  void _doCacheDataValidityTest(AllExhibitProperties aep, ExhibitStaticAttr esa)
          Partly check the cache data (including metadata, tns, etc) for validity.
private  void _doPreCache(GenProps gp)
          Routine to do incremental pre-cacheing.
private  void _getAllExhibitProperties_postUpdate(AllExhibitProperties new_AEP, boolean inCons)
          Accepts a (new) AEP value posted from a background thread.
private  void _getAllExhibitProperties()
          Attempts to get all exhibit properties if our cached copy may be stale.
private  void _getExhibitDataFromUpstreamToPrecache(ExhibitStaticAttr esa, AllExhibitImmutableData aeid, GenProps gp, long start, int len, boolean forceIt)
          Try to extend cached data for the specified exhibit.
private  void _getGenProps()
          Attempts to get sysprops if our cached copy may be stale.
private  void _getGenSecProps()
          Attempts to get gensecprops if our cached copy may be stale.
(package private) static int _getMaximumCacheableBytesForOneExhibit(GenProps gp)
          Computes the maximum number of bytes to cache from (the start of) any one exhibit; strictly positive.
private static void _getReadLock(java.util.concurrent.locks.ReentrantReadWriteLock rwl, java.lang.String detail, SimpleLoggerIF logger)
          Get a cache read lock, complaining/aborting if we have to wait for a long time.
private  ExhibitThumbnails _getThumbnails(Name.ExhibitFull name, boolean create, boolean allowSyncFetch)
          Get the thumbnails for an exhibit; null if not available.
private  ExhibitThumbnails _getThumbnailsFromPeer(Name.ExhibitFull name)
          Attempt to fetch the specified thumbnails from any peer; may be null if currently unavailable.
private static void _getWriteLock(java.util.concurrent.locks.ReentrantReadWriteLock rwl, java.lang.String detail, SimpleLoggerIF logger)
          Get the cache write lock, complaining/aborting if we have to wait for a long time.
private  void _handleSysVars(boolean force)
          Handle (update, sync, persist) system variables as required.
private  void _incrCheckMRUExhibitEntries(java.util.Set<Name.ExhibitFull> done)
          Incrementally check cached exhibits for integrity.
private  void _incrPurgeOrphanedExhibits()
          Do incremental purge of orphaned cache entries if conditions are right.
private  long _instanceLifems()
          How long this instance has been alive in milliseconds.
private  java.lang.String _pickPeer()
          Pick a peer to attempt to fetch exhibit data from; never null.
private  java.lang.String _pickPeer(java.util.Set<java.lang.String> activeMirrors)
          Pick one of the supplied peers to attempt to fetch exhibit data from; never null.
private  void _removeCorruptData(ExhibitDataSimpleCache.CachedFile cf)
          Remove cached exhibit data identified as corrupt.
private  void _save_AEP()
          Save AllExhibitProperties to disc.
private  boolean _updateOneExhibit(ExhibitStaticAttr esa, GenProps gp, AllExhibitProperties aep, boolean forceDataFetch)
          Update one exhibit incrementally during precaching.
private  void _updatePeerStats(java.lang.String peer, boolean fetchSuccessful, long timeTaken)
          Update data-transfer stats for the given peer.
static ExhibitDataSimpleCache cacheFactory(SimpleExhibitPipelineIF dataSource, java.io.File cacheDir, SimpleLoggerIF logger)
          Get an instance of this class; may be a singleton.
 void destroy()
          Shut down the data pipeline.
 AllExhibitImmutableData getAllExhibitImmutableData(long oldStamp)
          Gets all static exhibit data if its timestamp is not that specified.
 AllExhibitProperties getAllExhibitProperties(long oldHash)
          Gets set of all exhibit properties if its hash is not that specified.
 EventVariableValue getEventValue(SimpleVariableDefinition def, EventPeriod intervalSelector, boolean current)
          Get the current partial, or previous full, event set at the specified interval; never returns null.
 EventVariableValue[] getEventValues(SimpleVariableDefinition def, EventPeriod intervalSelector, long intervalNumber, java.util.BitSet whichValues)
          Get the specified event sets for the specified intervals; never null.
 GenProps getGenProps(long oldStamp)
          Gets the general properties as a GenProps object if its timestamp is not that specified.
 java.util.Properties getGenSecProps(long oldStamp)
          Gets the security properties as a Properties object if its timestamp is not that specified.
 int getLiveCachedExhibitCount()
          Return directly the number of partly- or fully-cached exhibits; never negative.
 java.util.Properties getProperties(SimpleExhibitPipelineIF.PropsKey key, long versionID)
          Get requested Properties selected by key and versionID.
 void getRawFile(java.nio.ByteBuffer buf, Name.ExhibitFull exhibitName, int position, boolean dontCache)
          Get a chunk of the raw exhibit binary.
 ExhibitStaticAttr getStaticAttr(Name.ExhibitFull name)
          Get the static attributes for a given exhibit; null if no such exhibit.
 Stratum getStratum()
          Return cached stratum; never null.
 ExhibitThumbnails getThumbnails(Name.ExhibitFull name, boolean create)
          Gets the thumbnails for an exhibit.
 SimpleVariableValue getVariable(SimpleVariableDefinition var)
          Get a single variable value; returns null if no such value or wrong type.
 SimpleVariableValue[] getVariables(long changedSince)
          Get immutable Set of variable values altered on or after a given time, or all for -1.
private  boolean isPopularDownload(ExhibitStaticAttr esa)
          In the top-N (global) downloads recently, with at least 2 downloads.
 void poll(GenProps _gp)
          Poll periodically (of the order of a second) to do cache maintenance.
static void rmCache(java.io.File cacheDir)
          Remove a persistent cache.
 void setAggressive(boolean isAggressive)
          Set the aggressiveness of the cache; by default not aggressive.
 void setVariable(SimpleVariableValue newValue)
          Set variable.
 int setVariables(SimpleVariableValue[] newValues)
          Update a number of variables at once for efficiency.
 void syncVariables(boolean force)
          Synchronise with upstream values.
private  boolean upstreamSourceIsLocal()
          Returns true iff upstream is local disc so some operations should be cheap without cacheing.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

TRACE_THUMBNAIL_ACTIVITY_ALL

private static final boolean TRACE_THUMBNAIL_ACTIVITY_ALL
If true then trace/log all thumbnail activity; defaults to false/off. Usually only true while debugging/tuning.

Can be activated from the command-line.


TRACE_THUMBNAIL_ACTIVITY

private static final boolean TRACE_THUMBNAIL_ACTIVITY
If true then trace/log interesting/unusual thumbnail activity; defaults to false/off. Usually only true while debugging/tuning.

Generally shows rare events such as generation of thumbnails and request to save NO_THUMBNAILS values and other non-routine activity.

Forced to true if TRACE_THUMBNAIL_ACTIVITY_ALL is true.

Can be activated from the command-line.


TRACE_P2P_ACTIVITY

private static final boolean TRACE_P2P_ACTIVITY
If true then trace/log P2P activity; defaults to false/off. Usually only true while debugging/tuning.

Can be activated from the command-line.


ORPHANED_EXHIBIT_EXPIRY_ALLOWED

private static final boolean ORPHANED_EXHIBIT_EXPIRY_ALLOWED
If true then we may purge cached data for exhibits that appear to have been deleted (or renamed). These are exhibits that have been "orphaned" in the cache, ie they are not accessible because they do not logically exist (though they do potentially serve as a backup in case of disaster).

We will generally only do this if:

We do not rush to delete exhibits' data in case a transient problem has made an exhibit disappear temporarily. (Data for deleted exhibits will in any case eventually be deleted LRU (Least-Recently-Used) if the cache becomes full.)

See Also:
Constant Field Values

ORPHANED_EXHIBIT_MIN_UNUSED_TIME_MS

private static final long ORPHANED_EXHIBIT_MIN_UNUSED_TIME_MS
The minimum time before we will preemptively purge orphaned cache entries (ms); strictly positive. We don't expect to delete or rename exhibits very often, and the only harm in NOT purging them may be to prevent precacheing of new exhibits, ie a minor performance issue rather than a correctness issue.

Add a random component so that all clients do not purge orphans at once!

A value of the order of a few days to a few months is probably reasonable.
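One way to add such a random component is sketched below. The class name and the interval values used in the usage note are illustrative assumptions only, not the constants actually used by this class.

```java
import java.util.Random;

/** Hypothetical sketch: add random jitter so clients do not all purge at once. */
final class JitteredIntervalSketch {
    /**
     * @param baseMs      fixed part of the interval (ms); strictly positive
     * @param maxJitterMs upper bound (exclusive) on the random extra (ms); strictly positive
     * @return a value in the range [baseMs, baseMs + maxJitterMs)
     */
    static long jittered(long baseMs, long maxJitterMs, Random rnd) {
        if (baseMs <= 0 || maxJitterMs <= 0) {
            throw new IllegalArgumentException("intervals must be strictly positive");
        }
        return baseMs + (long) (rnd.nextDouble() * maxJitterMs);
    }
}
```

For example, jittered(30 days, 7 days, rnd) would give each client its own threshold somewhere between 30 and 37 days.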


CACHE_BASE_DIR

private static final java.lang.String CACHE_BASE_DIR
Base dir within cache dir for all our exhibit data.

See Also:
Constant Field Values

CACHE_EXPROPS_FILENAME

private static final java.lang.String CACHE_EXPROPS_FILENAME
Name of file in which to persist immutable exhibit data.

See Also:
Constant Field Values

CACHE_METADATA_FILENAME

private static final java.lang.String CACHE_METADATA_FILENAME
Name of file in which to persist cache meta data.

See Also:
Constant Field Values

STORE_EXPROPS_GZIPED

private static final boolean STORE_EXPROPS_GZIPED
If true, store exprops data (and cache metadata) GZIPed to possibly save space and I/O time.

See Also:
Constant Field Values

CACHE_EXDATA_DIR

private static final java.lang.String CACHE_EXDATA_DIR
Base dir within cache dir for all our raw exhibit content data.

See Also:
Constant Field Values

EVENT_HISTORY_DIR

private static final java.lang.String EVENT_HISTORY_DIR
Base dir within cache dir for all our event history data.

See Also:
Constant Field Values

CACHE_EXAUX_PREFIX

private static final java.lang.String CACHE_EXAUX_PREFIX
The prefix for all aux files associated with an exhibit file.

See Also:
Constant Field Values

CACHE_EXAUX_TIMESTAMP_KW

private static final java.lang.String CACHE_EXAUX_TIMESTAMP_KW
The keyword for the file containing the (decimal) exhibit timestamp.

See Also:
Constant Field Values

CACHE_EXAUX_TN_KW

private static final java.lang.String CACHE_EXAUX_TN_KW
The keyword for the file containing the serialised thumbnails object.

See Also:
Constant Field Values

MAX_TRANSFER_CHUNK_SIZE

private static final int MAX_TRANSFER_CHUNK_SIZE
Maximum (normal) exhibit data transfer chunk size (bytes); strictly positive. Maximum chunk transferred in one call (bytes), to avoid creating huge gaps in other activity by jamming up transactions and/or hogging all I/O bandwidth.

Should probably be of the order of a few tens of kBytes to allow efficient transfers on the wire, and a power of two to be more likely to interact efficiently with other caches (and network protocols).

If we use (a small multiple of) the bulk data transfer chunk size this will be reasonably efficient in terms of disc/network traffic, and if we can also keep it aligned to whole chunk boundaries then we may get maximally efficient accesses into disc (etc) data.
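The alignment to whole chunk boundaries mentioned above reduces to simple bit arithmetic when the chunk size is a power of two, as in this sketch. The names are hypothetical, and the chunk size of 32768 bytes in the usage note is only an example of "a few tens of kBytes", not the actual constant.

```java
/** Hypothetical sketch of power-of-two chunk alignment. */
final class ChunkAlignSketch {
    /** True iff n is a strictly positive power of two. */
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    /** Round an offset down to the start of the chunk that contains it. */
    static long chunkStart(long offset, int chunkSize) {
        if (!isPowerOfTwo(chunkSize)) {
            throw new IllegalArgumentException("chunkSize must be a power of two");
        }
        return offset & ~((long) chunkSize - 1); // Clear the low-order bits.
    }
}
```

With a chunk size of 32768, a request at offset 70000 falls in the chunk starting at offset 65536.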


MAX_EXTD_TRANSFER_CHUNK_SIZE

private static final int MAX_EXTD_TRANSFER_CHUNK_SIZE
Maximum extended exhibit data transfer chunk size (bytes); strictly positive. When we are asked for data but there is a small gap between the end of our cached data and the start of the request, we would normally have to pass the request upstream and cache none of the result. This is potentially very wasteful in P2P sharing of new exhibits.

So in this case we may allow the upstream request window to be moved back to patch up the hole from the end of our cached data to the start of the request, thus allowing us to capture the result.

This value should be a small multiple of the MAX_TRANSFER_CHUNK_SIZE.
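The window adjustment described above could look like the following sketch, which moves the start of the upstream request back to the end of the cached data when the widened request still fits within the extended chunk size. The names and the exact test are assumptions for illustration.

```java
/** Hypothetical sketch of moving the upstream request window back to patch a hole. */
final class ExtendedWindowSketch {
    /**
     * @param cachedEnd    end of the cached prefix (bytes from offset zero)
     * @param reqStart     start offset requested by the caller
     * @param reqLen       length requested by the caller (bytes)
     * @param maxExtdChunk largest widened fetch allowed (cf MAX_EXTD_TRANSFER_CHUNK_SIZE)
     * @return the offset at which the upstream fetch should start
     */
    static long upstreamStart(long cachedEnd, long reqStart, int reqLen, int maxExtdChunk) {
        final boolean gapAhead = reqStart > cachedEnd;
        final long widenedLen = (reqStart + reqLen) - cachedEnd;
        if (gapAhead && widenedLen <= maxExtdChunk) {
            return cachedEnd; // Fetch the hole too, so the result can be cached.
        }
        return reqStart;      // No gap, or gap too big: fetch exactly what was asked for.
    }
}
```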


MAX_REMOTE_FETCH_TO_MAKE_THUMBNAIL

private static final int MAX_REMOTE_FETCH_TO_MAKE_THUMBNAIL
The maximum number of bytes that we will force to be transferred in order to generate a thumbnail immediately. This might as well be at least one block, and might as well be a little bigger than the size to which we expect the resulting thumbnail to be limited, but no larger than the maximum single transfer that we can make.


FALLBACK_MIN_CACHE_SIZE

private static final int FALLBACK_MIN_CACHE_SIZE
Minimum cache size to allow if GenProps is not set (bytes). This prevents thrashing on an empty cache and should be enough to store a few thumbnails and a few reasonable image chunks.


THUMBNAIL_ACCESS_UPDATES_ACCESS_TIMESTAMP

private static final boolean THUMBNAIL_ACCESS_UPDATES_ACCESS_TIMESTAMP
If true then accessing a thumbnail marks its exhibit as accessed. That is, for the purposes of avoiding eviction from cache, accessing a thumbnail is taken to be as significant as downloading (part of) the exhibit itself.

If false then only actually reading part of the exhibit itself helps keep the exhibit "fresh" in the cache.

See Also:
Constant Field Values

DISC_RECHECK_INTERVAL_MS

private static final int DISC_RECHECK_INTERVAL_MS
Approximate minimum interval between rechecks of on-disc cache. When a check is done, the in-memory record of disc cache status is reloaded from disc, any debris is removed, etc.

A period of the order of at least a day is probably about right; not being exactly a multiple helps to ensure that we do not hit the same time every day, which might otherwise collide with other regular activity. Note that this recheck may take at least several minutes, so we don't want to do it too often!

The chosen interval is less than two days (including a random component). We use less than a day-multiple to tend to be waiting for energy to become available at a similar time each day, eg from solar PV, when the system is in energy-conserving mode. This seems frequent enough in practice.


METADATA_MIN_SAVE_INTERVAL_MS

private static final int METADATA_MIN_SAVE_INTERVAL_MS
Approximate minimum interval between saves of the metadata; strictly positive. Since access to the exhibit/thumbnail data causes this to be updated (along with more significant changes to the cache), and saving can take significant time, we do not want to save this immediately we encounter a change.

We can postpone a save for a while at the risk that if the system crashes or shuts down during that time and there was a structural change to the cache, we might have to abandon the old metadata and start again, which could be slow and a bit messy (losing some useful though not vital information).

Taking our cue from the old UNIX sync interval of 30s, a value in the range 30s to a few minutes is probably reasonable. Larger values of several minutes help reduce disc (write) activity which may be important for (say) solid-state storage such as Flash.
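
A minimum save interval of this kind is commonly enforced with a "not before" timestamp. The sketch below shows one way to do that; the class, its methods and the 60s value are assumptions for illustration and are not taken from this class.

```java
/** Hypothetical sketch: postpone metadata saves so that they happen at most once per interval. */
final class SaveThrottleSketch {
    /** Assumed minimum interval between saves (ms); strictly positive. */
    static final long MIN_SAVE_INTERVAL_MS = 60_000L;

    private long notBefore; // Earliest time of the next save; initially 0 so the first save is immediate.
    private boolean dirty;  // True if there are unsaved changes.

    /** Record that the metadata changed and will need saving at some point. */
    synchronized void markDirty() { dirty = true; }

    /** Returns true if the caller should save now; if so the next save is pushed into the future. */
    synchronized boolean shouldSaveNow(final long nowMs) {
        if (!dirty || nowMs < notBefore) { return false; }
        dirty = false;
        notBefore = nowMs + MIN_SAVE_INTERVAL_MS;
        return true;
    }
}
```

The caller would typically poll shouldSaveNow(System.currentTimeMillis()) from existing housekeeping work rather than from a dedicated timer.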


logger

private final SimpleLoggerIF logger
Our local logger; never null.


statsIDSCGEN

private final StatsLogger.StatsConfig statsIDSCGEN
The stats set to which we log general cache behaviour. The unique codes are the constants SCGNAME_XXX.


SCGNAME_MDSAVE

public static final java.lang.String SCGNAME_MDSAVE
General stats event name: the cache metadata was saved.

See Also:
Constant Field Values

SCGNAME_CACHEEVICTLRU

public static final java.lang.String SCGNAME_CACHEEVICTLRU
General stats event name: an exhibit was evicted from the cache in LRU order.

See Also:
Constant Field Values

SCGNAME_CACHEREM

public static final java.lang.String SCGNAME_CACHEREM
General stats event name: an exhibit was removed from the cache.

See Also:
Constant Field Values

SCGNAME_CACHEADD

public static final java.lang.String SCGNAME_CACHEADD
General stats event name: an exhibit was added to the cache.

See Also:
Constant Field Values

SCGNAME_CACHEREM_CORRUPT

public static final java.lang.String SCGNAME_CACHEREM_CORRUPT
General stats event name: a corrupt exhibit was removed from the cache.

See Also:
Constant Field Values

SCGNAME_CACHE_VALIDATION

public static final java.lang.String SCGNAME_CACHE_VALIDATION
General stats event name: an exhibit in the cache was fully validated against checksums, etc.

See Also:
Constant Field Values

SCGNAME_CACHE_VALIDATION_PART

public static final java.lang.String SCGNAME_CACHE_VALIDATION_PART
General stats event name: an exhibit in the cache was partially validated against checksums, etc.

See Also:
Constant Field Values

SCGNAME_CACHERAWDATAMISS

public static final java.lang.String SCGNAME_CACHERAWDATAMISS
General stats event name: cache raw data read miss. We had to go upstream for at least part of the data.

(It is possible to have a hit and a miss on the same read if part is satisfied from cache and part not.)

See Also:
Constant Field Values

SCGNAME_CACHERAWDATAHIT

public static final java.lang.String SCGNAME_CACHERAWDATAHIT
General stats event name: cache raw data read hit. We satisfied at least part of the read from cache.

(It is possible to have a hit and a miss on the same read if part is satisfied from cache and part not.)

See Also:
Constant Field Values

SCGNAME_CACHETNHIT

public static final java.lang.String SCGNAME_CACHETNHIT
General stats event name: on-disc cache hit for thumbnail.

See Also:
Constant Field Values

SCGNAME_CACHETNMEMHIT

public static final java.lang.String SCGNAME_CACHETNMEMHIT
General stats event name: in-memory cache hit for thumbnail.

See Also:
Constant Field Values

SCGNAME_CACHETNMISS

public static final java.lang.String SCGNAME_CACHETNMISS
General stats event name: cache miss for thumbnail.

See Also:
Constant Field Values

SCGNAME_DATAFETCHFROMPEER_PREFIX

public static final java.lang.String SCGNAME_DATAFETCHFROMPEER_PREFIX
General stats event name: fetched a data block from a peer.

See Also:
Constant Field Values

SCGNAME_PRECACHERESTART

public static final java.lang.String SCGNAME_PRECACHERESTART
General stats event name: restarted scanning all exhibits for precaching.

See Also:
Constant Field Values

SCGNAME_PRECACHEEXAMINED

public static final java.lang.String SCGNAME_PRECACHEEXAMINED
General stats event name: exhibits examined for precaching.

See Also:
Constant Field Values

SCGNAME_PRECACHEEXDATABLOCK

public static final java.lang.String SCGNAME_PRECACHEEXDATABLOCK
General stats event name: exhibit data block precached.

See Also:
Constant Field Values

SCGPREF_PRECACHEEXDATABLOCKSRC

public static final java.lang.String SCGPREF_PRECACHEEXDATABLOCKSRC
General stats event name prefix: exhibit data block precache source (if not from master/upstream).

See Also:
Constant Field Values

SCGPREF_PRECACHEEXDATABLOCKSRCERR

public static final java.lang.String SCGPREF_PRECACHEEXDATABLOCKSRCERR
General stats event name prefix: exhibit data block precache source for error (if not from master/upstream).

See Also:
Constant Field Values

SCGPREF_PRECACHEEXDATABLOCKFETCHTIME

public static final java.lang.String SCGPREF_PRECACHEEXDATABLOCKFETCHTIME
General stats event name prefix: exhibit data block precache (successful) fetch time (log2 ms).

See Also:
Constant Field Values
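
A "log2 ms" bucket for a fetch time can be computed cheaply from the position of the highest set bit. The sketch below shows that calculation only; how the bucket is combined with the prefix to form the full event name is not stated here, and the class and method names are hypothetical.

```java
/** Hypothetical sketch: reduce a fetch time in ms to a floor(log2) bucket number. */
final class Log2BucketSketch {
    /** Returns floor(log2(ms)) for ms of 2 or more, and 0 for smaller values. */
    static int log2Bucket(final long ms) {
        if (ms <= 1) { return 0; }
        return 63 - Long.numberOfLeadingZeros(ms);
    }
}
```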

SCGNAME_PRECACHEERROR

public static final java.lang.String SCGNAME_PRECACHEERROR
General stats event name: errors encountered during precaching.

See Also:
Constant Field Values

SCGNAME_EXDATAREQIN

public static final java.lang.String SCGNAME_EXDATAREQIN
General stats event name: incoming request for exhibit data.

See Also:
Constant Field Values

SCGNAME_EXDATAREQINDC

public static final java.lang.String SCGNAME_EXDATAREQINDC
General stats event name: incoming request for exhibit data with "dontCache" flag set.

See Also:
Constant Field Values

SCGNAME_EXTHUREQIN

public static final java.lang.String SCGNAME_EXTHUREQIN
General stats event name: incoming request for exhibit thumbnails.

See Also:
Constant Field Values

SCGNAME_EXTHUREQINDC

public static final java.lang.String SCGNAME_EXTHUREQINDC
General stats event name: incoming request for exhibit thumbnails with "dontCreate" flag set.

See Also:
Constant Field Values

SCGNAME_EXTHUCREATED

public static final java.lang.String SCGNAME_EXTHUCREATED
General stats event name: created thumbnails locally from cached data.

See Also:
Constant Field Values

ASSUME_LOADED_METADATA_OK

private static final boolean ASSUME_LOADED_METADATA_OK
If true, assume that newly-loaded meta-data is OK at start-up until proven otherwise.

See Also:
Constant Field Values

FORCE_IMMEDIATE_SAVE_ON_EXPANDED_METADATA

private static final boolean FORCE_IMMEDIATE_SAVE_ON_EXPANDED_METADATA
If true, synchronously force a save of meta-data at least each time we add a new entry. May be slow (O(n^2) for n exhibits), especially where the cache is not large enough to hold all exhibits so items are continually coming and going, but potentially makes the system more robust against loss of data.

See Also:
Constant Field Values

KEY_debugFlag_P2P_BLOCKXFER

private static final java.lang.String KEY_debugFlag_P2P_BLOCKXFER
Key in generic props of P2P-profiling flag.

See Also:
Constant Field Values

MAX_BEST_EX_PRECACHED

private static final int MAX_BEST_EX_PRECACHED
Maximum number of "best" exhibits to get enhanced precaching; non-negative.

See Also:
Constant Field Values

_bestExhibits

private final java.util.Set<Name.ExhibitFull> _bestExhibits
Contains the full exhibit names of the "best" few exhibits for enhanced precacheing; never null. Maintained/updated by _doPrecache().

This is a snapshot of what the "best" exhibits are estimated to be as each precacheing round starts, often based on "quick approximation" data.

The implementation is optimised for fast lookup with "contains()".

Thread-safe.


metaData

private final ExhibitDataSimpleCache.MetaData metaData
In-memory copy of whole-cache meta-data; never null. Note that the read/write status may change at any time.

The instance is never replaced; the state is replaced in-situ if need be to ensure that we never have two instances of this that believe they control the disc cache.


source

private final SimpleExhibitPipelineIF source
The upstream data source; never null.


cacheDir

private final java.io.File cacheDir
The cache dir, else null. If this is not a valid dir at class creation time we ensure that we save a null here.


rwl

private final java.util.concurrent.locks.ReentrantReadWriteLock rwl
The read/write lock for the whole cache except system variables; never null. Any access that may update the cache state in memory or on disc must be protected by a write lock.

Any access that may just read the cache need only have a read lock.

Most accesses may have to start by taking a write lock (for example because they may have to fetch data from upstream and insert it into the cache) but can downgrade it to a read lock as soon as they know that they will not be altering the cache at all or any further.

The main exception is any state component held in volatile fields.

Note that the variable store is internally thread-safe and does not require protection by this lock.

Ideally we want performance (ie best throughput) rather than fairness, but starvation of some users is not good.

We generally do not want lock attempts to block forever, which means that we give up attempting to obtain a lock after a given number of attempts (with a maximum time per attempt). While this is intended to limit delays in the face of I/O problems, it may also rescue us from logic errors in extremis.
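
The write-to-read downgrade mentioned above follows the standard ReentrantReadWriteLock idiom of taking the read lock before releasing the write lock. The sketch below is a minimal illustration of that idiom, not the code of this class; LockDowngradeSketch, readThrough() and cachedValue are hypothetical stand-ins for the cache and its state.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch: start with the write lock, then downgrade to a read lock. */
final class LockDowngradeSketch {
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    private int cachedValue; // Stands in for cache state; 0 means "not yet cached".

    /** Returns the cached value, filling the cache from upstreamValue on a miss. */
    int readThrough(final int upstreamValue) {
        rwl.writeLock().lock();
        try {
            if (cachedValue == 0) { cachedValue = upstreamValue; } // May alter the cache: needs the write lock.
            // Downgrade: acquire the read lock while still holding the write lock...
            rwl.readLock().lock();
        } finally {
            // ...then release the write lock; other readers may now proceed.
            rwl.writeLock().unlock();
        }
        try {
            return cachedValue; // Read-only from here on.
        } finally {
            rwl.readLock().unlock();
        }
    }
}
```

Note that the reverse (upgrading a read lock to a write lock) is not supported by ReentrantReadWriteLock, which is one reason to start with the write lock when an update may be needed.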


MAX_LOCK_ATTEMPTS

private static final int MAX_LOCK_ATTEMPTS
Maximum number of consecutive attempts to obtain lock for read or write; strictly positive.

See Also:
Constant Field Values

_gXL_offset

private static final int _gXL_offset
Stack offset for _getXXXXLock() to find caller's stack frame.

See Also:
Constant Field Values

exhibitDataSource

private final AllExhibitProperties.ExhibitDataSource exhibitDataSource
An AllExhibitProperties.ExhibitDataSource wrapping ourselves; never null.


MAX_THREADS_aTWQ

private static final int MAX_THREADS_aTWQ
Maximum number of threads that may run in _asyncTNFetch() and other local discardable data read-ahead tasks; strictly positive. We limit the amount of threading by: This limit/count/cap should generally be >1 since the work is mainly I/O bound and may be subject to significant latency, but should generally be not much more than (say) half the maximum simultaneous outbound tunnel HTTP connection count since overuse of concurrency for such connections may be vetoed anyway.

See Also:
Constant Field Values

MAX_QUEUED_TN_FETCHES

private static final int MAX_QUEUED_TN_FETCHES
Maximum number of async thumbnail fetches to queue; strictly positive. Enough to allow all of (say) one of the 'new' or 'best' pages' thumbnails to be queued.

See Also:
Constant Field Values

discardableReadAheadTaskThreadPool

private final java.util.concurrent.ThreadPoolExecutor discardableReadAheadTaskThreadPool
Shared thread pool for I/O-bound activities (for thumbnail fetching). Suitable for mainly-I/O-bound threads, thus we have a fixed thread limit. This ceiling also protects upstream servers from excess load.

A limited amount of work can be queued, but excess is handled by discarding the oldest queued items silently.

The threads in the pool are daemon threads, so will not prevent the JVM from exiting.

All threads can time out (and thus release resources) when idle.
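
A pool with the properties listed above can be built from standard java.util.concurrent parts, roughly as sketched below. The factory method, thread name and 60s idle time-out are assumptions for illustration; the actual pool may be configured differently.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: bounded daemon thread pool that silently discards the oldest queued work. */
final class ReadAheadPoolSketch {
    static ThreadPoolExecutor newPool(final int maxThreads, final int maxQueued) {
        // Daemon threads will not prevent the JVM from exiting.
        final ThreadFactory daemonFactory = r -> {
            final Thread t = new Thread(r, "read-ahead-sketch");
            t.setDaemon(true);
            return t;
        };
        final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            maxThreads, maxThreads,                         // Fixed ceiling on thread count.
            60L, TimeUnit.SECONDS,                          // Assumed idle time-out.
            new ArrayBlockingQueue<Runnable>(maxQueued),    // Limited amount of queued work.
            daemonFactory,
            new ThreadPoolExecutor.DiscardOldestPolicy());  // Excess handled by dropping the oldest item.
        pool.allowCoreThreadTimeOut(true);                  // All threads may time out when idle.
        return pool;
    }
}
```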


_gAEP_lock

private final java.util.concurrent.locks.ReentrantLock _gAEP_lock
Private lock for _getAllExhibitProperties()/constructor to prevent re-entry and multiple concurrent AEP fetches.


MIN_AEP_POLL_TIME_UNTIL_LOADED_MS

private static final int MIN_AEP_POLL_TIME_UNTIL_LOADED_MS
Minimum time before attempting to poll again for AEP while we don't have a real one loaded (ms).


_AEP

private volatile AllExhibitProperties _AEP
Cached AllExhibitProperties; never null. Volatile so that it can be safely accessed without a lock.


_lastPollAEP

private transient volatile long _lastPollAEP
Last time we polled for AllExhibitImmutableData; initially 'now' to postpone first poll. Private to _getAllExhibitProperties().


_checkMetaData_notBefore

private transient volatile long _checkMetaData_notBefore
Time before which next _checkMetaData() call should not be initiated. The initial check is usually put off a few minutes since the system is often very busy on start-up, and we don't expect significant problems anyway most of the time.

Volatile for thread-safe access without a lock.

Private to _checkMetaData() and _checkMetaData_postResults().


_cMD_lock_

private final java.util.concurrent.locks.ReentrantLock _cMD_lock_
Private lock for _checkMetaData() to avoid starting more than one thread; non-null.


_saveMetaData_notBefore

private transient long _saveMetaData_notBefore
Time before which not to save metaData again; private to _cleanAndSaveMetaData().


_metadataSave_lock_

private final java.util.concurrent.locks.ReentrantLock _metadataSave_lock_
Lock to prevent concurrent attempts to save metadata; non-null.


_iCMEE_iterator

private transient java.util.Iterator<Name.ExhibitFull> _iCMEE_iterator
Private iterator over all cached full exhibit names for _incrCheckMRUExhibitEntries(). Must only be accessed under a lock on the metadata object to prevent concurrent/unsafe access to the iterator object. The underlying data being iterated over is guaranteed not to change, though may become stale wrt the metadata and cache, so some items returned by the iterator may no longer be relevant.

May be null.

Marked transient to avoid being serialised.


_lastPollGp

private transient volatile long _lastPollGp
Last time we polled for genProps. Private to _getGenProps() and is volatile to avoid needing locked access.


_lastPollGSp

private transient volatile long _lastPollGSp
Last time we polled for genSecProps. Private to _getGenSecProps(); is volatile to avoid the need for locking.


genSecProps

private volatile java.util.Properties genSecProps
Our record of the current genSecProps; never null. Maintained by poll(); is volatile to avoid the need for locking.


MAX_WAIT_BETWEEN_THUMBNAIL_REPEAT_FETCHES_MS

private static final long MAX_WAIT_BETWEEN_THUMBNAIL_REPEAT_FETCHES_MS
Maximum wait time between attempts to fetch or generate thumbnails (ms) by long-running cache; strictly positive. We have this in order to avoid pestering a master server unnecessarily or wasting CPU cycles attempting to build a thumbnail.

A value of several times the allowed system latency/skew up to of the order of a day in the expectation of a daily exhibit-accession and thumbnail-build cycle on the server is probably reasonable.

We randomise the value so that different clients will not conflict with one another unduly.

We may wait longer than this when resource-constrained.

We may wait less than this when the cache is relatively young.


NORMAL_WAIT_BETWEEN_THUMBNAIL_REPEAT_FETCHES_MS

private static final long NORMAL_WAIT_BETWEEN_THUMBNAIL_REPEAT_FETCHES_MS
Normal wait time between attempts to fetch or generate thumbnails (ms) by long-running cache; strictly positive. We have this in order to avoid pestering a master server unnecessarily or wasting CPU cycles attempting to build a thumbnail; this is used where the upstream server doesn't seem too busy or in case of apparent transient network error.

A value of a few minutes is good for this purpose.

We randomise the value so that different clients will not conflict with one another unduly.

We may wait longer than this when resource-constrained.

We may wait less than this when the cache is relatively young.


_thumbnailsInMemory

private final MemoryTools.SoftReferenceMap<Name.ExhibitFull,java.lang.Object> _thumbnailsInMemory
Private in-memory cache of deserialised thumbnails; never null. This Map is guaranteed thread-safe and highly-concurrent.

Holding a lock on this object will not prevent updates to it.

This is a mapping:

If we run very low on space then discard the thumbnails and just keep the do-not-retry Long timestamps.


_timREFH

private final MemoryTools.RecurrentEmergencyFreeHandle _timREFH
If we run very low on space then discard the thumbnails and just keep the do-not-retry Long timestamps. This will free up the bulk of the memory but may continue to save lots of nugatory effort.

We have to hold a reference to the handle to prevent it expiring.


ALLOW_SYNC_TN_FETCH

private static final boolean ALLOW_SYNC_TN_FETCH
If true then allow missing thumbnails to be fetched synchronously at the risk of blocking for extended periods.

See Also:
Constant Field Values

consTime

private final long consTime
Time of construction.


_gTfirstTNBuildLock

private final java.util.concurrent.locks.ReentrantLock _gTfirstTNBuildLock
A lock to allow only one of concurrent thumbnail builds to assume unlimited resources. Other thumbnail builds may try with whatever memory (etc) remains, but at most one gets special privileges.


_gTFP_lock

private final java.util.concurrent.locks.ReentrantLock _gTFP_lock
Private lock for _getThumbnailsFromPeer() to prevent concurrent P2P thumbnail fetches; never null.


varMgr

private final PipelineVarMgr varMgr
Manages our local cache of variables, etc; never null. We set this up to be:


VAR_CACHE_HOLD_TIME_MS

private static final int VAR_CACHE_HOLD_TIME_MS
Variable flush/retrieval interval (ms); strictly positive. Based on the allowed distribution latency as centrally defined.

We randomise this a little to help avoid many slaves bothering the master simultaneously.
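
Randomising an interval "a little" can be as simple as adding a small random spread to a base value, as in the sketch below. The one-eighth spread and all names are assumptions for illustration; the source does not say how much randomisation is applied.

```java
import java.util.Random;

/** Hypothetical sketch: add a small random component to a base interval. */
final class JitterSketch {
    /**
     * Returns baseMs plus a random extra in the range [0, baseMs/8),
     * so that many clients started together drift apart over time.
     */
    static long jittered(final int baseMs, final Random rnd) {
        if (baseMs <= 0) { throw new IllegalArgumentException("base must be strictly positive"); }
        final int spread = Math.max(1, baseMs / 8);
        return (long) baseMs + rnd.nextInt(spread);
    }
}
```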


_handleSysVars_lastFetch

private transient volatile java.lang.Long _handleSysVars_lastFetch
Last time we flushed/fetched variables, private to _handleSysVars(). We do another flush/fetch when we find this to be null (eg initially) or more than VAR_CACHE_HOLD_TIME_MS in the past.

This is volatile so that we do not need to hold a lock to access it.


_handleSysVars_evSave

private transient volatile long _handleSysVars_evSave
Last time we saved (any) event histories, private to _handleSysVars(), initially zero. This is volatile so that we do not need to hold a lock to access it.


_aggressive

private volatile boolean _aggressive
Flag to adjust the aggressiveness of the cache; by default not aggressive. Aggressive cacheing may include read-ahead, and fetching exhibits or at least some leading portion of them to keep the cache full or at least primed with exhibits to improve the user experience.

This can be set false when the system is overloaded to eliminate most effort not strictly necessary.

Volatile to eliminate the need for locking.


genProps

private volatile GenProps genProps
Our record of the current GenProps; never null. Maintained by poll() under the instance lock.

Is volatile so can be accessed without a lock.


LOW_WATER_FRACTION

private static final float LOW_WATER_FRACTION
Fraction of max cache size that is the low-water mark. In the range ]0.0f, 1.0f[ excluding both end points.

We will only do precacheing when the cache size is below the low-water mark.

Don't get this too close to 1 to avoid churning the cache when loading large single blocks of data or upon other minor disturbances.

See Also:
Constant Field Values

CALC_MISSING_EPCM_DURING_PRECACHE

private static final boolean CALC_MISSING_EPCM_DURING_PRECACHE
If true then try to at least partially compute EPCM while precacheing. To do this fully we would need access to Scorers, etc, which may simply not be available in this cacheing layer, so at most we can do a fast approximation for EPCM values completely absent.

See Also:
Constant Field Values

ALLOW_DATA_FETCH_FROM_PEERS

private static final boolean ALLOW_DATA_FETCH_FROM_PEERS
If true, allow us to try fetching exhibit data from peers rather than master.

See Also:
Constant Field Values

ALLOW_TN_FETCH_FROM_PEERS

private static final boolean ALLOW_TN_FETCH_FROM_PEERS
If true, allow us to try fetching thumbnails from peers rather than only the master. Since all mirrors/peers should generally cache all thumbnails indefinitely then this should not incur any significant extra traffic even in poor circumstances.

This may allow lateral spread of a thumbnail once any mirror has managed to create one even when the master is unable (eg due to resource restrictions) to make one.

See Also:
Constant Field Values

altDataSourceRating

private final java.util.Map<java.lang.String,java.lang.Long> altDataSourceRating
Thread-safe Map from mirror ID to strictly-positive rating with "" for master; never null. The rating is a synthetic (milliseconds) time to fetch a big data block (or thumbnails). Lower values represent peers that have the data available quickly more of the time.

All values are strictly positive.

This is periodically purged of stale data (ie inactive peers) to keep it from growing without bound.


PEER_SELECTION_CAUTIOUS

private static final boolean PEER_SELECTION_CAUTIOUS
If true then use a cautious strategy to select a peer to talk to.

See Also:
Constant Field Values

PEER_STATS_UNKNOWN_MS

private static final int PEER_STATS_UNKNOWN_MS
Default rating/time (ms) for "unknown" data source/mirror/peer; strictly positive. The value chosen corresponds to 10000ms RTT and 1kBps throughput, ie much slower than most reasonable peers.


PEER_STATS_TC

private static final int PEER_STATS_TC
Time-constant for updating peer fetch time value; strictly positive. Higher values mean a lower-pass filter (ie more smoothing), and more robustness in the face of temporary glitches.

A value that is a power of two may result in more efficient code.

A value from 8 to 256 is probably reasonable.

See Also:
Constant Field Values
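
One common way to apply a time constant like this is an exponentially-weighted moving average, sketched below. The formula is an assumption (the source does not give the actual update rule), as are the class name and the value 16; the clamp keeps the rating strictly positive as required of altDataSourceRating values.

```java
/** Hypothetical sketch: low-pass filter a peer's fetch-time rating with an integer time constant. */
final class PeerRatingSketch {
    /** Assumed time constant: a power of two in the suggested 8..256 range. */
    static final int PEER_STATS_TC = 16;

    /**
     * Moves the old rating 1/PEER_STATS_TC of the way towards the new sample (ms).
     * Integer division means that very small differences are ignored.
     */
    static long updateRating(final long oldRating, final long sampleMs) {
        final long updated = oldRating + ((sampleMs - oldRating) / PEER_STATS_TC);
        return Math.max(1L, updated); // Keep the rating strictly positive.
    }
}
```

A single slow fetch thus nudges the rating only slightly, while a sustained change in peer behaviour shows up after a few multiples of the time constant.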

MASTER_FAKE_TAG

private static final java.lang.String MASTER_FAKE_TAG
Fake tag we use to indicate a fetch from the master/upstream via the pipeline.

See Also:
Constant Field Values

P2P_RND_FRAC

private final int P2P_RND_FRAC
Fraction of the time to pick a peer completely at random; strictly positive.


P2P_NEXT_BEST_FRAC

private final int P2P_NEXT_BEST_FRAC
Fraction of the time to choose a 2nd-tier peer rather than the best peer; strictly positive.


NEARLY_TOP_FACTOR

private static final int NEARLY_TOP_FACTOR
Factor/multiplier of peers worse than top that will be considered for fetches routinely; strictly positive.

See Also:
Constant Field Values

PREFER_PEERS_TO_MASTER_WHERE_POSSIBLE

private static final boolean PREFER_PEERS_TO_MASTER_WHERE_POSSIBLE
If true, avoid use of master where peers are available.

See Also:
Constant Field Values

_preCacheLock

private final java.util.concurrent.locks.ReentrantLock _preCacheLock
Precache lock to prevent multi-threaded precache runs.


_noMorePrecacheUntil

private transient volatile long _noMorePrecacheUntil
Time before which we will not do more precacheing. Private to _doPreCache().

Initial value of 0 allows precaching to start immediately.


_userRequestedDataFromCache

private transient volatile boolean _userRequestedDataFromCache
Set true when a user requests data from the cache. This is set as a result of user activity; until it is true we will not indulge in any precacheing. This means that we can safely have more than one context set up (for example in a servlet runner) as long as only one is ever actually used.

Accessed without locking; read by _doPreCache() and may be set by a routine that is sure that it has received a user request to fetch exhibit/thumbnail data.

We can reset this if we believe we have finished precaching (or at least a reasonable chunk of precacheing work) for the current exhibit set.


_precacheIterator

private transient java.util.Iterator<Name.ExhibitFull> _precacheIterator
An iterator over a snapshot of all exhibit names. This is initially null, and when null or when exhausted it is reset to be a new snapshot of the exhibit names. This avoids starvation of some exhibits.

We access this only from _doPreCache() which is single-threaded, so this need not be thread safe.

We may order the iteration in some way as to try to precache as efficiently as possible, eg smallest or `best' first, or we might store the exhibits in, for example, a shuffled order.

When we get a name from this iterator we must make sure that it still represents a valid exhibit, since the exhibit might have been deleted, for example.

Accessed only by _doPreCache().
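
The snapshot-and-revalidate behaviour described above can be sketched as follows. Plain String names stand in for Name.ExhibitFull, and the class and method are hypothetical; any ordering or shuffling of the snapshot is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.Set;

/** Hypothetical sketch: iterate over a snapshot of names, restarting when exhausted. */
final class SnapshotIteratorSketch {
    private Iterator<String> iterator; // Initially null; single-threaded use assumed.

    /**
     * Returns the next name from the snapshot that is still a live exhibit,
     * taking a fresh snapshot first if there is none or it is exhausted;
     * returns null if the current snapshot runs out without a valid name.
     */
    String nextValid(final Set<String> liveNames) {
        if (iterator == null || !iterator.hasNext()) {
            iterator = new ArrayList<>(liveNames).iterator(); // Snapshot is immune to later changes.
        }
        while (iterator.hasNext()) {
            final String name = iterator.next();
            if (liveNames.contains(name)) { return name; } // Skip exhibits deleted since the snapshot.
        }
        return null;
    }
}
```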


_precacheExhibitHash

private transient java.lang.Long _precacheExhibitHash
Indicator for which image set we are working on. When we start a new round of precaching we set this to the hash of the current exhibit set.

If we come across an exhibit that we do some precaching work on, we set this to null.

When we are about to start a new round of precaching and discover this is set to the hash of the current exhibit set, we assume that there was no work to be done and we skip precaching. When the exhibit set changes we will then resume.

Accessed under the rwl by _doPreCache().
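
The three rules above amount to a small state machine, sketched below. The class and method names are hypothetical, and locking is omitted (the real field is accessed under the rwl).

```java
/** Hypothetical sketch: skip a precache round if the last round over the same exhibit set did no work. */
final class PrecacheRoundSketch {
    private Long precacheExhibitHash; // Null initially and after any precaching work.

    /** Call when about to start a round; returns false if the round can be skipped. */
    boolean startRound(final long currentSetHash) {
        if (precacheExhibitHash != null && precacheExhibitHash.longValue() == currentSetHash) {
            return false; // Previous round on this exhibit set found nothing to do.
        }
        precacheExhibitHash = Long.valueOf(currentSetHash);
        return true;
    }

    /** Call whenever some precaching work is actually done on an exhibit. */
    void noteWorkDone() { precacheExhibitHash = null; }
}
```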


MAX_dPC_SPIN_TIME_MS

private final int MAX_dPC_SPIN_TIME_MS
Maximum time that _doPreCache() can spend in one go (ms). Designed to be short enough to avoid causing massively irritating interruptions to user interactivity if we lock other activity out for this long, though long enough to be relatively efficient if possible.

Precacheing will not generally interfere with interactive operations so we try to make this time large enough to allow the fetch of a block or three of exhibit data over a slow Net link, allowing for RTT and connection setup and bandwidth, etc.

Something of the order of a few seconds may be good.

We radically reduce this for CPU-sensitive (eg cloud) environments.


MAX_dPC_BACKOFF_TIME

private final int MAX_dPC_BACKOFF_TIME
Maximum time that _doPreCache() will have to sleep (back off) for (ms). This is basically for when some freak event happens beyond _doPreCache()'s reasonable control.

A few minutes is probably reasonable.


upstreamStratum

private volatile Stratum upstreamStratum
Our stratum cached; never null though may be UNKNOWN. We may examine the low-power flag to decide to reduce upstream access.

Is marked volatile for thread-safe lock-free access.

Updates piggybacked on variable set/fetch work.


destroyed

private volatile boolean destroyed
Set true once destroy() is called; never set false again.

Constructor Detail

ExhibitDataSimpleCache

private ExhibitDataSimpleCache(SimpleExhibitPipelineIF dataSource,
                               java.io.File cacheDir,
                               SimpleLoggerIF logger)
                        throws java.io.IOException
Wrap a new cache instance around a data source. This is private so that we can enforce a singleton pattern and avoid multiple simultaneous users of the underlying file-based cache.

We try to load the cache meta-data and exhibit properties from persisted copies. We can survive without the exhibit properties, but if we can't load our meta data the default value we use is read-only so that we don't trust it until it's been checked against disc, presumably in the background.

Throws:
java.io.IOException - if cache directory does not exist and/or cannot be created (the containing directory passed in must always exist)
Method Detail

getLiveCachedExhibitCount

public int getLiveCachedExhibitCount()
Return directly the number of partly- or fully-cached exhibits; never negative. This may be more than the number of exhibits, for example before deleted/renamed exhibits are removed.


cacheFactory

public static ExhibitDataSimpleCache cacheFactory(SimpleExhibitPipelineIF dataSource,
                                                  java.io.File cacheDir,
                                                  SimpleLoggerIF logger)
                                           throws java.io.IOException,
                                                  java.lang.IllegalStateException
Get an instance of this class; may be a singleton. If operating as a singleton then this creates an instance on the first call; all subsequent requests/calls are vetoed (at least in this servlet context and thus namespace) unless the cacheDir matches that for the extant instance, in which case the new dataSource is ignored and the extant cache instance is returned.

If the upstream source is an ExhibitDataFileSource then this instance may assume that data access from the ExhibitDataFileSource is only slightly more expensive than accessing its own local cache (accessing the file source may involve powering-up bulk storage). This will typically be the case on the master for example.

Throws:
java.lang.IllegalStateException - if this is a singleton, and a request to create with a different cache dir to an extant instance is made
java.io.IOException - if cache directory does not exist and/or cannot be created (the containing directory passed in must always exist)
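
The singleton behaviour described above can be sketched roughly as below. This is an illustration under simplifying assumptions: the data source and logger parameters are omitted, directories are compared by canonical path, and the class name is hypothetical.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical sketch: factory that allows only one cache instance, keyed on its directory. */
final class SingletonCacheSketch {
    private static SingletonCacheSketch instance; // Guarded by the class lock.
    private final File cacheDir;

    private SingletonCacheSketch(final File cacheDir) { this.cacheDir = cacheDir; }

    /** Returns the single instance, creating it on first call; vetoes a different cache dir. */
    static synchronized SingletonCacheSketch cacheFactory(final File cacheDir) throws IOException {
        if (cacheDir == null) { throw new IllegalArgumentException("cacheDir must not be null"); }
        final File canonical = cacheDir.getCanonicalFile();
        if (instance == null) {
            instance = new SingletonCacheSketch(canonical);
            return instance;
        }
        if (!instance.cacheDir.equals(canonical)) {
            throw new IllegalStateException("cache already open on " + instance.cacheDir);
        }
        return instance; // Same dir: return the extant instance.
    }
}
```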

rmCache

public static void rmCache(java.io.File cacheDir)
                    throws java.io.IOException
Remove a persistent cache. Pass in the same cacheDir as for a call to cacheFactory(), but this must not be called if a cache instance is using the cache.

Will not remove entries in cacheDir unrelated to the cache.

Do not use this lightly; it may discard gigabytes of useful state.

Throws:
java.io.IOException - in case of I/O error

_getWriteLock

private static void _getWriteLock(java.util.concurrent.locks.ReentrantReadWriteLock rwl,
                                  java.lang.String detail,
                                  SimpleLoggerIF logger)
                           throws java.io.InterruptedIOException
Get the cache write lock, complaining/aborting if we have to wait for a long time.

Parameters:
rwl - cache lock; never null
Throws:
java.io.InterruptedIOException - if the thread is interrupted or locking is aborted

_getReadLock

private static void _getReadLock(java.util.concurrent.locks.ReentrantReadWriteLock rwl,
                                 java.lang.String detail,
                                 SimpleLoggerIF logger)
                          throws java.io.InterruptedIOException
Get a cache read lock, complaining/aborting if we have to wait for a long time. We complain sooner when waiting for a read lock than for the write lock, since read locks are expected to be easier/quicker to obtain.

Parameters:
rwl - cache lock; never null
Throws:
java.io.InterruptedIOException - if the thread is interrupted or locking is aborted
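
Bounded lock acquisition of the kind used by _getWriteLock() and _getReadLock() can be sketched with Lock.tryLock() and a time-out per attempt. The method below is a hypothetical illustration; the attempt count, the time-out and the use of System.err in place of the logger are all assumptions.

```java
import java.io.InterruptedIOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

/** Hypothetical sketch: try for a lock a limited number of times, complaining and then aborting. */
final class BoundedLockSketch {
    /** Assumed maximum number of consecutive attempts; strictly positive. */
    static final int MAX_LOCK_ATTEMPTS = 3;
    /** Assumed maximum wait per attempt (ms). */
    static final long MAX_WAIT_PER_ATTEMPT_MS = 100L;

    /** Returns holding the lock, else throws if interrupted or if all attempts time out. */
    static void lockOrAbort(final Lock lock, final String detail) throws InterruptedIOException {
        try {
            for (int attempt = 1; attempt <= MAX_LOCK_ATTEMPTS; ++attempt) {
                if (lock.tryLock(MAX_WAIT_PER_ATTEMPT_MS, TimeUnit.MILLISECONDS)) { return; }
                System.err.println("WARNING: still waiting for lock (" + detail + "), attempt " + attempt);
            }
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt(); // Preserve the interrupt status for callers.
            throw new InterruptedIOException("interrupted waiting for lock: " + detail);
        }
        throw new InterruptedIOException("gave up waiting for lock: " + detail);
    }
}
```

A caller would pair a successful return with an unlock() in a finally block in the usual way.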

getRawFile

public void getRawFile(java.nio.ByteBuffer buf,
                       Name.ExhibitFull exhibitName,
                       int position,
                       boolean dontCache)
                throws java.io.IOException
Get a chunk of the raw exhibit binary. The call may return less than the buffer capacity, though it will block until it has read at least one byte unless at EOF or for a zero-byte request; this will be clear from the state of the buffer.

The name, start byte offset/position and a buffer to fill are supplied.

Specified by:
getRawFile in interface SimpleExhibitPipelineIF
Parameters:
position - must be non-negative and less than the exhibit size in bytes
dontCache - if true do not cache locally, unless we have lots of free space
buf - the buffer into which to read the data; must be non-null, in put()able state, and with remaining capacity of at least the requested number of bytes
exhibitName - the full name of the exhibit to read from; never null and must be syntactically valid
Throws:
java.io.IOException - for requests that cannot be fulfilled because of I/O restrictions or problems, such as link failure or an upper bound on the length of a request
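
Because a call may fill less than the whole buffer, a sequential reader has to loop and advance its position by however many bytes actually arrived. The sketch below shows one way a caller might do that. RawSource is a hypothetical stand-in for the getRawFile() part of SimpleExhibitPipelineIF (with a String in place of Name.ExhibitFull), and it assumes the number of bytes requested is the buffer's remaining space.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

/** Illustrative only: a caller-side read loop over a getRawFile()-style call. */
final class RawReadSketch {
    /** Hypothetical stand-in for the pipeline interface's getRawFile(). */
    interface RawSource {
        void getRawFile(ByteBuffer buf, String exhibitName, int position, boolean dontCache)
            throws IOException;
    }

    /** Read a whole exhibit of known size in chunks; chunkSize must be positive. */
    static byte[] readAll(RawSource src, String exhibitName, int exhibitSize, int chunkSize)
            throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream(exhibitSize);
        ByteBuffer buf = ByteBuffer.allocate(chunkSize);
        int position = 0;
        while (position < exhibitSize) {   // position must stay below the exhibit size
            buf.clear();                   // put()able state with full remaining capacity
            src.getRawFile(buf, exhibitName, position, false);
            buf.flip();                    // the buffer state shows how much was read
            int n = buf.remaining();
            if (n == 0) { break; }         // unexpected EOF: stop rather than spin
            out.write(buf.array(), buf.arrayOffset(), n);
            position += n;
        }
        return out.toByteArray();
    }
}
```

Sequential reads from offset zero like this are the access pattern the cache is designed to serve, since it only caches a continuous prefix of each exhibit.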

_getMaximumCacheableBytesForOneExhibit

static int _getMaximumCacheableBytesForOneExhibit(GenProps gp)
Computes the maximum number of bytes to cache from (the start of) any one exhibit; strictly positive. Ensures that no one exhibit can monopolise the entire cache, but also that at least a small chunk of the start of any exhibit is logically permitted.

No one exhibit is allowed to grow to more than a few percent of the cache space, though this limit may only be checked at each point that an exhibit might be extended in cache.
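
The limit described above amounts to a clamp: a small percentage of the cache size, bounded below by a minimum prefix and above by what fits in an int. The sketch below illustrates that arithmetic only. The percentage, the minimum prefix and the idea of passing the cache size directly (rather than a GenProps) are assumptions.

```java
/** Illustrative only: the constants below are assumptions, not the real values. */
final class MaxCacheableSketch {
    /** Hypothetical floor so that the start of any exhibit may always be cached. */
    static final int MIN_PREFIX_BYTES = 64 * 1024;
    /** Hypothetical value for "a few percent" of the cache space. */
    static final int MAX_PERCENT_OF_CACHE = 3;

    /** Strictly positive per-exhibit byte limit for a cache of the given size. */
    static int maxCacheableBytesForOneExhibit(long cacheSizeBytes) {
        long share = (Math.max(0L, cacheSizeBytes) / 100) * MAX_PERCENT_OF_CACHE;
        long clamped = Math.max(MIN_PREFIX_BYTES, Math.min(share, Integer.MAX_VALUE));
        return (int) clamped;
    }
}
```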


getStaticAttr

public final ExhibitStaticAttr getStaticAttr(Name.ExhibitFull name)
Get the static attributes for a given exhibit; null if no such exhibit. We get this from our cache of the immutable data rather than going to the source directly. We don't block or hold any locks to fetch this.

Returns null if the named exhibit does not exist.

Specified by:
getStaticAttr in interface SimpleExhibitPipelineIF

getAllExhibitImmutableData

public final AllExhibitImmutableData getAllExhibitImmutableData(long oldStamp)
Gets all static exhibit data if its timestamp is not that specified. If the time specified is negative the object will be returned unconditionally.

If no exhibits are currently installed then a default set with a zero timestamp is returned.

If the caller's copy appears to be up-to-date (eg the oldStamp matches that that would have been returned) null is returned.

We get this from our cache of the immutable data rather than going to the source directly. We don't block or hold any locks to fetch this.

Specified by:
getAllExhibitImmutableData in interface SimpleExhibitPipelineIF
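
Several getters in this class follow the same conditional-fetch contract: a negative stamp means "return unconditionally", a matching stamp means "caller is up to date, return null", and the value is read without locks. The sketch below shows that contract with a hypothetical Stamped holder in place of AllExhibitImmutableData; holding the current value in a single volatile field is an assumption about how lock-free reads might be achieved.

```java
/** Illustrative only: a lock-free conditional getter keyed on a stamp. */
final class ConditionalGetSketch {
    /** Hypothetical immutable value with a timestamp (or hash) identifying its version. */
    static final class Stamped {
        final long stamp;
        final String payload;
        Stamped(long stamp, String payload) { this.stamp = stamp; this.payload = payload; }
    }

    /** Default set with a zero stamp, used until real data is posted. */
    private volatile Stamped current = new Stamped(0L, "");

    /** Returns null if the caller's stamp matches; a negative stamp always gets the value. */
    Stamped getIfChanged(long oldStamp) {
        Stamped snapshot = current;  // One volatile read; no lock taken.
        if (oldStamp >= 0 && oldStamp == snapshot.stamp) { return null; }
        return snapshot;
    }

    /** Atomically replace the cached value, eg from a background fetch. */
    void post(Stamped fresh) { current = fresh; }
}
```

A caller keeps the stamp of the copy it holds and passes it on the next call, replacing its copy only when the result is non-null.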

getAllExhibitProperties

public AllExhibitProperties getAllExhibitProperties(long oldHash)
Gets set of all exhibit properties if its hash is not that specified. If the hash specified is negative the object will be returned unconditionally.

If no exhibits are currently installed a default set with a zero timestamp is returned.

If the caller's copy appears to be up-to-date (eg the oldHash matches that that would have been returned) null is returned.

We get this from our cache rather than going to the source directly. We don't block or hold any locks to fetch this.

Specified by:
getAllExhibitProperties in interface SimpleExhibitPipelineIF

_getAllExhibitProperties

private void _getAllExhibitProperties()
Attempts to get all exhibit properties if our cached copy may be stale. Because fetching/computing this value can take a very long time (upwards of several tens of minutes) we attempt to split the activity into two parts, and have the actual computation/fetch done in the background, and then an atomic post of the results back to the cache proper.

We also adaptively attempt to use an AEP diff fetch if one is available (ie if the underlying connection is a tunnel).


_getAllExhibitProperties_postUpdate

private void _getAllExhibitProperties_postUpdate(AllExhibitProperties new_AEP,
                                                 boolean inCons)
                                          throws java.io.IOException
Accepts a (new) AEP value posted from a background thread. May be called at initialisation to reload cached state, and when a poll of the upstream source returns a new AEP.

Must grab a write lock to (potentially) update/change the cache.

If the static exhibit data was stale then we also clear our in-memory raw-exhibit data cache entirely, to be refilled by slower means.

Parameters:
inCons - if true then this was called from the constructor so we don't save the AEP nor do some other expensive things that may rely on external mechanisms not yet set up
Throws:
java.io.IOException

_save_AEP

private void _save_AEP()
                throws java.io.IOException
Save AllExhibitProperties to disc. We should do this when we receive a new set from upstream so that we can restart with the appropriate set, and periodically to save any cached state that we have accumulated.

In principle this needs a write lock to alter state on disc. In practice at most one thread at once will ever try to call this, and the serialiseToFile() routine attempts to atomically replace the file, and this can take a long time and thus needlessly block cache activity, so we do not take a cache lock here.

Throws:
java.io.IOException

_checkMetaData

private void _checkMetaData()
Initiates, in the background, a check of the in-memory cache meta data against disc. Grabs a write lock on the cache and makes the cache read-only while it works.

Refuses to do the check if it is too soon since the last one or if the cache seems busy.

May postpone a check if the system is short of power or otherwise stressed.


_cleanAndSaveMetaData

private void _cleanAndSaveMetaData(boolean force)
Saves the cache metadata if needed. Grabs a write lock to update disc (and memory) state.

Aims to avoid saving the metaData more than once every METADATA_MIN_SAVE_INTERVAL_MS, though if no save has taken place for a while then the next save will happen on the next call.

If saving the meta-data is taking a long time this aims to postpone the next save at least a reasonable multiple of that to avoid wasting too much system/CPU/disc bandwidth, though we do put a cap on the maximum delay in case of weirdness...

This may also incrementally purge stale meta-data and data just before the save to avoid the need for an extra meta-data save to account for the purge-induced changes themselves.

This may also pick one or more exhibits at random to spot-check for consistency with the master copy (eg looking for data corruption).

Parameters:
force - if true, force an immediate save to disc before return, else run asynchronously (if possible, else discard) and never block
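
The save pacing described above can be read as: wait at least the minimum interval, stretch the wait to a multiple of how long the last save took, and cap the result. The sketch below illustrates that calculation only. The constant values and the multiple are assumptions; only the name METADATA_MIN_SAVE_INTERVAL_MS comes from the text.

```java
/** Illustrative only: all numeric values below are assumptions. */
final class SaveIntervalSketch {
    /** Hypothetical value; the real constant's value is not given here. */
    static final long METADATA_MIN_SAVE_INTERVAL_MS = 60_000L;
    /** Hypothetical "reasonable multiple" of the last save duration. */
    static final int DURATION_MULTIPLE = 10;
    /** Hypothetical cap on the delay "in case of weirdness". */
    static final long MAX_SAVE_INTERVAL_MS = 3_600_000L;

    /** Delay before the next save, given how long the previous save took. */
    static long nextSaveDelayMs(long lastSaveDurationMs) {
        // Bound the input first so the multiplication cannot overflow.
        long d = Math.min(Math.max(0L, lastSaveDurationMs), MAX_SAVE_INTERVAL_MS);
        long scaled = d * DURATION_MULTIPLE;
        return Math.min(MAX_SAVE_INTERVAL_MS, Math.max(METADATA_MIN_SAVE_INTERVAL_MS, scaled));
    }
}
```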

_incrCheckMRUExhibitEntries

private void _incrCheckMRUExhibitEntries(java.util.Set<Name.ExhibitFull> done)
                                  throws java.io.IOException
Incrementally check cached exhibits for integrity. This will attempt to remove any entry it finds that is corrupt.

This concentrates on the most-recently-used cache entries as data corruption in these would probably be the most serious, though may also attempt to systematically scan all cache entries.

This may examine any cached entry.

This may not examine any entry at all if the cache seems to be busy.

This will stop after removing at most one corrupt entry.

Parameters:
done - if not null then this routine adds to this Set the full name of any exhibit that it checks (just before checking) and avoids checking any exhibit already in this Set; this need not be thread-safe for one unshared instance
Throws:
java.io.IOException

_incrPurgeOrphanedExhibits

private void _incrPurgeOrphanedExhibits()
                                 throws java.io.IOException
Do an incremental purge of orphaned cache entries if conditions are right. Tries to grab a write lock to do its work; if it can't get one immediately (ie the cache is busy) then it returns immediately.

Doesn't attempt any purging if there is an empty exhibit set or if there is no cache size currently set since this cache may not even be properly initialised yet...

We clear at most one orphaned entry on each call.

Throws:
java.io.IOException

getGenProps

public GenProps getGenProps(long oldStamp)
Gets the general properties as a GenProps object if its timestamp is not that specified. If the time specified is negative the object will be returned unconditionally.

If no fresh props have yet been fetched then a default set with a zero timestamp is returned.

If the caller's copy appears to be up-to-date (eg the oldStamp matches that which would have been returned) null is returned.

We get this from our cache of the immutable data rather than going to the source directly. So we don't block or grab any lock to fetch this value.

We do not attempt to persist this data since carrying old GenProps values across a restart may be a very poor idea.

Specified by:
getGenProps in interface SimpleExhibitPipelineIF

_getGenProps

private void _getGenProps()
                   throws java.io.IOException
Attempts to get sysprops if our cached copy may be stale. Slightly strange is that we use our cached sys props value to determine the frequency at which we recheck the cache; the default value is short so we should initially poll quickly until we get a kosher value.

A special case here: if we have a GenProps object with a non-zero timestamp (presumably pulled over from a running master) and then we get one with a zero timestamp, we ignore the new, zero, instance since it probably means that the master has just been restarted and has not yet loaded new GenProps.

This does not need to hold any locks since all the values it touches are volatile.

Throws:
java.io.IOException
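
The zero-timestamp special case above reduces to a small decision rule, sketched here on bare timestamps. The method name is hypothetical, and the sketch deliberately says nothing about other cases (such as a fetched stamp older than the current one), which the text does not specify.

```java
/** Illustrative only: the replace-or-keep rule for a freshly fetched GenProps stamp. */
final class GenPropsUpdateSketch {
    /**
     * Returns the stamp of the value to keep: a fetched zero stamp never displaces
     * a non-zero one, since it probably means the master has only just restarted.
     */
    static long chooseStamp(long currentStamp, long fetchedStamp) {
        if (currentStamp != 0L && fetchedStamp == 0L) { return currentStamp; }
        return fetchedStamp;
    }
}
```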

getGenSecProps

public java.util.Properties getGenSecProps(long oldStamp)
Gets the security properties as a Properties object if its timestamp is not that specified. If the time specified is negative the object will be returned unconditionally.

If no props are currently installed/available a default set with a zero timestamp is returned.

If the caller's copy appears to be up-to-date (eg the oldStamp matches that that would have been returned) null is returned.

We get this from our cache of the immutable data rather than going to the source directly. We don't block or grab any locks to fetch this.

We do not attempt to persist this data since carrying old values across a restart may be a very poor idea.

We wrap this as the defaults to a new Properties object to protect our copy against accidental alteration.

Specified by:
getGenSecProps in interface SimpleExhibitPipelineIF
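
Wrapping the cached set "as the defaults to a new Properties object" uses the standard java.util.Properties(Properties defaults) constructor. The short sketch below shows the effect; the class and method names are hypothetical.

```java
import java.util.Properties;

/** Illustrative only: protect a cached Properties by handing out a wrapper. */
final class DefensivePropsSketch {
    /** Returns a new, empty Properties whose lookups fall back to the cached copy. */
    static Properties wrap(Properties cached) {
        return new Properties(cached);
    }
}
```

Writes to the wrapper land in the wrapper's own table, so setProperty() on it leaves the cached copy untouched. Note that the defaults are visible through getProperty() and stringPropertyNames() but not through Hashtable methods such as get() or isEmpty().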

_getGenSecProps

private void _getGenSecProps()
                      throws java.io.IOException
Attempts to get gensecprops if our cached copy may be stale. This does not need any locks since the state is maintained in volatile values.

Throws:
java.io.IOException

getThumbnails

public ExhibitThumbnails getThumbnails(Name.ExhibitFull name,
                                       boolean create)
Gets the thumbnails for an exhibit. A data source is at liberty to refuse to compute thumbnails in which case it may return null, else it returns a non-null value which may include the `could-not-compute' value to indicate that a thumbnail/sample can never be made for this exhibit and no attempt need be made again.

This retains a private in-memory cache of deserialised thumbnails held by SoftReference, and it tries first to recover them from there.

This tries to retrieve thumbnails from the cache, and returns them if they are there.

Else, if create is true, this tries to create the thumbnails, cache them, and return the value. But we won't bother unless the main image is fully loaded.

Note that only the read and write of cache is done under lock; the thumbnail generation is unlocked and concurrency is restricted, if at all, by the handler routine(s).

Partly because this routine is called by our own precache routines, we do not regard reading a thumbnail as proving user access to the cache (exhibit data has to be read for that).

Specified by:
getThumbnails in interface SimpleExhibitPipelineIF
Parameters:
create - if true, and no thumbnail yet exists, try to create one if possible; else if create is false only return an existing one and return null if none is to hand (or possibly allow fetch of pre-built remote one)
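
The in-memory layer described above, deserialised values held by SoftReference in front of a slower store, can be sketched generically as follows. This is an assumption-laden illustration: the map type, the loader callback and the class name are hypothetical, and the real code also involves cache locks and the create flag, which are omitted here.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Illustrative only: a soft-reference memory cache in front of a slower lookup. */
final class SoftCacheSketch<K, V> {
    private final ConcurrentHashMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    /** Try memory first; otherwise use the slow loader (eg disc cache) and remember the result. */
    V get(K key, Function<K, V> slowLoader) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();  // null if never cached or since collected
        if (value != null) { return value; }
        value = slowLoader.apply(key);
        if (value != null) { map.put(key, new SoftReference<>(value)); }
        return value;  // may be null, meaning "not available right now"
    }
}
```

Soft references let the JVM reclaim the deserialised values under memory pressure, so the memory layer speeds up pages that reference many thumbnails without pinning heap.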

_getThumbnails

private ExhibitThumbnails _getThumbnails(Name.ExhibitFull name,
                                         boolean create,
                                         boolean allowSyncFetch)
Get the thumbnails for an exhibit; null if not available. A data source is at liberty to refuse to compute thumbnails in which case it may return null, else it returns a non-null value which may include the `could-not-compute' value to indicate that a thumbnail/sample cannot be made for this exhibit and no attempt need be made in future.

This retains a private in-memory cache of deserialised thumbnails held by SoftReference, and it tries first to recover them from there. This is very important to fast delivery of thumbnails for building pages referencing many thumbnails.

This tries to retrieve thumbnails from the cache, and returns them if they are there.

Else, if create is true, this tries to create the thumbnails, cache them, and return the value. But we won't bother unless the main image is fully loaded.

Note that only the read and write of our tn cache is done under lock; the thumbnail generation is unlocked and concurrency is restricted, if at all, by the handler routine(s).

Partly because this routine is called by our own precache routines, we do not regard reading a thumbnail as proving user access to the cache (exhibit data has to be read for that).

Parameters:
create - if true, and no thumbnail yet exists, try to create one if possible; else if create is false only return an existing one and return null if none is to hand (or possibly allow fetch of pre-built remote one)
allowSyncFetch - if true then allow a synchronous fetch from upstream
Returns:
null if no such exhibit or a transient problem, NO_THUMBNAILS if this exhibit type can never have thumbnails or it appears impossible for this particular exhibit, or else a non-null non-NO_THUMBNAILS value

_instanceLifems

private long _instanceLifems()
How long this instance has been alive in milliseconds. Will be non-negative when system clock is monotonic.


upstreamSourceIsLocal

private boolean upstreamSourceIsLocal()
Returns true iff upstream is local disc so some operations should be cheap without cacheing. Can avoid significant redundant effort.


_getThumbnailsFromPeer

private ExhibitThumbnails _getThumbnailsFromPeer(Name.ExhibitFull name)
                                          throws java.io.IOException
Attempt to fetch the specified thumbnails from any peer; may be null if currently unavailable. This never attempts to force creation of a thumbnail remotely, but rather tries to fetch an already-present value.

This updates the P2P stats as if for an exhibit-data-block transfer, which is reasonable since this only attempts a fetch of data, never a create which may take significant remote CPU time. Note that only an (IO)Exception (not having null returned) is 'failure'.

This may potentially 'loop' between peers consuming resources uselessly unless some other mechanism is used to prevent such behaviour. However, only one outgoing P2P thumbnail request is allowed at once here, which should limit any such problem and resources consumed by it.

Parameters:
name - full exhibit name; never null
Throws:
java.io.IOException

_asyncTNFetch

private void _asyncTNFetch(Name.ExhibitFull exhibitName)
Attempt to asynchronously fetch/create a thumbnail that we have failed to return to the user. We do this to attempt to fetch soon any thumbnail that was recently requested but was not immediately available, on the grounds that it may be needed again soon.

This only uses a strictly limited number of threads, but avoids a wait for general precacheing, which may never happen.

Parameters:
exhibitName - full exhibit name; non-null valid exhibit name

_handleSysVars

private void _handleSysVars(boolean force)
                     throws java.io.IOException
Handle (update, sync, persist) system variables as required. Recompute any local values generated by the cache itself, flush any outbound values, and retrieve any upstream values periodically.

We recompute local variables when we would be prepared to flush/fetch system variables.

Timing is handled with volatile values, so we do not need to take out any other locks while working.

We rely on the locking within varMgr to ensure consistency, including during the save. Bad things may happen if trying to remove a cache while we are trying to save event histories!

We also use this to recompute any vote/correlation factors and update our notion of this instance's stratum.

Parameters:
force - if true, force an immediate complete save of state upstream, and to disc (if upstream source not already local)
Throws:
java.io.IOException

setVariable

public void setVariable(SimpleVariableValue newValue)
                 throws java.io.IOException
Set variable. Set local cached value immediately; store global values to periodically propagate upstream to master but show last global values obtained from master on periodic poll.

Specified by:
setVariable in interface BasicVarMgrInterface
Throws:
java.lang.IllegalArgumentException - on attempt to: set variable with value of wrong type or incompatible definition, set non-existent or read-only variable (or these can be ignored)
java.io.IOException - in case of I/O difficulty

setVariables

public int setVariables(SimpleVariableValue[] newValues)
                 throws java.io.IOException
Update a number of variables at once for efficiency. Is passed an array of SimpleVariableValues and behaves as if it operates on all of them by calling setVariable() for each item in the array.

This implementation "fails fast" on the first error.

This implementation never throws an IOException.

Specified by:
setVariables in interface BasicVarMgrInterface
Returns:
the number of variable values set (the length of the array); never negative, never more than the number passed in
Throws:
java.lang.IllegalArgumentException - on attempt to: set variable with value of wrong type or incompatible definition, set non-existent or read-only variable (or these can be ignored)
java.io.IOException

getVariable

public SimpleVariableValue getVariable(SimpleVariableDefinition var)
Get a single variable value; returns null if no such value or wrong type. Always get from local cache.

This implementation never throws an IOException.

Specified by:
getVariable in interface BasicVarMgrInterface
Parameters:
var - definition of variable to fetch; never null

getVariables

public SimpleVariableValue[] getVariables(long changedSince)
Get immutable Set of variable values altered on or after a given time, or all for -1. Always get from local cache (the variable cache being periodically updated from the master).

This may be slow if there are many live variables.

This implementation never throws an IOException.

Specified by:
getVariables in interface BasicVarMgrInterface

getEventValue

public EventVariableValue getEventValue(SimpleVariableDefinition def,
                                        EventPeriod intervalSelector,
                                        boolean current)
Get the current partial, or previous full, event set at the specified interval; never returns null. This is a simplified interface to return either the current event set that is being collected, or the previous completed set.

The current set is the most timely, but may not contain enough data to be meaningful if the new interval has just started.

The previous set is complete and thus most likely to have enough samples to be useful, but is not completely current.

If the requested event set is not (immediately) available, an empty synthetic one is created and returned. Thus, with this interface, it is not possible to distinguish between there being no events in the given interval or simply no data.

TODO: This attempts to limit the amount of time that may be spent blocking, eg due to upstream I/O issues, but its ability to do so may depend on availability of threads, etc.

Specified by:
getEventValue in interface BasicVarMgrInterface
Parameters:
def - event definition (must be for an event); never null
intervalSelector - never null
current - if true the current event set is returned, else the previous complete set is returned
Returns:
requested event set; never null
Throws:
java.lang.IllegalArgumentException - if the request arguments are invalid

getEventValues

public EventVariableValue[] getEventValues(SimpleVariableDefinition def,
                                           EventPeriod intervalSelector,
                                           long intervalNumber,
                                           java.util.BitSet whichValues)
Get the specified event sets for the specified intervals; never null. This allows retrieval of zero or more event sets for the specified interval size.

Requests for more than SystemVariables.EVENT_SAMPLES_RETAINED in the past (or for the future!) cannot be satisfied and data will not be returned for them.

Usually not more than SystemVariables.EVENT_SAMPLES_RETAINED samples will be returned in response to any one request as a safety measure.

(An implementation that is not an end-point may go upstream to fetch missing values and cache them to satisfy future requests.)

Specified by:
getEventValues in interface BasicVarMgrInterface
Parameters:
def - event definition (must be for an event); never null
intervalSelector - never null
intervalNumber - a time (as from System.currentTimeMillis()) which identifies the first interval for which data is potentially required; if too far in the past or future then possibly no data will be available, zero is used to select the "all" bucket
whichValues - each true bit represents a slot for which data is required, bit 0 indicating data from the slot within which firstIntervalTime is located, bit 1 the previous slot, etc
Returns:
as many of the requested values as available, at least long enough to return all the available values, with [0] corresponding to bit 0 in the BitSet; may contain nulls or be zero-length but is never null
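
The whichValues argument maps bit positions to interval slots counting backwards from the slot containing the given time. The sketch below illustrates that mapping only, under the assumption that slots are fixed-length and aligned to multiples of the interval length; how EventPeriod really defines slot boundaries is not stated here, and the names are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Illustrative only: which slot each requested bit refers to. */
final class EventSlotSketch {
    /** Marker for array positions whose bit was not set in the request. */
    static final long NOT_REQUESTED = Long.MIN_VALUE;

    /**
     * Returns the start time of the slot for each set bit: [0] is the slot containing
     * intervalTimeMs, [1] the slot before it, and so on. Assumes aligned fixed-length slots.
     */
    static long[] slotStarts(long intervalTimeMs, long intervalLengthMs, BitSet whichValues) {
        if (intervalLengthMs <= 0) {
            throw new IllegalArgumentException("interval length must be positive");
        }
        long base = Math.floorDiv(intervalTimeMs, intervalLengthMs) * intervalLengthMs;
        long[] starts = new long[whichValues.length()];
        Arrays.fill(starts, NOT_REQUESTED);
        for (int bit = whichValues.nextSetBit(0); bit >= 0; bit = whichValues.nextSetBit(bit + 1)) {
            starts[bit] = base - bit * intervalLengthMs;
        }
        return starts;
    }
}
```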

syncVariables

public void syncVariables(boolean force)
                   throws java.io.IOException
Synchronise with upstream values. Pushes updated values upstream to the source, calls sync on the source with the same "force" argument, and then retrieves changed values from upstream.

Holds no externally-visible locks, but if called by multiple threads this will serialise the calls.

Specified by:
syncVariables in interface SimpleVariablePipelineIF
Parameters:
force - if true, this will force a full sync on the read side by using getVariables(-1) rather than attempting to choose a nearer timestamp for efficiency; the implementation is at liberty to use getVariables(-1) at any time whatever the argument value, and almost certainly should use it on the first call
Throws:
java.io.IOException - if one is received from upstream

setAggressive

public void setAggressive(boolean isAggressive)
Set the aggressiveness of the cache; by default not aggressive.


getProperties

public java.util.Properties getProperties(SimpleExhibitPipelineIF.PropsKey key,
                                          long versionID)
                                   throws java.io.IOException
Get requested Properties selected by key and versionID. Fetches a Properties set unconditionally (versionID == -1) else if the versionID presented is not current.

Specified by:
getProperties in interface SimpleExhibitPipelineIF
Parameters:
key - selector (with possible embedded sub-key) for desired properties set; never null
versionID - if -1 then map is always returned if available, else must be non-negative and null is returned if the versionID presented matches that of the current version (ie if the caller has presumably got the up-to-date version); may be a timestamp or a hash or other value, and by convention is zero only for an empty properties set
Returns:
null, or Properties map guaranteed to contain only String keys and values
Throws:
java.io.IOException

poll

public void poll(GenProps _gp)
          throws java.io.IOException
Poll periodically (of the order of a second) to do cache maintenance. We keep the poll up to date to keep the work out of a servlet response; the data retrieval might easily take a long time...

This routine takes care of calling the upstream poll().

We have to be careful about not restricting servlet callers' concurrency here... We try not to do fetches from the back-end, which may be very slow, with the instance lock, which would shut out all foreground users needlessly.

We ignore the caller's GenProps and fetch and cache our own...

Specified by:
poll in interface SimpleExhibitPipelineIF
Throws:
java.io.IOException - in case of difficulty, but even if a sub-ordinate call throws IOException then poll() call should continue to do as much of its remaining work as reasonably possible

_doPreCache

private void _doPreCache(GenProps gp)
Routine to do incremental pre-cacheing. Exits as soon as it detects other threads queueing for the cache lock.

This relies on the following items being fetched and entirely maintained by other activity/methods, probably under poll():

This will only do any data prefetching while the cache is below the low-water mark, and will limit the amount of client and master resource used.

We won't actually start any precacheing until we see some evidence of user/upstream activity that might eventually benefit from it.

This will also only precache while the "aggressive" flag is set, and this should be set true only when the system is not busy.

This may also involve precomputation/preloading of optional data, though this should not be relied on in lieu of other methods to keep this fresh, so we may have work to do even if there is no space for prefetching.

This also incrementally checks the cache for consistency with the current exhibit properties, ie timestamp, size, hashes.

When running as a cloud instance, with bandwidth and CPU metered/charged, we may resist precaching exhibit data and all but the most popular thumbnails.

This does not hold a cache lock for its duration, but does hold a private lock to protect its internal state because it must not be multi-threaded; any attempt to run this in a second thread is quietly vetoed.
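
The "quietly vetoed" second thread can be implemented with a private lock that is tried rather than waited for. The sketch below is one way to do that, assuming a ReentrantLock; the class and method names are hypothetical and the real routine's lock type is not stated.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative only: allow at most one thread at a time to run a maintenance task. */
final class SingleRunnerSketch {
    /** Private lock guarding internal state; unrelated to any cache read/write lock. */
    private final ReentrantLock privateLock = new ReentrantLock();

    /** Runs the work unless another thread is already running it; returns true if it ran. */
    boolean runIfIdle(Runnable work) {
        if (!privateLock.tryLock()) { return false; }  // Quiet veto: do not block, do not complain.
        try {
            work.run();
            return true;
        } finally {
            privateLock.unlock();
        }
    }
}
```

Because tryLock() never waits, a poll() thread that finds precaching already in progress simply moves on to its other work.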


_doCacheDataValidityTest

private final void _doCacheDataValidityTest(AllExhibitProperties aep,
                                            ExhibitStaticAttr esa)
                                     throws java.io.IOException
Partly check the cache data (including metadata, tns, etc) for validity. This picks one or more aspects (at random) of the currently cached data for the specified exhibit to check for validity.

Checks include length, timestamp, and hashes dependent on the data available.

This is designed to complete reasonably quickly in most cases, to perform an incremental check, removing or in some cases repairing damaged/corrupt/invalid data.

If this finds the cache entry to be broken somehow then this routine may delete the cache entry entirely, or repair the data, or prune back to some valid prefix of the data held, or just remove the corrupt underlying data to leave the metadata to be fixed at a later date.

This will grab locks only as it needs them in order to be as unintrusive as possible.

This is not expected to be needed very often, but is mainly designed to avoid silent disc corruption, and to provide for automatic repair.

Parameters:
aep - the current exhibit properties; never null
esa - the exhibit whose cache data is to be verified; never null
Throws:
java.io.IOException

_removeCorruptData

private void _removeCorruptData(ExhibitDataSimpleCache.CachedFile cf)
                         throws java.io.IOException
Remove cached exhibit data identified as corrupt. If possible then this clears up the metadata and raw data, etc, but if not then this just removes the suspect raw data, thumbnails, etc, from the disc cache.

The metadata will have to get fixed later.

Since this will have to grab a cache write lock and then the metadata lock if it is to remove the corrupt data, the caller must not have a read lock in place for example.

Parameters:
cf - exhibit cache metadata; never null
Throws:
java.io.IOException

_updateOneExhibit

private boolean _updateOneExhibit(ExhibitStaticAttr esa,
                                  GenProps gp,
                                  AllExhibitProperties aep,
                                  boolean forceDataFetch)
                           throws java.io.IOException
Update one exhibit incrementally during precaching. Must not be called within a lock; will grab any locks it needs.

Parameters:
forceDataFetch - if true, strongly encourage data fetch, ie extension of exhibit data if at all possible
Returns:
true iff this exhibit is not completely cached, ie may benefit from another call to this routine immediately
Throws:
java.io.IOException

_getExhibitDataFromUpstreamToPrecache

private void _getExhibitDataFromUpstreamToPrecache(ExhibitStaticAttr esa,
                                                   AllExhibitImmutableData aeid,
                                                   GenProps gp,
                                                   long start,
                                                   int len,
                                                   boolean forceIt)
                                            throws java.io.IOException
Try to extend cached data for the specified exhibit. By default we try to get this from the master, but we may try to fetch it from a peer/mirror instead (P2P) to reduce the load on the master.

Also, this may be used to attempt data/error recovery if data cannot be fetched from the master for some reason.

Parameters:
start - byte offset in exhibit to start reading/fetching for; non-negative
len - number of bytes to read; strictly positive.
forceIt - if true then we try very hard to get the data from a peer, for example to help with master/server error recovery
Throws:
java.io.IOException

_pickPeer

private java.lang.String _pickPeer(java.util.Set<java.lang.String> activeMirrors)
Pick one of the supplied peers to attempt to fetch exhibit data from; never null. Usually this returns the "best" (fastest) peer, but sometimes this will return an apparently sub-optimal peer so as to: In particular we are allowing for peer/server load and inter-peer network conditions to change continually.

Occasionally this will purge the cached peer stats of anything not in the argument set, with constant amortised cost per call.

Parameters:
activeMirrors - set of mirror tags (and possible MASTER_FAKE_TAG ("") for the master); never null, never empty
Returns:
selected peer, MASTER_FAKE_TAG for master; never null

_pickPeer

private java.lang.String _pickPeer()
Pick a peer to attempt to fetch exhibit data from; never null. Usually this returns the "best" peer; sometimes this will return an apparently sub-optimal peer so as to: In particular we are allowing for peer/server load and inter-peer network conditions to change continually.

Occasionally this will purge the cached peer stats of anything not in the argument set.

The master should by preference avoid fetching data from a peer to avoid contaminating the master copy with bad data from any peer. However, if it has a mirror tag, then it may fetch data P2P.

Returns:
selected peer, MASTER_FAKE_TAG for master (ie no peer); never null

_updatePeerStats

private void _updatePeerStats(java.lang.String peer,
                              boolean fetchSuccessful,
                              long timeTaken)
Update data-transfer stats for the given peer. A failed fetch is treated as a very slow access, so that a failing peer compares numerically unfavourably with reliable peers.
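
The peer selection and stats update described above can be sketched together: keep a smoothed access time per peer, record a failure as a very slow access, usually pick the fastest peer, and occasionally pick another. Everything numeric here is an assumption (the penalty, the smoothing weights, the exploration probability), as is the choice to treat peers with no stats as attractive; the reasons the real code sometimes picks a sub-optimal peer are not reproduced in this documentation.

```java
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: smoothed per-peer timings with occasional random choice. */
final class PeerPickSketch {
    /** Hypothetical: a failed fetch counts as at least this slow. */
    static final long FAILURE_PENALTY_MS = 60_000L;
    /** Hypothetical: how often to ignore the stats and pick at random. */
    static final double EXPLORE_PROBABILITY = 0.1;

    private final Map<String, Double> avgMs = new ConcurrentHashMap<>();
    private final Random rnd;

    PeerPickSketch(Random rnd) { this.rnd = rnd; }

    /** Fold one transfer result into the peer's smoothed access time. */
    void updatePeerStats(String peer, boolean fetchSuccessful, long timeTakenMs) {
        double sample = fetchSuccessful ? timeTakenMs : Math.max(timeTakenMs, FAILURE_PENALTY_MS);
        avgMs.merge(peer, sample, (old, fresh) -> 0.75 * old + 0.25 * fresh);
    }

    /** Pick a peer from a non-empty set: usually the fastest known, sometimes a random one. */
    String pickPeer(Set<String> activeMirrors) {
        if (activeMirrors.isEmpty()) { throw new IllegalArgumentException("no peers"); }
        if (rnd.nextDouble() < EXPLORE_PROBABILITY) {
            int skip = rnd.nextInt(activeMirrors.size());
            for (String p : activeMirrors) { if (skip-- == 0) { return p; } }
        }
        String best = null;
        double bestMs = Double.MAX_VALUE;
        for (String p : activeMirrors) {
            double ms = avgMs.getOrDefault(p, 0.0);  // Unknown peers look fast, so they get tried.
            if (best == null || ms < bestMs) { best = p; bestMs = ms; }
        }
        return best;
    }
}
```

Occasional random picks let the stats track changing peer load and network conditions, since a peer that is never tried can never be seen to improve.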


isPopularDownload

private boolean isPopularDownload(ExhibitStaticAttr esa)
                           throws java.io.IOException
Returns true if the exhibit is in the top-N (global) recent downloads, with at least 2 downloads. The "more than one download" filter is to trim off a noisy "long tail" (eg on a quiet day).

This uses a combination of hits from the previous and current days.

Parameters:
esa - the exhibit details; never null
Returns:
true if this exhibit is a "popular" download.
Throws:
java.io.IOException

getStratum

public Stratum getStratum()
Return cached stratum; never null. Never throws an exception.

Specified by:
getStratum in interface SimpleExhibitPipelineIF

destroy

public void destroy()
Shut down the data pipeline. Flush state, variables and logs upstream and to disc as appropriate, and then make sure that upstream of us is destroyed.

Specified by:
destroy in interface SimpleExhibitPipelineIF

DHD Multimedia Gallery V1.57.21

Copyright (c) 1996-2011, Damon Hart-Davis. All rights reserved.