org.hd.d.pg2k.svrCore
Class ExhibitPropsComputableMutableVoteCache

java.lang.Object
  extended by org.hd.d.pg2k.svrCore.ExhibitPropsComputableMutableVoteCache
All Implemented Interfaces:
java.io.ObjectInputValidation, java.io.Serializable, ExhibitPropsComputableMutableVoteCacheIF

public final class ExhibitPropsComputableMutableVoteCache
extends java.lang.Object
implements ExhibitPropsComputableMutableVoteCacheIF, java.io.Serializable, java.io.ObjectInputValidation

Class to cache vote computations and correlated values. In particular this aims to avoid repeated expensive recomputation of underlying vote values so that global correlations can be computed efficiently.

See Also:
Serialized Form

Nested Class Summary
private static class ExhibitPropsComputableMutableVoteCache.Accum
          The class in which we accumulate stats while recomputing correlations.
static class ExhibitPropsComputableMutableVoteCache.CorrType
          Correlation types.
 
Field Summary
private static int _computeMaxConcurrentUpdateThreads
          Maximum number of threads to allow in update(); strictly positive.
private  java.util.concurrent.locks.ReentrantLock _uCoreLock
          Private lock to prevent more than one thread at once expending effort in the core part of update().
private  java.util.concurrent.Semaphore _uStartSem
          Counting semaphore to limit update() first-phase concurrency; never null.
private static boolean EAGER_VOTE_LOAD
          If true then try to load/compute early the scores of voted-for-exhibits.
private static char KEY_SEPARATOR_CHAR
          Key-separator character; not a valid exhibit-name character.
private static java.lang.String MARKER_KEY
          Special marker key value to indicate when we last computed correlations.
private static ExhibitPropsComputableMutable.Factor[] NO_CORRELATES
          Immutable empty correlates list.
private static long serialVersionUID
          Unique Serialisation class ID generated by http://random.hd.org/.
private static float SIGNIFICANCE_CONF_THRESHOLD
          Significance confidence threshold; non-negative.
private static int SIGNIFICANCE_COUNT_THRESHOLD
          Significance count threshold for non-sparse correlations; non-negative.
private  int UPDATE_TIME_LIMIT_MS
          Time limit for update() in ms; strictly positive.
private static boolean USE_ALL_BUCKET
          If true then try to use the "all" votes buckets rather than summing the other buckets.
private  java.util.Hashtable<java.lang.String,Tuple.Pair<java.lang.Long,ExhibitPropsComputableMutable.Factor>> voteCorrCacheMap
          Map from vote/correlation key to the computed Factor and the period for which it was computed; never null after construction/deserialisation.
 
Fields inherited from interface org.hd.d.pg2k.svrCore.ExhibitPropsComputableMutableVoteCacheIF
TRIVIAL
 
Constructor Summary
  ExhibitPropsComputableMutableVoteCache()
          Construct default empty cache.
private ExhibitPropsComputableMutableVoteCache(java.util.Hashtable<java.lang.String,Tuple.Pair<java.lang.Long,ExhibitPropsComputableMutable.Factor>> map)
          Reconstruct object, eg during deserialisation.
 
Method Summary
private static void _clearStaleState(java.util.Hashtable<java.lang.String,Tuple.Pair<java.lang.Long,ExhibitPropsComputableMutable.Factor>> map)
          Remove stale state from the map passed in.
private  void _findCorrelates(ExhibitStaticAttr esa, AllExhibitProperties aep, long currentPeriod, java.util.List<ExhibitPropsComputableMutable.Factor> result, boolean force)
          Append to the result List any factors we find for this exhibit.
 ExhibitPropsComputableMutable.Factor calcVoteFactor(java.lang.String exhibitName, AllExhibitProperties aep, BasicVarMgrInterface vars)
          Compute the vote or retrieve from cache; never null.
 ExhibitPropsComputableMutable.Factor[] getCorrelates(ExhibitStaticAttr esa, AllExhibitProperties aep, BasicVarMgrInterface vars, boolean force)
          Get correlates for specified exhibit; never null but result may be empty.
 java.lang.Boolean isCategoryGood(java.lang.String categoryDir, AllExhibitProperties aep, BasicVarMgrInterface vars, boolean force)
          Find out if a category is rated "good"/popular or not.
protected  java.lang.Object readResolve()
          Deserialise: use constructor for validation, defensive copying, etc.
 void update(AllExhibitProperties aep, BasicVarMgrInterface vars, boolean noTimeLimit)
          Bring correlations data up to date.
 void validateObject()
          Validates the object.
protected  java.lang.Object writeReplace()
          Serialise: strip out stale information before serialising.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

KEY_SEPARATOR_CHAR

private static final char KEY_SEPARATOR_CHAR
Key-separator character; not a valid exhibit-name character.

See Also:
Constant Field Values

voteCorrCacheMap

private final java.util.Hashtable<java.lang.String,Tuple.Pair<java.lang.Long,ExhibitPropsComputableMutable.Factor>> voteCorrCacheMap
Map from vote/correlation key to the computed Factor and the period for which it was computed; never null after construction/deserialisation. Since all our calculations depend on VLONG events, cache entries need only be invalidated when the period number changes.

One valid key format is a full, valid exhibit name.

Other keys are of the form of a CorrType name followed by a colon followed by the rest of the key.

This state can be serialised; it is also recomputed on demand.

We use a Hashtable to be thread-safe.
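
The key scheme and period-based invalidation described above can be sketched roughly as follows. This is an illustration only, not the class's actual code: the class name VoteCacheKeySketch, the CorrType constant names, the use of Map.Entry in place of Tuple.Pair and of Double in place of Factor are all assumptions made for the example.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Hashtable;
import java.util.Map;

/** Illustrative only: a period-stamped cache with plain and typed keys. */
final class VoteCacheKeySketch {
    /** Hypothetical correlation types; the real CorrType constants are not listed here. */
    enum CorrType { CATEGORY, AUTHOR }

    /** Separator assumed from the "CorrType name, colon, rest of key" description. */
    static final char SEP = ':';

    /** Hashtable makes individual get/put calls thread-safe; value is (period, factor). */
    private final Hashtable<String, Map.Entry<Long, Double>> map = new Hashtable<>();

    /** Build a typed key such as "CATEGORY:places". */
    static String typedKey(CorrType type, String rest) {
        return type.name() + SEP + rest;
    }

    /** Store a factor stamped with the period in which it was computed. */
    void put(String key, long period, double factor) {
        map.put(key, new SimpleImmutableEntry<>(period, factor));
    }

    /** Return the cached factor only if computed in the current period, else null. */
    Double getIfCurrent(String key, long currentPeriod) {
        Map.Entry<Long, Double> e = map.get(key);
        return (e != null && e.getKey() == currentPeriod) ? e.getValue() : null;
    }
}
```

Because a full exhibit name cannot contain the separator character, a plain exhibit-name key and a typed key can share one map without colliding.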


NO_CORRELATES

private static final ExhibitPropsComputableMutable.Factor[] NO_CORRELATES
Immutable empty correlates list.


SIGNIFICANCE_CONF_THRESHOLD

private static final float SIGNIFICANCE_CONF_THRESHOLD
Significance confidence threshold; non-negative. Computed correlations of confidence less than +/- this are discarded to save space.

A value between 0.001 (0.1%) and 0.1 (10%) is probably reasonable.

See Also:
Constant Field Values

SIGNIFICANCE_COUNT_THRESHOLD

private static final int SIGNIFICANCE_COUNT_THRESHOLD
Significance count threshold for non-sparse correlations; non-negative. Computed non-sparse correlations with counts less than this are discarded to save space and time.

A value between 2 and 10 is probably reasonable; an odd value avoids recording a "neutral" score at minimum count.

See Also:
Constant Field Values
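
As a rough sketch of how the two significance thresholds might be applied together when deciding whether to keep a computed correlation: the threshold values, the class name and the isWorthKeeping() helper are assumptions for illustration and are not taken from the real class.

```java
/** Illustrative only: discard correlations that are too weak or too thinly supported. */
final class SignificanceFilterSketch {
    /** Assumed value within the suggested 0.001 to 0.1 range. */
    static final float CONF_THRESHOLD = 0.01f;

    /** Assumed value within the suggested 2 to 10 range; odd, so votes cannot tie at minimum count. */
    static final int COUNT_THRESHOLD = 3;

    /**
     * @param confidence signed correlation/confidence
     * @param count      number of votes contributing to the correlation
     * @param sparse     true if this is a sparse correlation, exempt from the count test
     * @return true if the correlation is significant enough to store
     */
    static boolean isWorthKeeping(float confidence, int count, boolean sparse) {
        if (Math.abs(confidence) < CONF_THRESHOLD) { return false; } // too close to zero
        if (!sparse && count < COUNT_THRESHOLD) { return false; }    // too few samples
        return true;
    }
}
```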

UPDATE_TIME_LIMIT_MS

private final int UPDATE_TIME_LIMIT_MS
Time limit for update() in ms; strictly positive. A value of the order of a second or so is probably right. This allows some significant work to get done and cached, but should not block anything (such as page generation) for longer than a visitor would find bearable.

This should still let us get a reasonable amount of work done so that we can rebuild the data incrementally if necessary.
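
One way to picture incremental work under a time limit is the sketch below, which checks a deadline between work items and gives up with an IOException once it has passed. The class name, the processAll() helper and its parameters are hypothetical; the real update() logic is not shown in this document.

```java
import java.io.IOException;
import java.util.Iterator;

/** Illustrative only: do as much work as the time limit allows. */
final class TimeLimitedWorkSketch {
    /**
     * Run each work item in turn, checking the deadline before starting each one.
     *
     * @param items       units of work, each assumed to cache its own result
     * @param limitMs     time limit in ms; strictly positive
     * @param noTimeLimit if true, ignore the deadline and run to completion
     * @return number of items completed
     * @throws IOException if the deadline passed before all items were done
     */
    static int processAll(Iterator<Runnable> items, long limitMs, boolean noTimeLimit)
            throws IOException {
        final long deadline = System.nanoTime() + limitMs * 1_000_000L;
        int done = 0;
        while (items.hasNext()) {
            // Compare by subtraction so that nanoTime() wrap-around is handled.
            if (!noTimeLimit && System.nanoTime() - deadline > 0) {
                throw new IOException("time limit exceeded after " + done + " items");
            }
            items.next().run();
            ++done;
        }
        return done;
    }
}
```

Items completed before the deadline keep their cached results, so a later call can pick up where this one stopped.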


_uCoreLock

private final java.util.concurrent.locks.ReentrantLock _uCoreLock
Private lock to prevent more than one thread at once expending effort in the core part of update().


_computeMaxConcurrentUpdateThreads

private static final int _computeMaxConcurrentUpdateThreads
Maximum number of threads to allow in update(); strictly positive. It is probably worth allowing a few threads into update() to parallelise the probably-I/O-bound sections at the start and to run CPU-bound computations concurrently on multi-CPU systems, but too many concurrent threads may just waste resources and overload I/O subsystems (eg back-end HTTP connections).


_uStartSem

private final java.util.concurrent.Semaphore _uStartSem
Counting semaphore to limit update() first-phase concurrency; never null.
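
A counting semaphore used as a concurrency cap might look roughly like this. The sketch assumes that surplus threads are turned away rather than made to wait, which this document does not state; the class name, the limit of 3 and the runFirstPhase() helper are likewise assumptions.

```java
import java.util.concurrent.Semaphore;

/** Illustrative only: cap the number of threads inside a first-phase section. */
final class UpdateGateSketch {
    /** Assumed limit; the real _computeMaxConcurrentUpdateThreads value is not given. */
    static final int MAX_CONCURRENT = 3;

    private final Semaphore startSem = new Semaphore(MAX_CONCURRENT);

    /**
     * Run the task if a permit is free.
     *
     * @return false, without running the task, if the limit is already reached
     */
    boolean runFirstPhase(Runnable task) {
        if (!startSem.tryAcquire()) { return false; }
        try {
            task.run();
            return true;
        } finally {
            startSem.release(); // always hand the permit back, even on exception
        }
    }
}
```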


EAGER_VOTE_LOAD

private static final boolean EAGER_VOTE_LOAD
If true then try to load/compute early the scores of voted-for-exhibits. An eager setup attempts to start its computations while part of its data (one of the "all" buckets) is still being fetched.

An eager setup should be able to compute the correlations more quickly, but at increased risk of using partly or wholly stale values, and thus of producing a poorer result.

See Also:
Constant Field Values

USE_ALL_BUCKET

private static final boolean USE_ALL_BUCKET
If true then try to use the "all" votes buckets rather than summing the other buckets. Using the "all" bucket may be less accurate/comprehensive, and may delay initialisation if it has to be fetched from upstream.

See Also:
Constant Field Values

MARKER_KEY

private static final java.lang.String MARKER_KEY
Special marker key value to indicate when we last computed correlations. Legal key, but never conflicts with any normal usage.


serialVersionUID

private static final long serialVersionUID
Unique Serialisation class ID generated by http://random.hd.org/.

See Also:
Constant Field Values
Constructor Detail

ExhibitPropsComputableMutableVoteCache

public ExhibitPropsComputableMutableVoteCache()
Construct default empty cache.


ExhibitPropsComputableMutableVoteCache

private ExhibitPropsComputableMutableVoteCache(java.util.Hashtable<java.lang.String,Tuple.Pair<java.lang.Long,ExhibitPropsComputableMutable.Factor>> map)
                                        throws java.io.InvalidObjectException
Reconstruct object, eg during deserialisation. We take a defensive copy of mutable state, etc.

We also take the opportunity to strip out any invalid/stale state.

We hold a lock on the map parameter while copying its state.

Throws:
java.io.InvalidObjectException
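
The defensive copy taken while holding a lock on the map parameter can be sketched as below. Hashtable's own methods synchronise on the instance, so holding its monitor during the copy keeps other threads from changing it part-way through. The class name and the Long value type are stand-ins chosen for the example.

```java
import java.io.InvalidObjectException;
import java.util.Hashtable;

/** Illustrative only: copy mutable state under the source map's own lock. */
final class DefensiveCopySketch {
    private final Hashtable<String, Long> state;

    DefensiveCopySketch(Hashtable<String, Long> map) throws InvalidObjectException {
        if (map == null) { throw new InvalidObjectException("null map"); }
        // Hashtable methods lock on the instance, so this blocks concurrent writers.
        synchronized (map) {
            state = new Hashtable<>(map);
        }
    }

    /** Number of entries captured at construction time. */
    int size() { return state.size(); }
}
```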
Method Detail

calcVoteFactor

public final ExhibitPropsComputableMutable.Factor calcVoteFactor(java.lang.String exhibitName,
                                                                 AllExhibitProperties aep,
                                                                 BasicVarMgrInterface vars)
                                                          throws java.io.IOException
Compute the vote or retrieve from cache; never null. A computed/cached value is valid until the VLONG period changes.

The returned factor's goodness can range from -1 to +1, and confidence from 0 to +1; any scaling required will have to be applied elsewhere.

Specified by:
calcVoteFactor in interface ExhibitPropsComputableMutableVoteCacheIF
Parameters:
exhibitName - full, valid exhibit name; never null
aep - never null
vars - never null
Returns:
vote Factor for the given exhibit; never null.
Throws:
java.io.IOException

getCorrelates

public ExhibitPropsComputableMutable.Factor[] getCorrelates(ExhibitStaticAttr esa,
                                                            AllExhibitProperties aep,
                                                            BasicVarMgrInterface vars,
                                                            boolean force)
                                                     throws java.io.IOException
Get correlates for specified exhibit; never null but result may be empty. This is based on votes of other "related" exhibits.

The goodness of each Factor is either -1 or +1, with the correlation/confidence ranging between 0 and 1.

This will return values computed up to one period ago if need be, to help avoid a sudden splurge of CPU effort as we tick from one period to the next.

No particular ordering of the results is guaranteed, but there will be no duplicates and no nulls.

Specified by:
getCorrelates in interface ExhibitPropsComputableMutableVoteCacheIF
Parameters:
force - if true, force complete computation if need be, else just return what we have in cache; complete recomputation may be expensive but its results should last a long time, and we will abort with an IOException if we cannot complete the recomputation in a reasonable time
Throws:
java.io.IOException - cannot extract required correlates

isCategoryGood

public java.lang.Boolean isCategoryGood(java.lang.String categoryDir,
                                        AllExhibitProperties aep,
                                        BasicVarMgrInterface vars,
                                        boolean force)
                                 throws java.io.IOException
Find out if a category is rated "good"/popular or not. Returns TRUE if rated good, FALSE if bad, null if not significantly either or if not known.

Specified by:
isCategoryGood in interface ExhibitPropsComputableMutableVoteCacheIF
Parameters:
categoryDir - the initial directory component of an extant exhibit
force - if true may force (expensive) computation to give a more accurate answer, else may return a more approximate or stale answer, or none at all (null)
Throws:
java.io.IOException
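
Since the result is a three-state Boolean, callers need to test for null before unboxing. A minimal sketch of producing and consuming such a value follows; the rate() derivation from a goodness and confidence pair is purely hypothetical and is not claimed to be how isCategoryGood() works.

```java
/** Illustrative only: handling a TRUE / FALSE / null rating safely. */
final class CategoryRatingSketch {
    /**
     * Hypothetical rating rule: null unless the confidence reaches the
     * given minimum and the goodness is clearly positive or negative.
     */
    static Boolean rate(float goodness, float confidence, float minConfidence) {
        if (confidence < minConfidence || goodness == 0f) { return null; }
        return goodness > 0f ? Boolean.TRUE : Boolean.FALSE;
    }

    /** Consume the rating without risking a NullPointerException on unboxing. */
    static String describe(Boolean good) {
        if (good == null) { return "unknown"; }
        return good ? "good" : "bad";
    }
}
```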

_findCorrelates

private void _findCorrelates(ExhibitStaticAttr esa,
                             AllExhibitProperties aep,
                             long currentPeriod,
                             java.util.List<ExhibitPropsComputableMutable.Factor> result,
                             boolean force)
Append to the result List any factors we find for this exhibit. This ignores any keys that cannot be constructed.

Parameters:
esa - the exhibit name/attr; never null
currentPeriod - current VLONG period; strictly positive
result - found correlates are appended to this value
force - if false then accept any extant value; if true then only accept values up to one period old
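
The effect of the force flag on which cached values are acceptable can be written as a small predicate. This is one reading of the parameter description above; the class and method names are hypothetical.

```java
/** Illustrative only: which cached correlates are fresh enough to use. */
final class CorrelateFreshnessSketch {
    /**
     * @param computedPeriod period in which the cached value was computed
     * @param currentPeriod  current VLONG period; strictly positive
     * @param force          if false accept any extant value;
     *                       if true accept only values up to one period old
     */
    static boolean acceptable(long computedPeriod, long currentPeriod, boolean force) {
        if (!force) { return true; }
        return computedPeriod >= currentPeriod - 1;
    }
}
```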

update

public void update(AllExhibitProperties aep,
                   BasicVarMgrInterface vars,
                   boolean noTimeLimit)
            throws java.io.IOException
Bring correlations data up to date. This may also perform housekeeping such as clearing stale state.

This may be a relatively expensive call, though if values are up-to-date then this should be quite cheap.

This is thread-safe and will use multiple caller threads to help fetch vote data and concurrently recompute exhibit scores, though second and subsequent concurrent calling threads will return immediately upon reaching the core correlation computations in order to prevent wasted duplicate effort.

If this routine cannot get values it needs then it throws an IOException.

No work will be done for an empty aep.

This will abort with an IOException if it is taking too long (of the order of many seconds) or otherwise cannot ensure that the correlations data is up-to-date. Thus if this returns without throwing an IOException then the correlations data can be assumed to be up-to-date.

Specified by:
update in interface ExhibitPropsComputableMutableVoteCacheIF
Parameters:
aep - current exhibit properties; never null
vars - handle on system variables; never null
noTimeLimit - if true, this runs until complete if possible
Throws:
java.io.IOException - if the correlations data cannot be guaranteed to be up-to-date and/or a problem was found fetching required data and/or the routine was taking too long to complete the calculations
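
The behaviour in which second and subsequent threads return immediately at the core computation is commonly achieved with ReentrantLock.tryLock(). The sketch below shows that pattern under that assumption; the class name and the updateCore() helper are hypothetical and the real update() may differ.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative only: let one thread do the core work while others skip it. */
final class CoreUpdateSketch {
    private final ReentrantLock coreLock = new ReentrantLock();

    /**
     * Run the core work unless another thread is already doing so.
     *
     * @return false, without running the work, if another thread holds the lock
     */
    boolean updateCore(Runnable work) {
        if (!coreLock.tryLock()) { return false; } // someone else is already busy
        try {
            work.run();
            return true;
        } finally {
            coreLock.unlock();
        }
    }
}
```

Note that the lock is reentrant, so the same thread calling updateCore() again from inside the work would be let through; only other threads are turned away.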

writeReplace

protected java.lang.Object writeReplace()
Serialise: strip out stale information before serialising.


_clearStaleState

private static void _clearStaleState(java.util.Hashtable<java.lang.String,Tuple.Pair<java.lang.Long,ExhibitPropsComputableMutable.Factor>> map)
Remove stale state from the map passed in. A lock is held on the map while this is done to prevent it being altered unexpectedly.

We only remove things more than one period old so that we can gently recompute values in the background rather than being bounced into it as we roll into a new period.

Parameters:
map - non-null map.
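
A minimal sketch of pruning entries more than one period old while holding the map's lock is given below. The real method takes only the map; passing the current period as a parameter, storing a bare Long period as the value, and the class name are simplifications made for the example.

```java
import java.util.Hashtable;

/** Illustrative only: drop cache entries more than one period old. */
final class StaleStateSketch {
    /**
     * @param map           key to period-of-computation; non-null
     * @param currentPeriod current period (an assumed extra parameter)
     */
    static void clearStale(Hashtable<String, Long> map, long currentPeriod) {
        // Hold the map's monitor so no other thread can alter it mid-sweep.
        synchronized (map) {
            // Keep the current and the previous period; remove anything older.
            map.values().removeIf(period -> period < currentPeriod - 1);
        }
    }
}
```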

readResolve

protected java.lang.Object readResolve()
                                throws java.io.ObjectStreamException
Deserialise: use constructor for validation, defensive copying, etc.

Throws:
java.io.ObjectStreamException
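
Routing deserialisation through a constructor via readResolve() is a standard Java serialisation idiom, sketched below. The class name, the Long value type, the use of negative values as a stand-in for "stale" entries and the roundTrip() helper are all assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.Hashtable;

/** Illustrative only: rebuild a deserialised object through its validating constructor. */
final class ResolvingCacheSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final Hashtable<String, Long> map;

    ResolvingCacheSketch() { this(new Hashtable<>()); }

    /** Defensive copy, then strip "stale" entries (negative values in this sketch). */
    private ResolvingCacheSketch(Hashtable<String, Long> m) {
        synchronized (m) { map = new Hashtable<>(m); }
        map.values().removeIf(v -> v < 0);
    }

    void put(String key, long value) { map.put(key, value); }

    int size() { return map.size(); }

    /** Called by the serialisation machinery; the returned object replaces the raw one. */
    protected Object readResolve() throws ObjectStreamException {
        if (map == null) { throw new InvalidObjectException("missing map"); }
        return new ResolvingCacheSketch(map);
    }

    /** Helper for demonstration: serialise to memory and read back. */
    static ResolvingCacheSketch roundTrip(ResolvingCacheSketch in)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(in);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ResolvingCacheSketch) ois.readObject();
        }
    }
}
```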

validateObject

public void validateObject()
                    throws java.io.InvalidObjectException
Validates the object.

Specified by:
validateObject in interface java.io.ObjectInputValidation
Throws:
java.io.InvalidObjectException - If the object cannot validate itself

DHD Multimedia Gallery V1.50.55

Copyright (c) 1996-2008, Damon Hart-Davis. All rights reserved.