java.lang.Object
org.hd.d.pg2k.ai.scorer.AbstractScorerCache
org.hd.d.pg2k.ai.scorer.ScorerCacheImpl
public final class ScorerCacheImpl
Simple/default implementation to compute (and cache) the score and confidence for exhibits. Note: since the result of this computation may be used in computing the ExhibitPropsComputableMutable (EPCM) value for an exhibit, any implementation of this must avoid forcing recalculation of any EPCM value (other than via the static calcVoteFactor() method) to avoid the danger of infinite recursion. Ideally the value computed will not depend on any EPCM value.
| Field Summary | |
|---|---|
private MemoryTools.SimpleLRUMap<Tuple.Triple<java.lang.String,java.lang.Integer,java.lang.Boolean>,Tuple.Pair<java.lang.Long,java.util.Map<java.lang.String,ScoreAndConf>>> |
_eCS_cache
Private cache for extractCalibrationSet(); never null. |
private java.util.SortedMap<java.lang.Long,java.util.Map<java.lang.String,ScoreAndConf>> |
_votedForExhibits
Get all voted-for exhibits (by short name) and their converted vote factors; never null. |
private static int |
APPROX_POLL_CYCLE_MS
Expected approximate poll cycle time (ms); strictly positive. |
static java.util.Map<java.lang.String,ScorerIF> |
fixedSimpleScorers
The (small) immutable current fixed set of parameterless base Scorer instances. |
private java.util.concurrent.ArrayBlockingQueue<java.lang.String> |
inboundScorerQueue
Thread-safe, bounded size, queue for inbound Scorers; never null. |
private static int |
MAX_ECS_ITERATIONS
Iterations allowed in extractCalibrationSet() to try to find an optimal result; strictly positive. |
private static int |
MAX_SCORER_SAMPLE_SIZE_CCS
Max Scorer sample size in _computeCalibrationSetError() to prevent overflow in calculations. |
private static int |
MAX_SCORER_SCORES_RETAINED
Maximum number of Scorers for which we retain/cache scores; strictly positive. |
private static long |
MIN_VOTE_SAMPLE_NEAR_TERM_TIME_MS
The minimum recent period for which we collect all available votes, in ms; strictly positive. |
(package private) static int |
MIN_VOTED_FOR_SET_SIZE
Minimum desirable size of the set of voted-for exhibits for calibration to be fully confident of the result; strictly positive. |
private ScorerCreator.ScorerWork |
scorerWork
Scorer work object; never null. |
private static boolean |
STATISTICAL_SCORER_SAMPLING_CCS
If true then select a partly-random selection of Scorers to measure error in _computeCalibrationSetError(). |
| Fields inherited from class org.hd.d.pg2k.ai.scorer.AbstractScorerCache |
|---|
dataSource, log, population |
| Fields inherited from interface org.hd.d.pg2k.ai.scorer.ScorerCacheIF |
|---|
TRIVIAL |
| Constructor Summary | |
|---|---|
ScorerCacheImpl(SimpleExhibitPipelineIF dataSource,
SimpleLoggerIF log)
Construct an instance attached to the supplied data source. |
|
| Method Summary | |
|---|---|
private int |
_computeCalibrationSetError(java.util.Map<java.lang.String,ScoreAndConf> putativeCalibrationExhibits,
java.util.List<java.lang.String> breedingSet,
java.util.concurrent.ConcurrentMap<java.lang.String,java.util.concurrent.ConcurrentMap<java.lang.String,ScoreAndConf>> workspace)
Compute 'error' in calibration set for the given population and cache contents; non-negative. |
boolean |
canAcceptMoreExternalScorers()
Returns true if this cache can definitely accept (many) more externally-supplied Scorer values. |
ScoreAndConf |
computeCompositeScoreAndConfidence(java.lang.String exhibitName,
boolean allowStale)
Computes a weighted composite score [-1,+1] and confidence [0,+1] for the specified exhibit with the best available scorers/parameters; never null but may be (0,0). |
private Tuple.Pair<java.lang.Long,java.util.Set<java.lang.String>> |
computeRawVotedForExhibits(SimpleVariablePipelineIF vars)
Compute voted-for-exhibit set; never null but may be empty. |
ScoreAndConf |
computeScorerWeighting(ScorerIF scorer,
boolean allowStale,
java.lang.String source)
ScoreAndConfidence for the given scorer itself over all exhibit types; never null but may be (0,0) where the scorer is unknown or untested. |
private void |
evolve(boolean minimal)
Do some 'evolution' in a background thread if possible. |
java.util.Map<java.lang.String,ScoreAndConf> |
extractCalibrationSet(java.lang.String baseName,
int maxSamples,
java.lang.Boolean difficult,
boolean allowStale)
Compute exemplar exhibit sub-set to calibrate Scorers with given base name against; never null but may be empty. |
ScorerIF |
getBaseScorerByName(java.lang.String baseName)
Get base non-parameterised Scorer by name; null if no such base Scorer supported. |
java.util.Set<java.lang.String> |
getBaseScorersWithoutParameters()
Base set of available Scorers' names (no parameters); never null but may be empty. |
private java.util.Map<java.lang.String,ScoreAndConf> |
getVotedForExhibitsAndVoteFactors(SimpleExhibitPipelineIF dataSource,
boolean allowStale)
Get latest (immutable) Map of all voted-for exhibits (by short name) to vote score; never null. |
boolean |
hasQueuedExternalScorer()
Returns true if at least one external Scorer is queued waiting to be processed. |
boolean |
offerExternalScorer(java.lang.String externalScorerNameAndParameters)
Attempt to queue an externally-supplied Scorer value; returns true if accepted. |
void |
poll()
Called/polled periodically (of the order of 1Hz) to do donkey-work and background tasks. |
| Methods inherited from class org.hd.d.pg2k.ai.scorer.AbstractScorerCache |
|---|
computeScorerWeighting, computeUnweightedScoreAndConfidence, destroy, getCurrentScorersWithParameters, getDataSource, getPopulation, getScorerInstance, size |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Methods inherited from interface org.hd.d.pg2k.ai.scorer.ScorerCacheIF |
|---|
computeScorerWeighting, computeUnweightedScoreAndConfidence, getCurrentScorersWithParameters, getScorerInstance, size |
| Field Detail |
|---|
private static final int MAX_SCORER_SCORES_RETAINED
In practice 2^14--2^16 seems to be a reasonable limit.
private static final long MIN_VOTE_SAMPLE_NEAR_TERM_TIME_MS
private final java.util.SortedMap<java.lang.Long,java.util.Map<java.lang.String,ScoreAndConf>> _votedForExhibits
Private to getVotedForExhibits().
We index from the time at which we gathered the set so that we can regenerate or trim this list if the aep or event slot changes.
Because the (first) map is sorted it is quick to find the latest/oldest entries.
Usually we only keep one value (the latest one), and any older values are purged only AFTER a new value has been inserted, ie this map never becomes transiently empty nor more stale than necessary once non-empty.
We can keep older information around to allow some (stale) results to be computed even if we cannot immediately fetch new information.
The outer and inner maps are thread-safe.
Do not assume that other activity in this collection can be locked out by holding a lock on this object. DO NOT hold a lock on this object!
The values must be immutable to enable them to be shared safely without copying.
TODO: upgrade to ConcurrentNavigableMap under JDK 1.6 to improve concurrency.
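The insert-then-purge discipline described above for _votedForExhibits can be illustrated with a minimal sketch. This is not the actual implementation: the class name TimedSnapshotStore and its methods are hypothetical, and it assumes a ConcurrentSkipListMap (a ConcurrentNavigableMap, as the TODO suggests) rather than whatever map the real field uses.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical sketch of a time-indexed snapshot store; not the real _votedForExhibits code. */
final class TimedSnapshotStore<V> {
    /** Snapshots keyed by gathering time (ms); thread-safe without any external lock. */
    private final ConcurrentNavigableMap<Long, Map<String, V>> snapshots =
        new ConcurrentSkipListMap<Long, Map<String, V>>();

    /** Inserts an immutable copy of the new snapshot FIRST, then purges strictly older ones. */
    void publish(final long gatheredAtMs, final Map<String, V> values) {
        final Map<String, V> immutable =
            Collections.unmodifiableMap(new HashMap<String, V>(values));
        snapshots.put(gatheredAtMs, immutable);
        // headMap() is a live view of entries with keys below gatheredAtMs;
        // clearing it removes only older snapshots, so a non-empty store never becomes empty.
        snapshots.headMap(gatheredAtMs).clear();
    }

    /** Latest snapshot (sorted map makes this cheap), or null if nothing published yet. */
    Map<String, V> latest() {
        final Map.Entry<Long, Map<String, V>> e = snapshots.lastEntry();
        return (e == null) ? null : e.getValue();
    }

    /** Number of snapshots currently retained. */
    int size() { return snapshots.size(); }
}
```

Because each stored value is an unmodifiable copy, readers can share the result of latest() without copying or locking, which matches the immutability requirement stated above.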
public static final java.util.Map<java.lang.String,ScorerIF> fixedSimpleScorers
These include the test/bogus Scorers as well as the live Scorers.
static final int MIN_VOTED_FOR_SET_SIZE
private static final int MAX_ECS_ITERATIONS
private static final boolean STATISTICAL_SCORER_SAMPLING_CCS
private static final int MAX_SCORER_SAMPLE_SIZE_CCS
private final MemoryTools.SimpleLRUMap<Tuple.Triple<java.lang.String,java.lang.Integer,java.lang.Boolean>,Tuple.Pair<java.lang.Long,java.util.Map<java.lang.String,ScoreAndConf>>> _eCS_cache
Map from (effectively) basename/size/errors to a calibration set and its time of computation.
The size parameter is the largest (non-negative) power of two no higher than the request size (its rounded-down log to the base 2), thus mapping arbitrary requested maximum sizes to a smaller number of powers of two. (Note that this mapping may be further conflated/condensed to improve the cache effectiveness.)
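As an illustration of the size bucketing described above, the following sketch maps a requested maximum size to the largest power of two no higher than it. The class and method names are hypothetical and not part of ScorerCacheImpl; the real code may conflate sizes further, as noted.

```java
/** Hypothetical helper, not part of ScorerCacheImpl: maps a requested size to a power-of-two bucket. */
final class CalibrationSizeBucket {
    private CalibrationSizeBucket() { }

    /** Largest power of two no higher than maxSamples; maxSamples must be strictly positive. */
    static int bucketSize(final int maxSamples) {
        if (maxSamples <= 0) {
            throw new IllegalArgumentException("maxSamples must be strictly positive");
        }
        return Integer.highestOneBit(maxSamples);
    }

    /** Rounded-down log to the base 2 of maxSamples, ie the exponent of bucketSize(). */
    static int bucketExponent(final int maxSamples) {
        return Integer.numberOfTrailingZeros(bucketSize(maxSamples));
    }

    public static void main(final String[] args) {
        for (final int n : new int[] {1, 2, 3, 100, 1024, 1500}) {
            System.out.println(n + " -> " + bucketSize(n) + " (2^" + bucketExponent(n) + ")");
        }
    }
}
```

With this mapping, requests for 100 and 127 samples share the bucket 64, so a cache keyed on the bucket sees far fewer distinct keys than one keyed on the raw request size.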
private final ScorerCreator.ScorerWork scorerWork
private static final int APPROX_POLL_CYCLE_MS
private final java.util.concurrent.ArrayBlockingQueue<java.lang.String> inboundScorerQueue
| Constructor Detail |
|---|
public ScorerCacheImpl(SimpleExhibitPipelineIF dataSource,
SimpleLoggerIF log)
dataSource - full-access live data source; must not be null
| Method Detail |
|---|
public ScoreAndConf computeCompositeScoreAndConfidence(java.lang.String exhibitName,
boolean allowStale)
throws java.io.IOException
This explicitly limits the time it spends on computation, more so if the system is short of power.
This tries to force the evolution of new Scorers in the background if it finds none at all useable during filtering and computation. (This means that the Scorer system can bootstrap itself just by calls to this routine, but will force some possibly-wasteful work for exhibit types that cannot be scored.)
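The documentation does not state how the composite is weighted, so the following is only a sketch of one plausible form: a confidence-weighted mean computed under a time budget. Everything here (the class CompositeScoreSketch, the Result type, the use of double values and the budgetNanos parameter) is an assumption for illustration and not the actual algorithm.

```java
/** Hypothetical sketch: time-limited, confidence-weighted composite of (score, confidence) pairs. */
final class CompositeScoreSketch {
    private CompositeScoreSketch() { }

    /** Hypothetical value type; score in [-1,+1], confidence in [0,+1]. */
    static final class Result {
        final double score;
        final double confidence;
        Result(final double score, final double confidence) {
            this.score = score;
            this.confidence = confidence;
        }
    }

    private static double clamp(final double v, final double lo, final double hi) {
        return Math.max(lo, Math.min(hi, v));
    }

    /**
     * Combines pairs {score, confidence} until the time budget is used up.
     * Returns (0,0) if no pair with non-zero confidence was processed in time.
     */
    static Result combine(final double[][] pairs, final long budgetNanos) {
        final long start = System.nanoTime();
        double weightedSum = 0;
        double totalConf = 0;
        int used = 0;
        for (final double[] p : pairs) {
            // Stop early rather than overrun the budget; remaining inputs are ignored.
            if (System.nanoTime() - start >= budgetNanos) { break; }
            final double score = clamp(p[0], -1, 1);
            final double conf = clamp(p[1], 0, 1);
            weightedSum += score * conf;
            totalConf += conf;
            ++used;
        }
        if (totalConf <= 0) { return new Result(0, 0); }
        // Score is the confidence-weighted mean; confidence is the mean confidence of inputs used.
        return new Result(weightedSum / totalConf, totalConf / used);
    }
}
```

A shorter budget, for example when the system is short of power, simply means fewer inputs contribute to the result.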
computeCompositeScoreAndConfidence in interface ScorerCacheIF
exhibitName - valid full exhibit name
allowStale - if true then allow a stale value from cache,
else throw an exception if nothing is currently available
java.io.IOException
public ScoreAndConf computeScorerWeighting(ScorerIF scorer,
boolean allowStale,
java.lang.String source)
throws java.io.IOException
We attempt to ignore small numbers of errors for robustness.
computeScorerWeighting in interface ScorerCacheIF
allowStale - if true then allow a stale or low-confidence value from cache,
else throw an exception if nothing is currently available
and we cannot quickly compute enough points to increase our confidence
scorer - instance of the Scorer; never null
source - the name of the mechanism used to generate this Scorer value,
or null if none
java.io.IOException
private Tuple.Pair<java.lang.Long,java.util.Set<java.lang.String>> computeRawVotedForExhibits(SimpleVariablePipelineIF vars)
throws java.io.IOException
This does NOT attempt to filter the result in any way, ie by which exhibits are still live or even syntactically-valid.
java.io.IOException - if the data is not available or there is a timeout
private java.util.Map<java.lang.String,ScoreAndConf> getVotedForExhibitsAndVoteFactors(SimpleExhibitPipelineIF dataSource,
boolean allowStale)
throws java.io.IOException
Uses a cache to avoid wasted computation from one event tick to the next, and to allow this mechanism to work even in the face of brief outages in data connectivity (eg when trying to fetch new values of the vote events).
We convert the vote Factor to a ScoreAndConf value so that more computations can be done as integer rather than floating, especially important for Niagara.
dataSource - event source for fetching event set if necessary; never null
allowStale - if true, allow older data to be returned to save time,
or if new event data should be used but cannot be fetched
(ie allow working from stale event data for speed and/or robustness)
java.io.IOException
public java.util.Set<java.lang.String> getBaseScorersWithoutParameters()
getBaseScorersWithoutParameters in interface ScorerCacheIF
public ScorerIF getBaseScorerByName(java.lang.String baseName)
getBaseScorerByName in interface ScorerCacheIF
baseName - base (no parameters) name of Scorer; must not be null
private int _computeCalibrationSetError(java.util.Map<java.lang.String,ScoreAndConf> putativeCalibrationExhibits,
java.util.List<java.lang.String> breedingSet,
java.util.concurrent.ConcurrentMap<java.lang.String,java.util.concurrent.ConcurrentMap<java.lang.String,ScoreAndConf>> workspace)
throws java.io.IOException
This implicitly assumes that using the population ranking/comparator is the same as calibrating against the input calibration data (though that may prove false for stale cache values, etc). This avoids potentially-expensive recomputation against all input data.
Note that nothing is ever removed from the workspace (if supplied), making it (thread-)safe to share even between concurrent calls, but meaning also that its data may be bulky and get stale if preserved longer than the call to extract one calibration set. Use of this retained workspace should however save a great deal of time.
putativeCalibrationExhibits - map of proposed exhibits for calibration set
to their real original input data (eg votes);
never null nor empty
breedingSet - set of 'best' Scorers (of a single base type) or null for a generic set;
never empty
workspace - if non-null then this is some workspace opaque to the caller
that can be used to save some results from one call to the next
(or even between concurrent calls, though this may not be efficient)
while evaluating the error for a single calibration set;
the caller must not alter this object
java.io.IOException
public java.util.Map<java.lang.String,ScoreAndConf> extractCalibrationSet(java.lang.String baseName,
int maxSamples,
java.lang.Boolean difficult,
boolean allowStale)
throws java.io.IOException
This may return an empty result if there is not enough data to calibrate against, eg human-case votes.
It may be possible to tune or pre-test new Scorers against the results of this as a fast filter.
If a base name is specified that is invalid, it is treated as if null.
We may cache generated results for a short while since this extraction may be expensive, though the cache time is of the order of minutes rather than the day or so elsewhere.
extractCalibrationSet in interface ScorerCacheIF
baseName - base name of Scorer to extract calibration set for,
or null for a generic all-Scorers calibration set
maxSamples - the maximum number of samples to return; strictly positive
difficult - if TRUE then return the difficult cases that we do not predict well,
if FALSE then return the easy cases that we predict well,
else return a mixture of good, bad, and other random cases
allowStale - if true then allow slightly older data for speed and robustness
java.io.IOException
private void evolve(boolean minimal)
This routine always returns quickly.
This should never throw any exceptions.
minimal - if true, run as short a time as possible to make some progress
public final void poll()
throws java.io.IOException
This launches its work in a low-priority daemon thread, and limits the number of such concurrent work threads globally by silently discarding any excess, ie this call always returns quickly.
This routine should not be called (often) if the host system is under heavy load.
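The launch-or-discard behaviour described for poll() can be sketched with a counting semaphore. This is an assumption about one way to get that behaviour, not the actual implementation; BackgroundWorkLimiter, tryLaunch() and availableSlots() are hypothetical names.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical sketch: caps concurrent background work threads, discarding any excess. */
final class BackgroundWorkLimiter {
    private final Semaphore slots;

    BackgroundWorkLimiter(final int maxConcurrent) {
        slots = new Semaphore(maxConcurrent);
    }

    /**
     * Tries to run the work in a low-priority daemon thread.
     * Never blocks: returns false (work silently discarded) if all slots are busy.
     */
    boolean tryLaunch(final Runnable work) {
        if (!slots.tryAcquire()) { return false; }
        final Thread t = new Thread(() -> {
            try { work.run(); }
            finally { slots.release(); } // Always free the slot, even if the work fails.
        }, "background-work");
        t.setDaemon(true);
        t.setPriority(Thread.MIN_PRIORITY);
        boolean started = false;
        try {
            t.start();
            started = true;
        } finally {
            if (!started) { slots.release(); } // Do not leak a slot if the thread could not start.
        }
        return true;
    }

    /** Number of work slots currently free. */
    int availableSlots() { return slots.availablePermits(); }
}
```

A caller polled at roughly 1Hz would invoke tryLaunch() on each tick and ignore a false return, so the call stays quick regardless of how long the background work takes.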
java.io.IOException
public boolean offerExternalScorer(java.lang.String externalScorerNameAndParameters)
Note that this accepts (and discards) anything which is definitely never usable, eg syntactically invalid or not for a valid Scorer base name, so as to help conserve resources such as memory and queue space.
This routine should always be relatively quick/efficient; much more work is performed when an item is dequeued, presumably when we know that we have ample resources available.
This may try to start some work immediately if the queue is now getting full.
Should not throw any exceptions; invalid Scorer values are simply discarded with true returned.
offerExternalScorer in interface ScorerCacheIF
offerExternalScorer in class AbstractScorerCache
public boolean canAcceptMoreExternalScorers()
True if our internal queue is at most about half full, but not any sort of guarantee that we can actually accept another Scorer.
In power-conserving mode only welcome new entries if the queue is currently empty, ie if we seem to be keeping up with any data coming in. This is important because this call can be used to decide when to hand out new work to clients, from which the results will arrive only much later.
canAcceptMoreExternalScorers in interface ScorerCacheIF
canAcceptMoreExternalScorers in class AbstractScorerCache
public boolean hasQueuedExternalScorer()
hasQueuedExternalScorer in interface ScorerCacheIF
hasQueuedExternalScorer in class AbstractScorerCache
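The documented contracts of offerExternalScorer(), canAcceptMoreExternalScorers() and hasQueuedExternalScorer() can be sketched around a bounded ArrayBlockingQueue, as used for inboundScorerQueue. The class below is hypothetical: its validation check is a trivial stand-in for the real syntax and base-name checks, and the power-conserving flag is passed in explicitly as an assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;

/** Hypothetical sketch of the inbound external Scorer queue; not the real ScorerCacheImpl code. */
final class InboundScorerQueueSketch {
    private final ArrayBlockingQueue<String> queue;
    private final int capacity;

    InboundScorerQueueSketch(final int capacity) {
        this.capacity = capacity;
        this.queue = new ArrayBlockingQueue<String>(capacity);
    }

    /** Trivial stand-in for real validation of a Scorer name and its parameters. */
    private static boolean isPlausible(final String nameAndParams) {
        return (nameAndParams != null) && !nameAndParams.trim().isEmpty();
    }

    /** Never blocks or throws; definitely-unusable values are "accepted" (true) and dropped. */
    boolean offer(final String nameAndParams) {
        if (!isPlausible(nameAndParams)) { return true; }
        return queue.offer(nameAndParams); // False if the bounded queue is full.
    }

    /** At most about half full normally; only when empty if conserving power. */
    boolean canAcceptMore(final boolean conservingPower) {
        if (conservingPower) { return queue.isEmpty(); }
        return queue.size() <= capacity / 2;
    }

    /** True if at least one external Scorer is waiting to be processed. */
    boolean hasQueued() { return !queue.isEmpty(); }

    /** Dequeues the next waiting value, or returns null if none. */
    String pollNext() { return queue.poll(); }
}
```

Returning true for discarded invalid input keeps the cheap check on the producer side and avoids spending queue space on values that could never be used.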
DHD Multimedia Gallery V1.50.55