org.hd.d.pg2k.svrCore
Class ExhibitName

java.lang.Object
  extended by org.hd.d.pg2k.svrCore.ExhibitName

public final class ExhibitName
extends java.lang.Object

Utility routines to validate/parse an exhibit name as a String/CharSequence. An exhibit name is a (relative) path name in a filesystem representation, and a (relative) URL in a Web presentation.

This is based on the assumption that we use the String name everywhere as a sort of universal currency, rather than pre-parsing everything as in the old Attributes.ItemName.

The syntax of the file name is:

  1. Printable 7-bit ASCII.
  2. A category directory name (consisting of characters from the range [a-z-_], starting with a character in the range [a-z]).
  3. Zero or more directory components of the form _more[0-9A-Z]*
  4. A final unique 'file' name consisting of hyphen-separated words ending with a recognised extension that indicates the MIME type of the underlying file; full details of the syntax of this part are given below.

The syntax of a Gallery image name is {word-}+{discardableword-}*[number-]AUTH.ext where:

  1. word is an alphanumeric sequence containing at least one letter; there must be at least one such word in the name.
  2. discardableword has the same syntax as word but comes from a small list of words that can trail the main image description and indicate some gross features of the image, e.g. 'bg' and 'mono'. These words are optional and are recognised as discardable only if they come after all non-discardable words.
  3. number is an optional decimal number consisting purely of the digits 0-9; any leading zeros are ignored rather than indicating octal.
  4. AUTH is the author's initials, all upper-case and all alphabetic; this is compulsory.
  5. ext is the extension indicating the image type; this is compulsory.

Note that other than the extension, all components are delimited by dashes, and there must be no spaces in the name. The extension may contain dots, and dots are not allowed elsewhere in the name.
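
The file-name syntax above can be approximated with a regular expression. The following is only a sketch under stated assumptions, not the code used by ExhibitName: the class and method names are hypothetical, the extension is assumed to consist of ASCII letters and digits separated by dots, discardable words are not distinguished from main words (they share the same syntax), and none of the length limits are checked.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch; not the ExhibitName implementation. */
final class FinalComponentSyntaxSketch {
    // One or more words, each alphanumeric with at least one letter and followed by a dash;
    // then an optional all-digit number and dash; then upper-case author initials;
    // then a dot and an extension that may itself contain dots.
    // ASSUMPTION: the extension character set [A-Za-z0-9] is a guess.
    private static final Pattern SHORT_NAME = Pattern.compile(
        "(?:(?=[0-9]*[A-Za-z])[A-Za-z0-9]+-)+"
        + "(?:[0-9]+-)?"
        + "[A-Z]+"
        + "\\.[A-Za-z0-9]+(?:\\.[A-Za-z0-9]+)*");

    /** Returns true if s matches the documented shape of a short (file) name. */
    static boolean looksLikeShortName(CharSequence s) {
        return s != null && SHORT_NAME.matcher(s).matches();
    }
}
```

Under these assumptions a name such as sunset-beach-mono-3-DHD.jpg is accepted, while DHD.jpg (no word) and sunset-dhd.jpg (lower-case initials) are rejected.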

For the purposes of sorting, the sort order is first by the {word-}+ portion (ASCII order), then by the author (ASCII order), then by the number portion (numerically), then by the {discardableword-}* portion (ASCII order), then by the extension (ASCII order).
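
The sort order described above could be sketched as a comparator over short (file) names. Everything named here (DocumentedSortOrderSketch, Parts, parse, order) is hypothetical and only illustrates the ordering rule; it assumes valid short names and a caller-supplied set of discardable words, and it treats a missing number as zero.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

/** Illustrative only: orders valid short names as the text describes. */
final class DocumentedSortOrderSketch {
    /** Parsed pieces of a short name (hypothetical helper type). */
    static final class Parts {
        final String stem;
        final String attrs;
        final int number;
        final String author;
        final String ext;

        Parts(String stem, String attrs, int number, String author, String ext) {
            this.stem = stem;
            this.attrs = attrs;
            this.number = number;
            this.author = author;
            this.ext = ext;
        }
    }

    private static boolean allDigits(String s) {
        if (s.isEmpty()) { return false; }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') { return false; }
        }
        return true;
    }

    /** Splits a valid short name; behaviour on invalid names is undefined. */
    static Parts parse(String shortName, Set<String> attrWords) {
        int dot = shortName.indexOf('.'); // dots occur only in the extension
        String ext = shortName.substring(dot + 1);
        List<String> tokens =
            new ArrayList<>(Arrays.asList(shortName.substring(0, dot).split("-")));
        String author = tokens.remove(tokens.size() - 1);
        int number = 0; // ASSUMPTION: absent number sorts as zero
        if (allDigits(tokens.get(tokens.size() - 1))) {
            number = Integer.parseInt(tokens.remove(tokens.size() - 1)); // decimal, leading zeros ignored
        }
        // Trailing discardable words are peeled off, always leaving at least one main word.
        int endOfStem = tokens.size();
        while (endOfStem > 1 && attrWords.contains(tokens.get(endOfStem - 1))) {
            endOfStem--;
        }
        return new Parts(
            String.join("-", tokens.subList(0, endOfStem)),
            String.join("-", tokens.subList(endOfStem, tokens.size())),
            number, author, ext);
    }

    /** Stem, then author, then number (numeric), then discardable words, then extension. */
    static Comparator<String> order(Set<String> attrWords) {
        return (a, b) -> {
            Parts x = parse(a, attrWords);
            Parts y = parse(b, attrWords);
            int c = x.stem.compareTo(y.stem);
            if (c == 0) { c = x.author.compareTo(y.author); }
            if (c == 0) { c = Integer.compare(x.number, y.number); }
            if (c == 0) { c = x.attrs.compareTo(y.attrs); }
            if (c == 0) { c = x.ext.compareTo(y.ext); }
            return c;
        };
    }
}
```

Because the number is compared numerically, sunset-2-DHD.jpg sorts before sunset-10-DHD.jpg, which a plain string comparison would get the other way round.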


Field Summary
private static int _iDirPrefixLen
          Length of prefix of intermediate directory components in name.
private static int _minFileLen
          Minimum length of file component of any valid name.
static java.lang.String intermediateDirPrefix
          Prefix of intermediate directory components in name.
static int MAX_ATTR_WORD_LENGTH
           
static int MAX_AUTH_INITIALS_LENGTH
          Maximum length of author-initial component.
static int MAX_NAME_LENGTH
          Maximum valid name length.
static int MAX_STEM_LENGTH
          Maximum allowable length of any name stem (ie just main words).
static int MAX_WORD_LENGTH
          Maximum allowable length of any single word.
static int MIN_AUTH_INITIALS_LENGTH
          Minimum length of author-initial component; strictly positive.
static int MIN_NAME_LENGTH
          Minimum valid name length.
private static java.util.SortedSet<java.lang.String> NO_ATTR_WORDS
          Immutable empty attribute word set.
static java.util.Comparator<java.lang.CharSequence> SIMPLE_SMART_ORDER
          A simple invariant comparator that sorts full exhibit names in a human-friendly order.
static char WORD_SEP
          The character used to separate words.
static java.lang.String WORD_SEPS
          The single character used to separate words as a String value for convenience.
 
Constructor Summary
ExhibitName()
           
 
Method Summary
static java.lang.CharSequence getAttributeWordsComponent(java.lang.CharSequence fullExhibitName, java.util.Set<java.lang.String> allAttrWords)
          Extract the attribute words component of a full exhibit name, assuming the name is valid.
static java.util.Enumeration<?> getAttributeWordsComponentEnumeration(java.lang.CharSequence fullExhibitName, java.util.Set<java.lang.String> allAttrWords)
          Extract the attribute words component of a full exhibit name as an Enumeration of String, assuming the name is valid.
static java.util.SortedSet<java.lang.String> getAttributeWordsComponentSortedSet(java.lang.CharSequence fullExhibitName, java.util.Set<java.lang.String> allAttrWords)
          Extract the attribute words component of a full exhibit name as a SortedSet of String, assuming the name is valid; never null.
static java.lang.CharSequence getAuthorComponent(java.lang.CharSequence exhibitName)
          Extract the author component of a valid full or short exhibit name, assuming the name is valid.
static java.lang.CharSequence getCategoryComponent(java.lang.CharSequence fullExhibitName)
          Extract the category component (top directory) of a full exhibit name, assuming the name is valid.
static java.lang.CharSequence getDirComponent(java.lang.CharSequence fullExhibitName)
          Extract the full directory component of a full exhibit name, assuming the name is valid.
static int getEndOfAttrWords(java.lang.CharSequence exhibitName)
          Find the index of the end of the attribute words for a short or long exhibit name; strictly positive.
static int getEndOfMainWords(java.lang.CharSequence exhibitName, int lastSlash, int endOfAttrWords, java.util.Set<java.lang.String> allAttrWords)
          Find the index of the end of the main words for a short or long exhibit name; strictly positive.
static java.lang.CharSequence getExtensionComponent(java.lang.CharSequence exhibitName)
          Extract the extension (without dot) of a valid full or short exhibit name, assuming the name is valid.
static java.lang.CharSequence getFileComponent(java.lang.CharSequence fullExhibitName)
          Extract the file component (short name) of a full exhibit name, assuming the name is valid.
static int[] getMainAndAttrWordComponentBoundaries(java.lang.CharSequence exhibitName, java.util.Set<java.lang.String> allAttrWords)
          Find end of main stem and of attribute words of the supplied short or full exhibit name.
static java.util.Enumeration<?> getMainWords(java.lang.CharSequence exhibitName, java.util.Set<java.lang.String> allAttrWords)
          Return Enumeration over main words of a valid full or short name; never null, never empty if the name is well-formed.
static java.lang.CharSequence getMainWordsComponent(java.lang.CharSequence exhibitName, java.util.Set<java.lang.String> allAttrWords)
          Extract the main words (stem) component of a valid full or short exhibit name; never null nor empty.
static java.lang.CharSequence getMainWordsComponentFromShortName(java.lang.CharSequence shortExhibitName, java.util.Set<java.lang.String> allAttrWords)
          Extract the main words (stem) component of a valid short exhibit name; never null nor empty.
static int getMainWordsCount(java.lang.CharSequence exhibitName, java.util.Set<java.lang.String> allAttrWords)
          Count the main words in the (stem) component of a valid full or short exhibit name; strictly positive.
static int getNumberInSeriesComponent(java.lang.CharSequence fullExhibitName)
          Extract the number-in-series component of a full exhibit name as a non-negative int, assuming the name is valid.
static java.lang.CharSequence getNumberInSeriesComponentAsString(java.lang.CharSequence exhibitName)
          Extract the number-in-series component of a full exhibit name as a String, assuming the name is valid.
static boolean validAttributeWord(java.lang.CharSequence s)
          Checks that the CharSequence passed to it is a valid attribute word.
static boolean validAuthorSyntax(java.lang.CharSequence s)
          Validates a set of author's initials for syntax; returns true if valid.
static boolean validAuthorSyntax(java.lang.CharSequence s, int start, int end)
          Validates a set of author's initials for syntax; returns true if valid.
static boolean validNameFinalComponentSyntax(java.lang.CharSequence finalNameComponent)
          Validates the syntax of the last component of a name; returns true if valid.
static boolean validNameInitialComponentSyntax(java.lang.CharSequence initialNameComponent)
          Validates the syntax of the first component of a name; returns true if valid.
static boolean validNameInitialComponentSyntax(java.lang.CharSequence initialNameComponent, int len)
          Validates the syntax of the first component of a name; returns true if valid.
static boolean validNameSyntax(java.lang.CharSequence name)
          Fully validates the syntax of a name; returns true if valid.
static boolean validNameSyntax(Name.ExhibitFull name)
          Fully validates the syntax of a name; returns true if valid.
static boolean validNameSyntaxBasic(java.lang.CharSequence name)
          Very quick basic set of name validity checks; returns true if valid.
static boolean validNameSyntaxBasic(Name.ExhibitFull name)
          Very quick basic set of name validity checks; returns true if valid.
static boolean validWord(java.lang.CharSequence s)
          Checks that the CharSequence passed to it is a valid word (main or attribute).
static boolean validWordCharacter(char c)
          Returns true if the character passed is a valid word character.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

WORD_SEP

public static final char WORD_SEP
The character used to separate words.

See Also:
Constant Field Values

WORD_SEPS

public static final java.lang.String WORD_SEPS
The single character used to separate words as a String value for convenience.

See Also:
Constant Field Values

MAX_NAME_LENGTH

public static final int MAX_NAME_LENGTH
Maximum valid name length. This is basically limited by maximum URL and UNIX-filename length, and the fact that we need some "decoration" overhead of exhibit names when embedded in filenames, URLs, etc.

See Also:
Constant Field Values

_minFileLen

private static final int _minFileLen
Minimum length of file component of any valid name. Such a minimal name must be of the form "a-A.a". In practice any real name must be longer since we will not have one-character author initials nor file extensions, but this will help us quickly discard directory entries such as "." and ".." for example.

See Also:
Constant Field Values

MIN_NAME_LENGTH

public static final int MIN_NAME_LENGTH
Minimum valid name length. Of form "a/a-A.a".

See Also:
Constant Field Values

MAX_WORD_LENGTH

public static final int MAX_WORD_LENGTH
Maximum allowable length of any single word. This is where the name is of the form a/word-A.a.

See Also:
Constant Field Values

MAX_STEM_LENGTH

public static final int MAX_STEM_LENGTH
Maximum allowable length of any name stem (ie just main words). This is where the name is of the form a/stem-A.a where the stem is one or more words.

See Also:
Constant Field Values

MAX_ATTR_WORD_LENGTH

public static final int MAX_ATTR_WORD_LENGTH
Maximum allowable length of any single attribute word; two less than the longest allowable main word.

See Also:
Constant Field Values

intermediateDirPrefix

public static final java.lang.String intermediateDirPrefix
Prefix of intermediate directory components in name.

See Also:
Constant Field Values

_iDirPrefixLen

private static final int _iDirPrefixLen
Length of prefix of intermediate directory components in name.

See Also:
Constant Field Values

MIN_AUTH_INITIALS_LENGTH

public static final int MIN_AUTH_INITIALS_LENGTH
Minimum length of author-initial component; strictly positive.

See Also:
Constant Field Values

MAX_AUTH_INITIALS_LENGTH

public static final int MAX_AUTH_INITIALS_LENGTH
Maximum length of author-initial component.

See Also:
Constant Field Values

SIMPLE_SMART_ORDER

public static final java.util.Comparator<java.lang.CharSequence> SIMPLE_SMART_ORDER
A simple invariant comparator that sorts full exhibit names in a human-friendly order. This is essentially a case-insensitive sort on the file component.

Ties are broken by a normal lexical ordering on the full names.

Discardable/attribute words are not discarded nor otherwise treated specially for this comparison.
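
A comparator with the behaviour described above might look like the following. This is a sketch only: the class and member names are hypothetical, and it assumes '/' is the directory separator and that String.CASE_INSENSITIVE_ORDER is an acceptable locale-independent case-insensitive comparison.

```java
import java.util.Comparator;

/** Hypothetical sketch of a "simple smart order"; not the actual field. */
final class SimpleSmartOrderSketch {
    /** Everything after the last '/', or the whole name if there is none. */
    static String fileComponent(CharSequence fullName) {
        String s = fullName.toString();
        return s.substring(s.lastIndexOf('/') + 1);
    }

    /** Case-insensitive on the file component; ties broken lexically on the full name. */
    static final Comparator<CharSequence> ORDER = (a, b) -> {
        int c = String.CASE_INSENSITIVE_ORDER.compare(fileComponent(a), fileComponent(b));
        return (c != 0) ? c : a.toString().compareTo(b.toString());
    };
}
```

With this sketch, art/apple-DHD.jpg and zoo/apple-DHD.jpg tie on the file component and are then ordered by their full names.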


NO_ATTR_WORDS

private static final java.util.SortedSet<java.lang.String> NO_ATTR_WORDS
Immutable empty attribute word set.

Constructor Detail

ExhibitName

public ExhibitName()
Method Detail

validNameInitialComponentSyntax

public static boolean validNameInitialComponentSyntax(java.lang.CharSequence initialNameComponent,
                                                      int len)
Validates the syntax of the first component of a name; returns true if valid. This is helpful to identify the roots of a directory scan, for example.

This ignores any characters from len onwards, so the portion from 0 to len-1 must be a complete valid initial directory, and the string passed must be at least len characters long.

Designed to be efficiently callable from validNameSyntax() without requiring creation of any objects.
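
As an illustration of an allocation-free check of this kind, the sketch below validates only the first len characters against the category syntax given in the class description ([a-z-_], starting with [a-z]). The names are hypothetical, and where the documentation requires the string to be at least len characters long, this sketch simply returns false instead.

```java
/** Hypothetical sketch; not the ExhibitName implementation. */
final class InitialComponentSketch {
    /** Checks characters 0..len-1 only; creates no objects. */
    static boolean validInitialComponent(CharSequence s, int len) {
        // ASSUMPTION: out-of-range len is rejected rather than treated as a caller error.
        if (s == null || len < 1 || len > s.length()) { return false; }
        char first = s.charAt(0);
        if (first < 'a' || first > 'z') { return false; }
        for (int i = 1; i < len; i++) {
            char c = s.charAt(i);
            boolean ok = (c >= 'a' && c <= 'z') || c == '-' || c == '_';
            if (!ok) { return false; }
        }
        return true;
    }
}
```

A caller scanning a full name can pass the index of the first '/' as len, so no substring needs to be created.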


validAuthorSyntax

public static boolean validAuthorSyntax(java.lang.CharSequence s,
                                        int start,
                                        int end)
Validates a set of author's initials for syntax; returns true if valid. This version examines part of a string.

Characters from start to end-1 are checked.

The start and end positions must be valid in the CharSequence passed and start must come before end.

Returns false if the string is null.

Parameters:
s - the value to be examined
start - the starting position of the author; must be valid offset within string
end - just after the end of the author; must be greater than start and no greater than the string length
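
A range-based check like this can be sketched as below, using the rule from the class description that author initials are all upper-case and all alphabetic. The names are hypothetical, the MIN_AUTH_INITIALS_LENGTH and MAX_AUTH_INITIALS_LENGTH limits are not applied because their values are not given here, and invalid ranges are rejected rather than treated as a caller error.

```java
/** Hypothetical sketch; not the ExhibitName implementation. */
final class AuthorSyntaxSketch {
    /** Checks characters start..end-1 for upper-case ASCII letters only. */
    static boolean validAuthor(CharSequence s, int start, int end) {
        if (s == null) { return false; }
        // ASSUMPTION: a bad range yields false instead of an exception.
        if (start < 0 || end > s.length() || start >= end) { return false; }
        for (int i = start; i < end; i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') { return false; }
        }
        return true;
    }

    /** Whole-string variant. */
    static boolean validAuthor(CharSequence s) {
        return s != null && validAuthor(s, 0, s.length());
    }
}
```

The range form lets a parser validate the initials in place, for example characters 7 to 9 of sunset-DHD.jpg, without extracting a substring first.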

validAuthorSyntax

public static boolean validAuthorSyntax(java.lang.CharSequence s)
Validates a set of author's initials for syntax; returns true if valid. The whole string must be a valid author ID (and not null).

Returns:
true iff s is not null and is a valid set of author initials

validWord

public static boolean validWord(java.lang.CharSequence s)
Checks that the CharSequence passed to it is a valid word (main or attribute). This means it must be non-zero length (and non-null), and consist only of letters and digits.


validAttributeWord

public static boolean validAttributeWord(java.lang.CharSequence s)
Checks that the CharSequence passed to it is a valid attribute word. This means it must be non-zero length (and non-null), and consist only of letters and digits and must not consist entirely of digits or upper-case letters (to avoid ambiguity with the number-in-series value and author).

The maximum attribute word length must still leave room for a single-letter main word and a dash, so it is two less than the longest allowable main word.
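
The word and attribute-word rules might be sketched as follows. The names are hypothetical, and two assumptions are made: "must not consist entirely of digits or upper-case letters" is read as "not all digits and not all upper-case letters", and the length limits are left out because their values are not stated here.

```java
/** Hypothetical sketch; not the ExhibitName implementation. */
final class WordSyntaxSketch {
    private static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z');
    }

    /** Non-null, non-empty, ASCII letters and digits only. */
    static boolean validWord(CharSequence s) {
        if (s == null || s.length() == 0) { return false; }
        for (int i = 0; i < s.length(); i++) {
            if (!isWordChar(s.charAt(i))) { return false; }
        }
        return true;
    }

    /** A valid word that cannot be mistaken for a number-in-series or author initials. */
    static boolean validAttributeWord(CharSequence s) {
        if (!validWord(s)) { return false; }
        boolean allDigits = true;
        boolean allUpper = true;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') { allDigits = false; }
            if (c < 'A' || c > 'Z') { allUpper = false; }
        }
        return !allDigits && !allUpper;
    }
}
```

Under this reading, bg and mono are acceptable attribute words, while 123 and BG are not.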


validNameInitialComponentSyntax

public static boolean validNameInitialComponentSyntax(java.lang.CharSequence initialNameComponent)
Validates the syntax of the first component of a name; returns true if valid. This is helpful to identify the roots of a directory scan, for example.


validWordCharacter

public static boolean validWordCharacter(char c)
Returns true if the character passed is a valid word character. A valid word character is an ASCII digit or letter (either case).

We test the most common cases first for speed.
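
A minimal sketch of such a test is shown below. The class name is hypothetical, and the ordering of the branches assumes lower-case letters are the most common case, which the documentation does not state.

```java
/** Hypothetical sketch; not the ExhibitName implementation. */
final class WordCharSketch {
    /** True for ASCII letters (either case) and ASCII digits only. */
    static boolean validWordCharacter(char c) {
        if (c >= 'a' && c <= 'z') { return true; } // ASSUMPTION: most common case
        if (c >= '0' && c <= '9') { return true; }
        return c >= 'A' && c <= 'Z';
    }
}
```

Explicit range checks are used rather than Character.isLetterOrDigit, because that method also accepts non-ASCII letters and digits.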


validNameFinalComponentSyntax

public static boolean validNameFinalComponentSyntax(java.lang.CharSequence finalNameComponent)
Validates the syntax of the last component of a name; returns true if valid. This is helpful when running a directory scan, for example.

This does not check that the author's initials or the extension are actually acceptable other than that they are syntactically valid.


validNameSyntaxBasic

public static boolean validNameSyntaxBasic(Name.ExhibitFull name)
Very quick basic set of name validity checks; returns true if valid. Short-cut where the type is statically known to be Name.ExhibitFull; just checks that the value is non-null and, if so, assumes it has already been validated.


validNameSyntaxBasic

public static boolean validNameSyntaxBasic(java.lang.CharSequence name)
Very quick basic set of name validity checks; returns true if valid. Very quick constant-time checks that the name is not null and is of a legitimate length.


validNameSyntax

public static boolean validNameSyntax(Name.ExhibitFull name)
Fully validates the syntax of a name; returns true if valid. Short-cut where the type is statically known to be Name.ExhibitFull; just checks that the value is non-null and, if so, assumes it has already been validated.


validNameSyntax

public static boolean validNameSyntax(java.lang.CharSequence name)
Fully validates the syntax of a name; returns true if valid. It attempts to be fast and to not create too many intermediate/temporary objects.

This does not attempt to check a name against current databases nor return any parsed components.


getFileComponent

public static java.lang.CharSequence getFileComponent(java.lang.CharSequence fullExhibitName)
Extract the file component (short name) of a full exhibit name, assuming the name is valid. Two exhibits should always be distinguishable by this component, also known as the "short" name.

If the argument is not a valid full exhibit name, the result is undefined.

See also ExhibitFullName.getShortName().
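
Extracting the file component amounts to taking everything after the last '/'. The sketch below uses a hypothetical class and method name and is not the actual implementation; it scans backwards and returns a subSequence of the argument.

```java
/** Hypothetical sketch; not the ExhibitName implementation. */
final class FileComponentSketch {
    /** Returns the part after the last '/', or the whole name if there is no '/'. */
    static CharSequence fileComponent(CharSequence fullName) {
        int lastSlash = -1;
        for (int i = fullName.length() - 1; i >= 0; i--) {
            if (fullName.charAt(i) == '/') {
                lastSlash = i;
                break;
            }
        }
        return fullName.subSequence(lastSlash + 1, fullName.length());
    }
}
```

For art/_more1/sunset-DHD.jpg this yields sunset-DHD.jpg.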


getCategoryComponent

public static java.lang.CharSequence getCategoryComponent(java.lang.CharSequence fullExhibitName)
Extract the category component (top directory) of a full exhibit name, assuming the name is valid. If the argument is not a valid full exhibit name, the result is undefined.


getDirComponent

public static java.lang.CharSequence getDirComponent(java.lang.CharSequence fullExhibitName)
Extract the full directory component of a full exhibit name, assuming the name is valid. This does not include the trailing directory separator.

If the argument is not a valid full exhibit name, the result is undefined.
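
The category and directory components differ only in which '/' ends them: the first for the category, the last for the full directory. A sketch with hypothetical names, assuming a valid full name (an argument with no '/' makes substring throw here, consistent with the result being undefined):

```java
/** Hypothetical sketch; not the ExhibitName implementation. */
final class DirComponentsSketch {
    /** Top directory: everything before the first '/'. */
    static String category(String fullName) {
        return fullName.substring(0, fullName.indexOf('/'));
    }

    /** Full directory: everything before the last '/', without the trailing separator. */
    static String dir(String fullName) {
        return fullName.substring(0, fullName.lastIndexOf('/'));
    }
}
```

For art/_more1/sunset-DHD.jpg the category is art and the directory is art/_more1; for a name with no intermediate directories the two are the same.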


getEndOfAttrWords

public static int getEndOfAttrWords(java.lang.CharSequence exhibitName)
Find the index of the end of the attribute words for a short or long exhibit name; strictly positive.

Parameters:
exhibitName - valid full or short exhibit name; never null

getEndOfMainWords

public static int getEndOfMainWords(java.lang.CharSequence exhibitName,
                                    int lastSlash,
                                    int endOfAttrWords,
                                    java.util.Set<java.lang.String> allAttrWords)
Find the index of the end of the main words for a short or long exhibit name; strictly positive.

Parameters:
exhibitName - valid full or short exhibit name; never null
lastSlash - position of last '/' or -1 for a short name
endOfAttrWords - as returned by getEndOfAttrWords()
allAttrWords - a Set of all legal attribute words (String values); may be empty but not null

getMainWordsComponentFromShortName

public static java.lang.CharSequence getMainWordsComponentFromShortName(java.lang.CharSequence shortExhibitName,
                                                                        java.util.Set<java.lang.String> allAttrWords)
Extract the main words (stem) component of a valid short exhibit name; never null nor empty. As getMainWordsComponentFrom() but optimised for this common case, and should be more efficient than extracting from a full name as there is less text to scan.

Parameters:
allAttrWords - a Set of all legal attribute words (String values); may be empty but not null

getMainAndAttrWordComponentBoundaries

public static int[] getMainAndAttrWordComponentBoundaries(java.lang.CharSequence exhibitName,
                                                          java.util.Set<java.lang.String> allAttrWords)
Find end of main stem and of attribute words of the supplied short or full exhibit name. This returns a three-element array, of which:

This has to be passed a set of all valid attribute words (as Strings which meet the requirements of validAttributeWord()) to be able to compute this boundary.

If the argument is not a valid full exhibit name, the result is undefined.

Parameters:
allAttrWords - a Set of all legal attribute words (String values); may be empty but not null

getMainWordsComponent

public static java.lang.CharSequence getMainWordsComponent(java.lang.CharSequence exhibitName,
                                                           java.util.Set<java.lang.String> allAttrWords)
Extract the main words (stem) component of a valid full or short exhibit name; never null nor empty. There is always at least one main word; so the result is always non-null and non-empty.

(This does not end nor start with a separator.)

If the argument is not a valid full or short exhibit name, the result is undefined.

Parameters:
allAttrWords - a Set of all legal attribute words (String values)

getMainWordsCount

public static int getMainWordsCount(java.lang.CharSequence exhibitName,
                                    java.util.Set<java.lang.String> allAttrWords)
Count the main words in the (stem) component of a valid full or short exhibit name; strictly positive.

Parameters:
allAttrWords - a Set of all legal attribute words (String values)

getMainWords

public static java.util.Enumeration<?> getMainWords(java.lang.CharSequence exhibitName,
                                                    java.util.Set<java.lang.String> allAttrWords)
Return Enumeration over main words of a valid full or short name; never null, never empty if the name is well-formed. Uses StringTokenizer, thus slow and inefficient.


getAttributeWordsComponent

public static java.lang.CharSequence getAttributeWordsComponent(java.lang.CharSequence fullExhibitName,
                                                                java.util.Set<java.lang.String> allAttrWords)
Extract the attribute words component of a full exhibit name, assuming the name is valid. If there are no attribute words this returns null, else the result is non-empty and is the fragment of the full name containing the attribute words, separated as usual.

If the argument is not a valid full exhibit name, the result is undefined.

Parameters:
allAttrWords - a Set of all legal attribute words (String values)

getAttributeWordsComponentEnumeration

public static java.util.Enumeration<?> getAttributeWordsComponentEnumeration(java.lang.CharSequence fullExhibitName,
                                                                             java.util.Set<java.lang.String> allAttrWords)
Extract the attribute words component of a full exhibit name as an Enumeration of String, assuming the name is valid. If there are no attribute words this returns null, else the result is non-empty and is an Enumeration of String values of the attributes in order.

If the argument is not a valid full exhibit name, the result is undefined.

Parameters:
allAttrWords - a Set of all legal attribute words (String values)

getAttributeWordsComponentSortedSet

public static java.util.SortedSet<java.lang.String> getAttributeWordsComponentSortedSet(java.lang.CharSequence fullExhibitName,
                                                                                        java.util.Set<java.lang.String> allAttrWords)
Extract the attribute words component of a full exhibit name as a SortedSet of String, assuming the name is valid; never null. If there are no attribute words this returns a (fixed, immutable) empty set.

Duplicate attribute words are automatically eliminated.

If the argument is not a valid full exhibit name, the result is undefined.

Parameters:
allAttrWords - a Set of all legal attribute words (String values)
Returns:
non-null, de-duped, alpha-sorted attribute words from the name
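
One way to get a de-duplicated, sorted result with a shared empty set is sketched below. All names are hypothetical; the sketch assumes a valid name, walks backwards from the author (skipping an optional number) while tokens are in allAttrWords, and always leaves at least one main word.

```java
import java.util.Collections;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical sketch; not the ExhibitName implementation. */
final class AttrWordsSortedSetSketch {
    private static final SortedSet<String> NO_ATTR_WORDS =
        Collections.unmodifiableSortedSet(new TreeSet<String>());

    private static boolean allDigits(String s) {
        if (s.isEmpty()) { return false; }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') { return false; }
        }
        return true;
    }

    static SortedSet<String> attrWords(CharSequence fullName, Set<String> allAttrWords) {
        String s = fullName.toString();
        int slash = s.lastIndexOf('/');
        String[] tokens = s.substring(slash + 1, s.indexOf('.', slash + 1)).split("-");
        int end = tokens.length - 1;                                  // exclude the author
        if (end > 0 && allDigits(tokens[end - 1])) { end--; }         // exclude the number
        SortedSet<String> result = new TreeSet<>();                   // sorts and de-dupes
        for (int i = end - 1; i >= 1 && allAttrWords.contains(tokens[i]); i--) {
            result.add(tokens[i]);
        }
        return result.isEmpty() ? NO_ATTR_WORDS : result;
    }
}
```

A TreeSet gives both the alphabetical order and the duplicate elimination in one step, so a name containing mono twice still yields a single mono entry.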

getNumberInSeriesComponentAsString

public static java.lang.CharSequence getNumberInSeriesComponentAsString(java.lang.CharSequence exhibitName)
Extract the number-in-series component of a full exhibit name as a String, assuming the name is valid. If the argument is not a valid full exhibit name, the result is undefined unless the name is the final (file) component of a valid name.

A missing number-in-series value causes us to return null.

This is:


getNumberInSeriesComponent

public static int getNumberInSeriesComponent(java.lang.CharSequence fullExhibitName)
Extract the number-in-series component of a full exhibit name as a non-negative int, assuming the name is valid. If the argument is not a valid full exhibit name, the result is undefined.

A missing number-in-series value causes us to return zero.

This is:

Returns:
positive number-in-series value, or zero if absent
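
The two number-in-series accessors could be sketched as follows. The names are hypothetical; the sketch assumes a valid name, treats an all-digit token immediately before the author initials as the number, and returns the string form with any leading zeros intact (whether the real method does so is not stated here).

```java
/** Hypothetical sketch; not the ExhibitName implementation. */
final class NumberInSeriesSketch {
    /** Returns the digits before the author initials, or null if there is no number. */
    static String numberAsString(CharSequence name) {
        String s = name.toString();
        String file = s.substring(s.lastIndexOf('/') + 1);
        String[] tokens = file.substring(0, file.indexOf('.')).split("-");
        if (tokens.length < 3) { return null; } // need at least word, number, author
        String candidate = tokens[tokens.length - 2];
        for (int i = 0; i < candidate.length(); i++) {
            char c = candidate.charAt(i);
            if (c < '0' || c > '9') { return null; }
        }
        return candidate;
    }

    /** Returns the number as a non-negative int, or zero if absent. */
    static int number(CharSequence name) {
        String n = numberAsString(name);
        return (n == null) ? 0 : Integer.parseInt(n); // decimal; leading zeros are ignored
    }
}
```

Integer.parseInt always parses in base 10 here, so 007 yields 7 rather than being read as octal.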

getAuthorComponent

public static java.lang.CharSequence getAuthorComponent(java.lang.CharSequence exhibitName)
Extract the author component of a valid full or short exhibit name, assuming the name is valid. If the argument is not a valid full exhibit name, the result is undefined unless the name is the final (file/short) component of a valid name.


getExtensionComponent

public static java.lang.CharSequence getExtensionComponent(java.lang.CharSequence exhibitName)
Extract the extension (without dot) of a valid full or short exhibit name, assuming the name is valid. If the argument is not a valid full exhibit name, the result is undefined unless the name is the final (file/short) component of a valid name.
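
Because dots are only allowed in the extension, the author and extension can both be located from the first dot in the file component. The sketch below uses hypothetical names and assumes a valid full or short name; for a multi-part extension it returns everything after the first dot, which is an assumption about the intended behaviour.

```java
/** Hypothetical sketch; not the ExhibitName implementation. */
final class AuthorAndExtensionSketch {
    /** Index of the first '.' in the file component. */
    private static int firstDot(String s) {
        return s.indexOf('.', s.lastIndexOf('/') + 1);
    }

    /** Everything after the first dot of the file component. */
    static String extension(CharSequence name) {
        String s = name.toString();
        return s.substring(firstDot(s) + 1);
    }

    /** The token between the last dash before the dot and the dot itself. */
    static String author(CharSequence name) {
        String s = name.toString();
        int dot = firstDot(s);
        int dash = s.lastIndexOf('-', dot);
        return s.substring(dash + 1, dot);
    }
}
```

For art/sunset-DHD.tar.gz this gives the author DHD and the extension tar.gz.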


DHD Multimedia Gallery V1.57.21

Copyright (c) 1996-2011, Damon Hart-Davis. All rights reserved.