    -c [ NOASCII DH ]
   
          Chop up the words into separate characters before doing
          the alignment.  It is generally not the practice of the
          ARPA community to score at the  character  level.   The
          intent  of  this option is to be able to score Mandarin
          Chinese at the character level.  The  option  "NOASCII"
          does  not  separate  characters if they are ASCII.  The
          option "DH"  deletes  hyphens  from  the  ref  and  hyp
          strings before alignment.  This option only works using
          the DP alignment algorithm.  (-c & -d are exclusive)
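
          ex.  A hypothetical invocation scoring at the character
          level while leaving ASCII words intact and deleting
          hyphens (file names are illustrative):

               sclite -r ref.trn trn -h hyp.trn trn -c NOASCII DH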
   
   -d
   
          Use GNU diff for alignments  rather  than  the  default
          dynamic programming.  (-c & -d are exclusive)
   
    -F
   
          Perform the alignment using a cost function which counts
          fragments (words ending or beginning with a hyphen)  as
          correct if the spelling up to the  hyphen  matches  the
          spelling of the hypothesized word.
          Options -F and -d are exclusive.
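
          ex.  Under -F, a reference fragment "EVALU-"  would  be
          counted  as  correct  against  the  hypothesized   word
          "EVALUATION", since the spelling  up  to  the  hyphen
          matches.  (Illustrative word pair.)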
   
    -L LM
   
          Define the CMU-Cambridge Statistical Language Modeling
          Toolkit v2 language model file to be 'LM'.  The LM file
          must be created using the idngram2lm program.  (See the
          toolkit documentation for details on how to  build  the
          language model.)  Currently, SCTK supports 1,  2  and
          3-grams.  The language model is  used  to  compute  an
          individual weight for each word in the  reference  and
          hypothesis strings.  The weight  is  defined  to  be
          Log2(P(word|context)).  Each pair of aligned strings is
          considered to be independent, so there is  no  context
          for the initial words of each pair.  The word  weights
          are used in two ways: first, as a  method  to  define
          word-to-word distances for  word-weight-mediated
          alignment, and second,  to  perform  weighted  word
          scoring.  Out-of-vocabulary words  get  the  default
          weight of 20.0, and optionally deletable words  get  a
          default weight of 0.0.
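
          ex.  A hypothetical invocation (file  names  are
          illustrative; 'eval.binlm' stands for an  LM  built
          with idngram2lm):

               sclite -r ref.trn trn -h hyp.trn trn -L eval.binlm

          With a 3-gram LM, the weight assigned  to  the  third
          word of "THE QUICK FOX" would be Log2(P(FOX|THE QUICK)).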
   
    -m [ ref | hyp ]
   
          When scoring a hypothesis ctm file against a  reference
          stm file, the time spans of  the  two  may  not  match;
          i.e., the start time of the first word/segment  or  the
          end time of the last word/segment may differ.
	
          When this option is used, the alignment phase of scoring
          ignores any segment or word (depending on the option(s)
          used) which is not in the time  span  of  the  opposite
          file.  The time span of a file is defined to extend from
          the start time of the first time mark to the  end  time
          of the last time mark.
	
          The "ref" option  reduces  the  reference  segments  to
          those which are within the hypothesis file time span.
	
          The "hyp" option reduces the hypothesis words to  those
          which are within the reference file time span.
	
          Both "ref" and "hyp" may be used simultaneously.
	
          The  argument  -m  by  itself  defaults  to  '-m  ref'.
          Exclusive with -d.
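
          ex.  A hypothetical  invocation  (file  names  are
          illustrative):

               sclite -r ref.stm stm -h hyp.ctm ctm -m ref

          This drops any reference segment lying outside the time
          span of hyp.ctm before alignment.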
   
 
   -s
   
          Do case-sensitive alignments.  Otherwise all  input  is
          mapped to a single case before scoring.  Of course, GB-
          and EUC-encoded text data is never case-converted.
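
          ex.  Without -s, the pair "boston" / "BOSTON" is scored
          as correct; with -s, it is scored as  a  substitution.
          (Illustrative word pair.)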
   
    -S algo1 lexicon [ ASCIITOO ] 
	
          The '-S' option performs an inferred word  segmentation
          alignment algorithm.  This option is  intended  to  be
          used for the LVCSR evaluation of Mandarin  Chinese.   A
          problem with scoring Mandarin at the word level is  the
          lack of clearly defined words in Mandarin  text.   This
          option implements an  algorithm  which,  given  a  word
          segmentation for the reference string and  a  "lexicon"
          of legal words, computes  a  minimal  error  rate  word
          alignment.  The algorithm is as follows:
	  
 
	  
          -  Convert the previously  word-segmented  reference
             string into a word network.

          -  Convert the hypothesis  text  to  a  string  of
             characters, each character representing  a  word.
             The data is then converted to a network.

          ex.    * --- A --- * --- T --- * --- O --- *
 
          -  Consider all possible sequences of characters through
             the network.  If a sequence creates a word which  is
             represented in the lexicon, add an arc to the network
             representing the  word.   The  maximum  number  of
             characters per word is limited to the  maximum  word
             length in the lexicon.

                               ,-------- TO -------.
                              /                     \
          ex.    * --- A --- * --- T --- * --- O --- *
                  \                     /
                   `------- AT --------'

          -  DP align the reference and hypothesis networks, and
             extract a minimal cost path.
	  
          The supplied "lexicon" must be a sorted  list  of  word
          records,  each  separated by a newline.  Only the first
          column, separated by whitespace, is read  in  and  used
          for  the  lexicon.   By  default,  the  algorithm  only
          separates hypothesis characters  that  are  GB  or  EUC
          encoded.   If  the  option  "ASCIITOO"  is  used, ASCII
          hypothesis words are also converted  to  characters  in
          step 2.
	  
          Exclusive with -d.
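
          ex.  A lexicon fragment consistent with  the  network
          example above (only the first  whitespace-separated
          column is used):

               A
               AT
               TO

          and a hypothetical invocation (file names are
          illustrative):

               sclite -r ref.trn trn -h hyp.trn trn -S algo1 words.lex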
	
    -S algo2 lexicon [ ASCIITOO ] 
   
          Perform a similar algorithm as described in '-S algo1',
          except the roles of  the  reference  and  hypothesis
          transcripts are reversed.   In  this  algorithm,  the
          segmentation of the hypothesis text is held  constant,
          while the reference transcript undergoes the process of
          conversion to characters, with arcs added to the network
          for words found in the lexicon.   Both  "lexicon"  and
          "ASCIITOO" have the same usage as in algo1.
          Exclusive with -d.
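
          ex.  A hypothetical  invocation  (file  names  are
          illustrative):

               sclite -r ref.trn trn -h hyp.trn trn -S algo2 words.lex ASCIITOO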
   
    -T
    
    -w wwl_file
   
          Define the word-weight list (WWL) file to be 'wwl_file'.
          The WWL file defines an arbitrary weight for each  word
          in the lexicon.  The weights are used  in  two  ways:
          first, as a method to define word-to-word distances for
          word-weight-mediated alignment, and second, to  perform
          weighted word scoring.  If the supplied WWL filename is
          "unity", no file of weights is read in;  instead,  this
          is a shorthand notation to use a weight of 1.0 for  all
          words.  Optionally deletable words get a default weight
          of 0.0 (even if "unity" is supplied  as  the  WWL
          filename).
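
          ex.  A hypothetical invocation using unit weights (file
          names are illustrative):

               sclite -r ref.trn trn -h hyp.trn trn -w unity
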
          The format of the WWL file is as follows.  Comment lines
          begin with double semi-colons.  There are two forms  of
          "special" comment lines.  The first defines the heading
          labels for each column in the table.  The  format  for
          this line is:

               ;; 'Headings' '<COL1>' '<COL2>' '<COL3>' ....

          The label for column 1 should be "Word Spelling", since
          this column is the word's text.  The labels for columns
          2 through 10 are defined by the user.

          The second "special" comment line defines the  default
          weight applied to out-of-vocabulary words, if any exist.
          The format for this line is:

               ;; Default missing weight '<number>'

          'number' must be a floating point number.

          The remainder of the file consists of word records, each
          word record separated by a newline.  The format of each
          record is:

               <WORD_TEXT> <WEIGHT_1> <WEIGHT_2> . . .

          There should be no whitespace at the beginning  of  the
          line, and the word texts cannot include whitespace.  The
          remainder of the line consists of  whitespace-separated
          floating point weights; up to a maximum of  10  weights
          can be assigned per word.

          NOTE: The current version of SCTK  only  utilizes  the
          first weight.
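
          ex.  A minimal WWL file consistent with the format above
          (words and weights are illustrative):

               ;; 'Headings' 'Word Spelling' 'Weight'
               ;; Default missing weight '20.0'
               ABOUT 4.5
               AREA 6.0
               ZONE 11.25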