+.th SPEAK I 8/15/73
+.if t .ds A \o"a\(ga"
+.if n .ds A a\b`
+.if t .ds v \|\(bv
+.sh NAME
+speak \*- word to voice translator
+.sh SYNOPSIS
+.bd speak
+[
+.bd \*-epsv
+] [ vocabulary [ output ] ]
+.sh DESCRIPTION
+.it Speak
+turns a stream of words
+into utterances and outputs them to a voice synthesizer,
+or to a specified output file.
+It has facilities for maintaining a vocabulary.
+It receives, from the standard input:
+.s3
+.lp +5 3
+\*- working lines: text of words separated by blanks
+.lp +5 3
+\*- phonetic lines: strings of phonemes for one word preceded
+and separated by commas.
+The phonemes may be followed by comma-percent then a `replacement
+part' \*- an ASCII string with no spaces.
+The phonetic code is given in vsp(VII).
+.lp +5 3
+\*- empty lines
+.lp +5 3
+\*- command lines: beginning with
+.bd !.
+The following command lines
+are recognized:
+.s3
+.lp +15 10
+\fB!r\fR file replace coded vocabulary from file
+.lp +15 10
+\fB!w\fR file write coded vocabulary on file
+.lp +15 10
+\fB!p\fR print parsing for working word
+.lp +15 10
+\fB!l\fR list vocabulary on standard output with phonetics
+.lp +15 10
+\fB!c\fR word copy phonetics from working word to
+specified word
+.lp +15 10
+\fB!d\fR print phonetics for working word
+.s3
+.i0
+Each working line replaces its predecessor.
+Its first word is the `working word'.
+Each phonetic line replaces the phonetics stored for the
+working word.
+In particular, a phonetic line of comma only deletes the
+entry for the working word.
+Each working line, phonetic line or empty line
+causes the working line to be uttered.
+The process terminates at the end of input.
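The four kinds of input line described above can be told apart by their first character. The following is a minimal sketch, not the program's actual parser: the function names `classify` and `parse_phonetic` are hypothetical, the phoneme spellings in the usage note are made up, and it assumes a phonetic line always starts with a comma, as "preceded and separated by commas" suggests.

```python
def classify(line):
    """Guess the kind of an input line from its first character (assumed rule)."""
    if line == "":
        return "empty"
    if line.startswith("!"):
        return "command"
    if line.startswith(","):
        return "phonetic"
    return "working"


def parse_phonetic(line):
    """Split a phonetic line into (phonemes, replacement part).

    The replacement part, if any, follows a comma-percent; a line that is
    a comma only yields no phonemes, which the manual treats as a deletion.
    """
    body, _, replacement = line.partition(",%")
    phonemes = [p for p in body.split(",") if p]
    return phonemes, replacement
```

Under these assumptions `parse_phonetic(",f,%ph")` gives the phonemes `["f"]` with replacement part `"ph"`, and `parse_phonetic(",")` gives an empty phoneme list.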
+.s3
+Unknown words are pronounced by rules, and failing that,
+are spelled.
+Spelling is done by taking each character of
+the word, prefixing it with *, and looking it up.
+Unspellable words burp.
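The spelling fallback can be sketched as below. This is an illustration only: the vocabulary is modeled as a plain dictionary, the name `spell` and the phoneme strings are hypothetical, and returning `None` stands in for the "burp".

```python
def spell(word, vocabulary):
    """Look up each character of word under the key '*' + character.

    Returns the list of stored phonetic strings, or None if any
    character has no entry (the unspellable case).
    """
    phonetics = []
    for ch in word:
        entry = vocabulary.get("*" + ch)
        if entry is None:
            return None  # unspellable: the real program burps here
        phonetics.append(entry)
    return phonetics


# Toy vocabulary with made-up phoneme strings.
toy_vocabulary = {"*a": "EY", "*b": "B,IY"}
```

With this toy vocabulary, `spell("ab", toy_vocabulary)` finds both letters, while `spell("az", toy_vocabulary)` returns `None` because `*z` is missing.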
+.s3
+.it Speak
+is initialized with a coded vocabulary stored in file
+/usr/lib/speak.m.
+The vocabulary option substitutes a different file for
+/usr/lib/speak.m.
+.s3
+A set of single letter options may
+appear in any order preceded by
+.bd \*-.
+Their meanings are:
+.s3
+.lp +8 4
+\fB\*-e\fR suppress English steps (4\*-8) below
+.lp +8 4
+\fB\*-p\fR suppress pronunciation by rule
+.lp +8 4
+\fB\*-s\fR suppress spelling
+.lp +8 4
+\fB\*-v\fR suppress voice output
+.s3
+.i0
+.ne4
+The steps of pronunciation by rule are:
+.s3
+.lp +5 5
+(1) If there were no lower case letters in the working line,
+fold all upper case letters to lower.
+.lp +5 5
+(2) Fold an initial cap to lower case, and try again.
+.lp +5 5
+(3) If the word has only one letter, or has no lower case vowels, quit.
+.lp +5 5
+(4) If there is a final
+.it s,
+strip it.
+.lp +5 5
+(5) Replace final \*-ie by \*-y.
+.lp +5 5
+(6) If any changes have been made, try whole word again.
+.lp +5 5
+(7) Locate probable long vowels and capitalize them.
+Mark probable silent \fIe\fR's.
+.lp +5 5
+(8) Put back the
+.it s
+stripped in (4), if any.
+.lp +5 5
+(9) Place # before and after word.
+.lp +5 5
+(10) Prefix word with
+.it %,
+and look up longest initial match
+in the stored table of words; if none, quit.
+.lp +5 5
+(11) Use phonemes from the stored phonetic string as
+pronunciation, and replace the matched stuff by the
+replacement part of the phonetic string.
+.lp +5 5
+(12) If anything remains, go to (10).
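Steps (9) through (12) amount to a loop of longest-prefix lookups. The sketch below is one possible reading, not the original code: the table is assumed to be a dictionary from `%`-prefixed fragments to a (phonemes, replacement part) pair, the name `pronounce_by_rule` and the table contents are hypothetical, and the `max_steps` guard is an addition to keep a badly formed table from looping forever.

```python
def pronounce_by_rule(word, table, max_steps=100):
    """Return a list of phoneme strings for word, or None if lookup fails."""
    rest = "#" + word + "#"                      # step (9)
    phonemes = []
    for _ in range(max_steps):                   # guard, not in the manual
        if not rest:                             # step (12): nothing remains
            return phonemes
        probe = "%" + rest                       # step (10)
        # Longest initial match: try the whole probe first, then shorter
        # prefixes, never shorter than '%' plus one character.
        key = next((probe[:n] for n in range(len(probe), 1, -1)
                    if probe[:n] in table), None)
        if key is None:
            return None                          # no match: quit
        sounds, replacement = table[key]
        if sounds:
            phonemes.append(sounds)
        # Step (11): the matched text gives way to the replacement part.
        rest = replacement + rest[len(key) - 1:]
    return None


# Toy table with made-up phonemes; '%x' leaves an 's' behind to be matched.
toy_table = {
    "%#": ("", ""),
    "%a": ("AE", ""),
    "%x": ("K", "s"),
    "%s": ("S", ""),
}
```

With this toy table the word `ax` is consumed as `#`, `a`, `x` (leaving `s`), `s`, `#`, while a word containing a letter with no entry makes the lookup quit.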
+.s3
+.i0
+Long vowels are located this way in step (7):
+.s3
+.lp +5 5
+(1) A \fIu\fR appearing in the context
+[^aeiou]u[^aeiouwxy][aeiouy]
+is assumed long.
+(The notation is just a regular expression \*A la ed(I).)
+.ft I
+(pustUlous)
+.ft R
+.lp +5 5
+(2) One of [aeo] appearing in the context
+[aeo][^aehiouwxy][ie][aou]
+or in the context [aeo][^aehiouwxy]ien is assumed long.
+The digram \fIth\fR behaves as a single letter in this test.
+.ft I
+(rAdium, facEtious, quOtient, carpAthian)
+.ft R
+.lp +5 5
+(3) If the first vowel in the word is \fIi\fR followed by one of
+\fIaou,\fR it is assumed long.
+.ft I
+(Iodine, dIameter, trIumph)
+.ft R
+.lp +5 5
+(4) If the only vowel in the word is final \fIe,\fR the vowel is
+assumed long.
+.ft I
+(bE, shE)
+.ft R
+.lp +5 5
+(5) If the only vowels in the word appear in the pattern
+[aeiouy][^aeiouwxy]S, where S is one of the suffixes
+.br
+.dt
+ \*-al \*-le \*-re \*-y
+.br
+.lp +5 0
+then the first vowel is assumed long.
+.ft I
+(glObal, tAble, lUcre, lAdy)
+.ft R
+.lp +5 5
+(6) If no suffix was found in (5),
+as many of these suffixes as possible are isolated from
+right to left.
+Stripping stops once \fIe\fR has been stripped,
+and \fIe\fR is not stripped before a suffix beginning with \fIe\fR.
+Each suffix is marked by inserting \*v just before the first letter, or
+just after \fIe\fR in those suffixes that begin with \fIe\fR.
+.br
+.dt
+ \*-able \*-ably \*-e \*-ed \*-en
+.br
+ \*-er \*-ery \*-est \*-ful \*-ly
+.br
+ \*-ing \*-less \*-ment \*-ness \*-or
+.lp +5 0
+.ft I
+(care\*vful\*vly, maj\*vor, fine\*vry, state\*v, caree\*vr)
+.ft R
+.lp +5 5
+(7) If the word, exclusive of suffixes, ends in \fIi\fR or \fIy\fR,
+and contains no earlier vowel, then \fIi\fR or \fIy\fR is assumed long.
+.ft I
+(pY \fR(from pie),\fI crY\*ving, lIe\*vd)
+.ft R
+.lp +5 5
+(8) If the first suffix begins with one of [aeio],
+then the vowel [aeiouy] in an immediately
+preceding pattern [^aeo][aeiouy][^aeiouwxy]
+is assumed long.
+The digram \fIth\fR behaves as a single letter in this test.
+.ft I
+(cAre\*vful\*vly, bAthe\*vd, mAj\*vor, pOt\*vable, port\*vable)
+.ft R
+.lp +5 5
+(9) In these exceptional cases no vowel is assumed long in the
+preceding step:
+.lp +10 5
+(i) before \fIg\fR, if there are any earlier vowels
+.ft I
+(postage\*v, stAge\*v, college\*v)
+.ft R
+.lp +10 5
+(ii) before \fIl\fR, if the vowel is \fIe\fR
+.ft I
+(travele\*vd)
+.ft R
+.lp +5 5
+(10) If the first suffix begins with one of [aeio],
+and the word exclusive of suffixes ends in
+[aeiouyAEIOUY]th, then
+the digram \fIth\fR is capitalized.
+.ft I
+(breaTH\*ving, blITHe\*vly)
+.ft R
+.lp +5 5
+(11) An attempt is made to
+recognize silent \fIe\fR in the middle of compound words.
+Such an \fIe\fR is marked by a following \*v, and preceding vowels, other than
+\fIe\fR, are assumed long as in step (8).
+Silent \fIe\fR is marked in the context
+[bdgmnprst][bdgpt]le[^aeioruy\*v]S,
+where S is any
+string that contains [aeiouy] but does not contain \*v or the end of the word.
+Silent \fIe\fR is also marked in the context
+[^aeiu][aiou][^aeiouwxy]e[^aeinoruy]S.
+.ft I
+(simple\*vton, fAce\*vguard, cAve\*vman, cavernous)
+.ft R
+.s0
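The first three long-vowel rules translate almost directly into modern regular expressions. The sketch below uses Python's `re` module rather than ed(I) notation; the function names are hypothetical, "vowel" in rule (3) is assumed to mean one of `aeiou`, and the digram `th` in rule (2) is handled by an explicit alternative. Only the manual's own example words are used.

```python
import re


def rule1(word):
    """Rule (1): u in the context [^aeiou]u[^aeiouwxy][aeiouy] is long."""
    return re.sub(r"(?<=[^aeiou])u(?=[^aeiouwxy][aeiouy])", "U", word)


def rule2(word):
    """Rule (2): [aeo] before one consonant (or th) and i/e plus a/o/u, or ien."""
    pattern = r"([aeo])(?=(?:th|[^aehiouwxy])(?:[ie][aou]|ien))"
    return re.sub(pattern, lambda m: m.group(1).upper(), word)


def rule3(word):
    """Rule (3): a first vowel i followed by one of a, o, u is long."""
    return re.sub(r"^([^aeiou]*)i(?=[aou])", r"\1I", word)
```

Each function capitalizes the vowel it judges long and leaves the rest of the word alone, so `rule1("pustulous")` marks only the second `u`.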
+.sh FILES
+/usr/lib/speak.m
+.sh "SEE ALSO"
+vs(VII), vs(IV)
+.sh DIAGNOSTICS
+`?' for an unknown command beginning with
+.bd !,
+or for an
+unreadable or unwritable vocabulary file.
+.sh BUGS
+Vocabulary overflow is unchecked.
+Excessively long words cause core dumps.
+Space is not reclaimed from deleted entries.