Research V4 development

author Ken Thompson <ken@research.uucp>

Tue, 6 Nov 1973 00:27:00 +0000 (19:27 -0500)

committer Ken Thompson <ken@research.uucp>

Tue, 6 Nov 1973 00:27:00 +0000 (19:27 -0500)
author Ken Thompson <ken@research.uucp>
Tue, 6 Nov 1973 00:27:00 +0000 (19:27 -0500)
committer Ken Thompson <ken@research.uucp>
Tue, 6 Nov 1973 00:27:00 +0000 (19:27 -0500)
diff --git a/man/man1/speak.1 b/man/man1/speak.1

new file mode 100644 (file)

index 0000000..e9d01a8
--- /dev/null
+++ b/man/man1/speak.1
@@ -0,0 +1,251 @@
+.th SPEAK I 8/15/73
+.if t .ds A \o"a\(ga"
+.if n .ds A a\b`
+.if t .ds v \|\(bv
+.sh NAME
+speak \*- word to voice translator
+.sh SYNOPSIS
+.bd speak
+[
+.bd \*-epsv
+] [ vocabulary [ output ] ]
+.sh DESCRIPTION
+.it Speak
+turns a stream of words
+into utterances and outputs them to a voice synthesizer,
+or to a specified output file.
+It has facilities for maintaining a vocabulary.
+It receives, from the standard input 
+.s3
+.lp +5 3
+\*-    working lines: text of words separated by blanks
+.lp +5 3
+\*-    phonetic lines: strings of phonemes for one word preceded
+and separated by commas.
+The phonemes may be followed by comma-percent then a `replacement
+part' \*- an ASCII string with no spaces.
+The phonetic code is given in vsp(VII).
+.lp +5 3
+\*-    empty lines
+.lp +5 3
+\*-    command lines: beginning with
+.bd !.
+The following command lines
+are recognized:
+.s3
+.lp +15 10
+\fB!r\fR file  replace coded vocabulary from file
+.lp +15 10
+\fB!w\fR file  write coded vocabulary on file
+.lp +15 10
+\fB!p\fR       print parsing for working word
+.lp +15 10
+\fB!l\fR       list vocabulary on standard output with phonetics
+.lp +15 10
+\fB!c\fR word  copy phonetics from working word to
+specified word
+.lp +15 10
+\fB!d\fR       print phonetics for working word
+.s3
+.i0
+Each working line replaces its predecessor.
+Its first word is the `working word'.
+Each phonetic line replaces the phonetics stored for the
+working word.
+In particular, a phonetic line of comma only deletes the
+entry for the working word.
+Each working line, phonetic line or empty line
+causes the working line to be uttered.
+The process terminates at the end of input.
+.s3
+Unknown words are pronounced by rules, and failing that,
+are spelled.
+Spelling is done by taking each character of
+the word, prefixing it with *, and looking it up.
+Unspellable words burp.
+.s3
+.it Speak
+is initialized with a coded vocabulary stored in file
+/usr/lib/speak.m.
+The vocabulary option substitutes a different file for
+/usr/lib/speak.m.
+.s3
+A set of single letter options may
+appear in any order preceded by
+.bd \*-.
+Their meanings are:
+.s3
+.lp +8 4
+\fB\*-e\fR     suppress English steps (4\*-8) below
+.lp +8 4
+\fB\*-p\fR     suppress pronunciation by rule
+.lp +8 4
+\fB\*-s\fR     suppress spelling
+.lp +8 4
+\fB\*-v\fR     suppress voice output
+.s3
+.i0
+.ne4
+The steps of pronunciation by rule are:
+.s3
+.lp +5 5
+(1)    If there were no lower case letters in the working line,
+fold all upper case letters to lower.
+.lp +5 5
+(2)    Fold an initial cap to lower case, and try again.
+.lp +5 5
+(3)    If word has only one letter, or has no lower case vowels, quit.
+.lp +5 5
+(4)    If there is a final 
+.it s,
+strip it.
+.lp +5 5
+(5)    Replace final \*-ie by \*-y.
+.lp +5 5
+(6)    If any changes have been made, try whole word again.
+.lp +5 5
+(7)    Locate probable long vowels and capitalize them.
+Mark probable silent \fIe\fR's.
+.lp +5 5
+(8)    Put back the 
+.it s
+stripped in (4), if any.
+.lp +5 5
+(9)    Place # before and after word.
+.lp +5 5
+(10)   Prefix word with 
+.it %,
+and look up longest initial match
+in the stored table of words; if none, quit.
+.lp +5 5
+(11)   Use phonemes from the stored phonetic string as
+pronunciation, and replace the matched stuff by the
+replacement part of the phonetic string.
+.lp +5 5
+(12)   If anything remains, go to (10).
+.s3
+.i0
+Long vowels are located this way in step (7):
+.s3
+.lp +5 5
+(1)    A \fIu\fR appearing in context
+[^aeiou]u[^aeiouwxy][aieouy].
+(The notation is just a regular expression \*A la ed(I).)
+.ft I
+(pustUlous)
+.ft R
+.lp +5 5
+(2)    One of [aeo] appearing in the context 
+[aeo][^aehiouwxy][ie][aou]
+or in the context [aeo][^aehiouwxy]ien is assumed long.
+The digram \fIth\fR behaves as a single letter in this test.
+.ft I
+(rAdium, facEtious, quOtient, carpAthian)
+.ft R
+.lp +5 5
+(3)    If the first vowel in the word is \fIi\fR followed by one of
+\fIaou,\fR it is assumed long.
+.ft I
+(Iodine, dIameter, trIumph)
+.ft R
+.lp +5 5
+(4)    If the only vowel in the word is final \fIe,\fR the vowel is
+assumed long.
+.ft I
+(bE, shE)
+.ft R
+.lp +5 5
+(5)    If the only vowels in the word appear in the pattern
+[aeiouy][^aeiouwxy]S, where S is one of the suffixes
+.br
+.dt
+       \*-al   \*-le   \*-re   \*-y
+.br
+.lp +5 0
+then the first vowel is assumed long.
+.ft I
+(glObal, tAble, lUcre, lAdy)
+.ft R
+.lp +5 5
+(6)    If no suffix was found in (5),
+as many of these suffixes as possible are isolated from
+right to left.
+Stripping stops when \fIe\fR has been stripped,
+nor is \fIe\fR stripped before a suffix beginning with \fIe\fR.
+Each suffix is marked by inserting \*v just before the first letter, or
+just after \fIe\fR in those suffixes that begin with \fIe\fR.
+.br
+.dt
+       \*-able \*-ably \*-e    \*-ed   \*-en
+.br
+       \*-er   \*-ery  \*-est  \*-ful  \*-ly
+.br
+       \*-ing  \*-less \*-ment \*-ness \*-or
+.lp +5 0
+.ft I
+(care\*vful\*vly, maj\*vor, fine\*vry, state\*v, caree\*vr)
+.ft R
+.lp +5 5
+(7)    If the word, exclusive of suffixes, ends in \fIi\fR or \fIy\fR,
+and contains no earlier vowel, then \fIi\fR or \fIy\fR is assumed long.
+.ft I
+(pY \fR(from pie),\fI crY\*ving, lIe\*vd)
+.ft R
+.lp +5 5
+(8)    If the first suffix begins with one of [aeio],
+then the vowel [aeiouy] in an immediately
+preceding pattern [^aeo][aeiouy][^aeiouwxy]
+is assumed long.
+The digram \fIth\fR behaves as a single letter in this test.
+.ft I
+(cAre\*vful\*vly, bAthe\*vd, mAj\*vor, pOt\*vable, port\*vable)
+.ft R
+.lp +5 5
+(9)    In these exceptional cases no long letter is assumed in the
+preceding step:
+.lp +10 5
+(i)    before \fIg\fR, if there are any earlier vowels
+.ft I
+(postage\*v, stAge\*v, college\*v)
+.ft R
+.lp +10 5
+(ii)   \fIe\fR is not long before \fIl\fR
+.ft I
+(travele\*vd)
+.ft R
+.lp +5 5
+(10)   If the first suffix begins with one of [aeio],
+and the word exclusive of suffixes ends in 
+[aeiouyAEIOUY]th, then
+digram \fIth\fR is capitalized.
+.ft I
+(breaTH\*ving, blITHe\*vly)
+.ft R
+.lp +5 5
+(11)   An attempt is made to
+recognize silent \fIe\fR in the middle of compound words.
+Such an \fIe\fR is marked by a following \*v, and preceding vowels, other than
+\fIe\fR, are assumed long as in step (8).
+Silent \fIe\fR is marked in the context 
+[bdgmnprst][bdgpt]le[^aeioruy\*v]S,
+where S is any
+string that contains [aeiouy] but does not contain \*v or the end of the word.
+Silent \fIe\fR is also marked in the context
+[^aeiu][aiou][^aeiouwxy]e[^aeinoruy]S.
+.ft I
+(simple\*vton, fAce\*vguard, cAve\*vman, cavernous)
+.ft R
+.s0
+.sh FILES
+/usr/lib/speak.m
+.sh "SEE ALSO"
+vs(VII), vs(IV)
+.sh DIAGNOSTICS
+`?' for unknown command with 
+.bd !,
+or for
+unreadable or unwritable vocabulary file
+.sh BUGS
+Vocabulary overflow is unchecked.
+Excessively long words cause dumps.
+Space is not reclaimed from deleted entries.
author	Ken Thompson <ken@research.uucp>
	Tue, 6 Nov 1973 00:27:00 +0000 (19:27 -0500)
committer	Ken Thompson <ken@research.uucp>
	Tue, 6 Nov 1973 00:27:00 +0000 (19:27 -0500)