Technology Overview
ISD Embedded Speech Recognition Technology
ISD SPEECH RECOGNITION SOLUTION

ISD, the company that set a new paradigm for high-quality voice record
and playback, now energizes speech-enabled consumer products with highly
accurate, small-footprint, cost-effective automatic speech recognition
(ASR). The ISD solution includes a turnkey embedded processor that
enables rapid development of new applications. The designer is offered a
complete product, including the speech recognition engine, the
peripheral hardware, and the voice user interface.

ENDLESS APPLICATION POSSIBILITIES

A cost-effective solution for voice command and control lets consumers
use technology that was previously accessible only by memorizing complex
keystrokes and reading detailed instructions. Any application where the
user’s eyes and hands are busy is ideally suited to the ISD command and
control ASR technology. Development tools provided by ISD make
application development fast and easy, supporting vocabulary generation,
voice prompt compilation, and user dialog design.
SPEECH RECOGNITION ATTRIBUTES

• Embedded speech recognition processor
  - All recognition processing on-chip
• Supports speaker-independent continuous speech and digit input
• Allows speaker-dependent programming for voice macro command and
  voicetag storage
• Independent users have secure phone books through user-specified
  keywords
• True hands-free voicetag storage
• >99% accuracy for speaker-independent commands and digits
• Finite-state grammar with multiple topic organization

IDEAL APPLICATIONS

• Cordless and Feature Phones
  - Voice activated dialer and voice controlled answering device
• Wireless handsets
  - Voice activated dialer, voice macros, voice control of on-the-air
    services
• Automotive interior
  - Hands-free car kits
  - Accessory and navigation control
• Instrumentation
  - Hands-free front panel control
• Consumer appliances
  - Voice control over keypad functions
ISD Embedded Speech
Recognition Processors
SPEECH TECHNOLOGY CAPABILITIES

The technology allows for continuous speech and digit input from
independent speakers, as well as speaker-dependent user-programmed
commands and name storage. The vocabulary is flexible, allowing the
designer to create custom vocabularies for a wide variety of command and
control applications. The ISD solution supports multiple topics,
allowing for real-time recognition with high accuracy. The complete
solution includes the ASR engine (processor and firmware), proprietary
hardware optimized for speech processing, and application-specific voice
user interfaces (VUI).

BLOCK DIAGRAM FOR COMPLETE RECOGNITION SYSTEM

[Block diagram. Labeled blocks:]
• System Control
• 8051 Microcontroller (VUI)
• Speaker
• Microphone
• Voice CODEC
• ISD Speech Recognition Processor
• Non-Volatile Memory (voice tag storage, acoustic models, vocabulary)
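
The idea of a finite-state grammar organized into multiple topics can be
pictured with a small sketch. This is only an illustration, not ISD's
grammar format: the topic names, state names, and vocabulary below are
all invented for the example. Each topic is a table of states, and each
state lists the words that may follow, so at any moment the recognizer
only has to consider a handful of words.

```python
# Illustrative sketch only. Topics, states, and words are hypothetical,
# not taken from any ISD product.
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def digit_state():
    """Arcs for a state that loops on digits until the word 'send'."""
    arcs = {word: "DIGITS" for word in DIGIT_WORDS}
    arcs["send"] = "END"
    return arcs


# topic -> state -> {word: next_state}
GRAMMARS = {
    "dialer": {
        "START": {"call": "NAME", "dial": "DIGITS"},
        "NAME": {"home": "END", "office": "END"},
        "DIGITS": digit_state(),
    },
    "answering_machine": {
        "START": {"play": "WHAT", "delete": "WHAT"},
        "WHAT": {"messages": "END", "greeting": "END"},
    },
}


def allowed_words(topic, state):
    """Words the recognizer needs to consider in this topic and state."""
    return sorted(GRAMMARS[topic].get(state, {}))


def accepts(topic, words):
    """True if the word sequence is a complete path through the topic."""
    state = "START"
    for word in words:
        arcs = GRAMMARS[topic].get(state, {})
        if word not in arcs:
            return False
        state = arcs[word]
    return state == "END"
```

With the hypothetical "dialer" topic active, `accepts("dialer", ["call",
"home"])` is true while `accepts("dialer", ["play", "messages"])` is
false, because "play" belongs to a different topic. Restricting the
active word list in this way is one plausible reason a topic
organization helps accuracy and speed.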
RECOGNITION TECHNOLOGY

The ISD engine uses a segmented sub-phoneme recognition process. The
sampled speech utterance is split into distinct phonetic sounds, the
smallest units of speech. Because these phonemes vary in both sound and
duration, the processor must be able to determine the boundaries between
the sounds. Hidden Markov Models are used to hypothesize boundaries
between sounds and to form probabilistic models of each possible
combination.
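
As a rough illustration of how a Hidden Markov Model can hypothesize
boundaries between sounds, the sketch below runs a Viterbi alignment
over a left-to-right chain of states, one state per sound. It is not
ISD's implementation: the function names, the table of per-frame
log-likelihoods, and the transition probabilities are all assumptions
made for the example.

```python
import math


def viterbi_path(frame_loglik, log_stay, log_advance):
    """Best left-to-right state sequence for a run of frames.

    frame_loglik[t][s] is the (assumed, precomputed) log-likelihood of
    frame t under state s. The path must start in state 0 and end in the
    last state; at each frame it may stay put or advance by one state.
    """
    n_frames = len(frame_loglik)
    n_states = len(frame_loglik[0])
    if n_frames < n_states:
        raise ValueError("need at least one frame per state")
    neg_inf = float("-inf")
    score = [[neg_inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    score[0][0] = frame_loglik[0][0]
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = score[t - 1][s] + log_stay
            advance = score[t - 1][s - 1] + log_advance if s > 0 else neg_inf
            if advance > stay:
                best, back[t][s] = advance, s - 1
            else:
                best, back[t][s] = stay, s
            score[t][s] = best + frame_loglik[t][s]
    # Trace back from the final state to recover the state per frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def boundaries(path):
    """Frame indices at which the aligned state changes."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]
```

Given six frames whose likelihoods favor state 0 for two frames, state 1
for three frames, and state 2 for one frame, `boundaries(viterbi_path(...))`
returns `[2, 5]`, the two hypothesized boundaries. A real recognizer
would score many competing state chains rather than a single one.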
The first step is feature extraction, where feature vectors are
calculated from LPC coefficients and cepstral values for individual
10 ms frames. The outputs are then classified by determining matches
between the feature vectors and the stored phoneme models. The acoustic
models for the phonemes are gathered from a large sample of speakers,
allowing for wide variation across accents, dialects, and genders. This
allows the recognizer to associate the sound segments with a number of
possible phonemes, enabling recognition when words are pronounced
differently. The models are phonemes in context, which consist of
phonemes along with transitional information to other phonemes.
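
The feature extraction step can be sketched as follows. This is a
minimal textbook version under stated assumptions, not ISD's firmware:
the 8 kHz sample rate, the LPC order of 10, the Hamming window, the
non-overlapping frames, and all function names are choices made for the
example. Each 10 ms frame is windowed, its autocorrelation is computed,
the Levinson-Durbin recursion yields LPC coefficients, and a standard
recursion converts those to cepstral values.

```python
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] and residual error from r."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame; keep the coefficients found so far
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral values c[1..n_ceps] of the all-pole model 1/(1 - sum a_k z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def frame_features(samples, sample_rate=8000, order=10):
    """One cepstral feature vector per non-overlapping 10 ms frame."""
    frame_len = int(round(sample_rate * 0.010))
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1)))
                    for i, x in enumerate(frame)]
        r = autocorrelation(windowed, order)
        if r[0] == 0.0:
            features.append([0.0] * order)  # silent frame
            continue
        a, _ = levinson_durbin(r, order)
        features.append(lpc_to_cepstrum(a, order))
    return features
```

At the assumed 8 kHz rate a 10 ms frame holds 80 samples, so 240 samples
yield three feature vectors of ten cepstral values each. These vectors
are what a classifier would then compare against stored phoneme models.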
COMMAND AND CONTROL SOLUTION

ISD’s speech recognition technology is ideally suited for command and
control applications in consumer, communications, and instrumentation
products that have complex interfaces. The traditional keypad and
display interface can be completely replaced by an effective voice
interface. The combination of ISD’s speaker-independent commands with
speaker-dependent programming allows for both standard and custom
control of products. Users can access more technology by replacing
complex keystrokes and codes with simple voice commands. The ISD
solution allows users to program unique voice-based macros in addition
to standard commands. The VUI dialogs created by ISD make instant
success a reality by providing an easy-to-understand natural interface
with context-sensitive help prompts.
ISD is a registered trademark of ISD.
Printed in the U.S.A. ISDSRIS1-599
To Order Products or for More Information:
ADDRESS: 2727 N. First Street, San Jose, CA 95134
PHONE: 1-800-677-0769 (US Only), 408-943-6666, 408-544-1786 (Fax)
E-MAIL: info@isd.com
WEB SITE: www.isd.com