Technology Overview

ISD Embedded Speech Recognition Technology

ISD SPEECH RECOGNITION SOLUTION

ISD, the company that set a new paradigm for high-quality voice recording and playback, now energizes speech-enabled consumer products with highly accurate, small-footprint, cost-effective automatic speech recognition (ASR). The ISD solution includes a turnkey embedded processor that enables rapid development of new applications. A complete product, including the speech recognition engine, the peripheral hardware, and the voice user interface, is offered to the designer.

ENDLESS APPLICATION POSSIBILITIES

A cost-effective solution for voice command and control lets consumers use technology that was previously accessible only by memorizing complex keystrokes and reading detailed instructions. Any application where the user’s eyes and hands are busy is ideally suited to the ISD command and control ASR technology. Development tools provided by ISD support vocabulary generation, voice prompt compilation, and user dialog design, making application development fast and easy.

SPEECH RECOGNITION ATTRIBUTES

• Embedded speech recognition processor
  - All recognition processing on-chip
• Supports speaker-independent continuous speech and digit input
• Allows speaker-dependent programming for voice macro command and voicetag storage
• Independent users have secure phone books through user-specified keywords
• True hands-free voicetag storage
• >99% accuracy for speaker-independent commands and digits
• Finite-state grammar with multiple topic organization

IDEAL APPLICATIONS

• Cordless and Feature Phones
  - Voice activated dialer and voice controlled answering device
• Wireless handsets
  - Voice activated dialer, voice macros, voice control of on-the-air services
• Automotive interior
  - Hands-free car kits
  - Accessory and navigation control
• Instrumentation
  - Hands-free front panel control
• Consumer appliances
  - Voice control over keypad functions

ISD Embedded Speech Recognition Processors

BLOCK DIAGRAM FOR COMPLETE RECOGNITION SYSTEM

[Block diagram. Components shown: System Control; 8051 Microcontroller (VUI); ISD Speech Recognition Processor; Voice CODEC; Speaker; Microphone; Non-Volatile Memory (voice tag storage, acoustic models, vocabulary)]

SPEECH TECHNOLOGY CAPABILITIES

The technology allows for continuous speech and digit input from independent speakers, as well as speaker-dependent user-programmed commands and name storage. The vocabulary is flexible, allowing the designer to create custom vocabularies for a wide variety of command and control applications. The ISD solution supports multiple topics, allowing for real-time recognition with high accuracy. The complete solution includes the ASR engine (processor and firmware), proprietary hardware optimized for speech processing, and application-specific voice user interfaces (VUI).
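As an illustration of what a finite-state grammar with multiple topic organization might look like, the following Python sketch models each topic as a small state machine and checks a recognized word sequence against the active topic only. The topic names, states, and vocabulary are hypothetical and chosen for illustration; they do not reflect ISD's actual grammar format or tools.

```python
# Hypothetical grammar: topic names, states, and words are illustrative only
# and are not taken from ISD's development tools.
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]

GRAMMAR = {
    "dialer": {
        "start": "idle",
        "accept": {"number"},
        "edges": {
            "idle": {"dial": "number_start"},
            "number_start": {d: "number" for d in DIGITS},
            "number": {d: "number" for d in DIGITS},
        },
    },
    "answering": {
        "start": "idle",
        "accept": {"done"},
        "edges": {
            "idle": {"play": "play", "erase": "erase"},
            "play": {"messages": "done"},
            "erase": {"messages": "done"},
        },
    },
}


def allowed_words(topic, state):
    """Words that are valid next in the given topic and state."""
    return sorted(GRAMMAR[topic]["edges"].get(state, {}))


def accepts(topic, words):
    """Return True if the word sequence is a complete command in the topic."""
    fsm = GRAMMAR[topic]
    state = fsm["start"]
    for word in words:
        next_state = fsm["edges"].get(state, {}).get(word)
        if next_state is None:
            return False  # word is not allowed at this point in the grammar
        state = next_state
    return state in fsm["accept"]
```

In this sketch, `allowed_words("answering", "idle")` lists only two candidate words, which shows how restricting the search to the active topic and state keeps the set of words to be compared small.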

COMMAND AND CONTROL SOLUTION

ISD’s speech recognition technology is ideally suited for command and control applications for consumer, communications, and instrumentation products that have complex interfaces. The traditional keypad and display interface can be completely replaced by an effective voice interface. The combination of ISD’s speaker-independent commands with speaker-dependent programming allows for both standard and custom control of products. Users can grasp more technology by replacing complex keystrokes and codes with simple voice commands. The ISD solution allows users to program unique voice-based macros in addition to standard commands. The VUI dialogs created by ISD make instant success a reality by providing an easy-to-understand natural interface with context-sensitive help prompts.

RECOGNITION TECHNOLOGY

The ISD engine uses a segmented sub-phoneme recognition process. The sampled speech utterance is split into distinct phonetic sounds, the smallest units of speech. Because these phonemes vary in both sound and duration, the processor must be able to determine boundaries between the sounds. Hidden Markov Models are used to hypothesize boundaries between sounds and to form probabilistic models for each possible combination.

The first step is feature extraction, where feature vectors are calculated from linear predictive coding (LPC) coefficients and cepstral values for individual 10 ms frames. The outputs are then classified by determining matches between the feature vectors and the stored phoneme models. The acoustic models for the phonemes are gathered from a large sample of speakers, allowing for a wide variation across accents, dialects, and gender. This allows the recognizer to associate the sound segments with a number of possible phonemes, enabling recognition when words are pronounced differently. The models are phonemes in context, which consist of phonemes along with transitional information to other phonemes.
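The feature extraction and classification steps described under Recognition Technology can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not ISD's implementation: the 8 kHz sample rate, the non-overlapping unwindowed frames, the model order, the use of LPC-derived cepstral values alone as the feature vector, and the nearest-model distance match (standing in for the probabilistic Hidden Markov Model scoring) are all assumptions made for illustration, as are the function names.

```python
def frame_signal(samples, sample_rate=8000, frame_ms=10):
    """Split samples into non-overlapping frames of frame_ms milliseconds."""
    size = sample_rate * frame_ms // 1000
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


def autocorrelation(frame, max_lag):
    """r[lag] = sum of x[n] * x[n - lag] over the frame."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order], where x[n] ~ sum a[k] * x[n - k]."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this order
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        error *= 1.0 - k * k
    return a[1:]


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral values of the all-pole model 1 / (1 - sum a[k] z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def frame_features(frame, order=8, n_ceps=8):
    """Feature vector for one frame: LPC-derived cepstral values."""
    r = autocorrelation(frame, order)
    return lpc_to_cepstrum(levinson_durbin(r, order), n_ceps)


def classify(vector, models):
    """Label of the stored model closest to the feature vector."""
    def squared_distance(label):
        return sum((x - y) ** 2 for x, y in zip(vector, models[label]))
    return min(models, key=squared_distance)
```

With the assumed 8 kHz sample rate, each 10 ms frame holds 80 samples. A caller would pass each frame from `frame_signal` through `frame_features` and then through `classify` with a dictionary of stored model vectors keyed by phoneme label.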

ISD is a registered trademark of ISD.

Printed in the U.S.A. ISDSRIS1-599

To Order Products or More Information:

ADDRESS
2727 N. First Street
San Jose, CA 95134

PHONE
1-800-677-0769 (US Only)
408-943-6666
408-544-1786 (Fax)

E-MAIL
info@isd.com

WEBSITE
www.isd.com
