Challenge

TALK to ME!

In classical Greek mythology, Euterpe was one of the nine Muses, the goddesses of inspiration, learning and the arts. The name Euterpe means "well pleasing", and in more recent history it has been bestowed on objects ranging from flowers to battleships. For electronic engineers, however, "Euterpe" is about to take on an entirely different meaning. It is the name chosen by ST for a new chip family that simplifies the task of adding speech recognition, text-to-speech and other key voice-related functions to automotive systems, household appliances, instrumentation and many other applications that will benefit from the most natural of all man-machine interfaces.

ST's Euterpe digital voice processor is the most compact automotive-grade solution for adding voice processing capabilities. Housed in a TQFP80 package, the chip is built around a powerful 24-bit, 50 MIPS DSP core optimized for audio applications. The chip also contains 16 Kwords (48 KBytes) of ROM for the DSP code, 48 KBytes of data RAM, a 16-bit sigma-delta stereo CODEC, a memory manager for up to 32 Mbits of external Flash memory and an I2C serial interface. The chip is controlled through the serial bus by the system microcontroller.
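The memory figures quoted above can be cross-checked with a little arithmetic. The sketch below assumes that each program word is 24 bits wide (matching the DSP core) and that K and M denote 1024 and 1024 x 1024, neither of which the text states explicitly. The constant and function names are illustrative only.

```python
# Cross-check of the memory sizes quoted in the text.
# Assumptions (not stated in the article): program words are 24 bits wide,
# K = 1024 and M = 1024 * 1024.

WORD_BITS = 24                  # width of the DSP core, from the text
ROM_WORDS = 16 * 1024           # 16 Kwords of program ROM
FLASH_BITS = 32 * 1024 * 1024   # up to 32 Mbits of external Flash


def words_to_kbytes(words: int, word_bits: int) -> float:
    """Convert a word count to KBytes for a given word width in bits."""
    return words * word_bits / 8 / 1024


def bits_to_mbytes(bits: int) -> float:
    """Convert a bit count to MBytes."""
    return bits / 8 / (1024 * 1024)


rom_kbytes = words_to_kbytes(ROM_WORDS, WORD_BITS)
flash_mbytes = bits_to_mbytes(FLASH_BITS)

print(f"Program ROM: {rom_kbytes:.0f} KBytes")       # Program ROM: 48 KBytes
print(f"External Flash: {flash_mbytes:.0f} MBytes")  # External Flash: 4 MBytes
```

Under these assumptions, 16 Kwords of 24-bit code does come to the quoted 48 KBytes, and the 32 Mbits of addressable Flash corresponds to 4 MBytes.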

Euterpe was designed to provide a highly integrated but flexible system-on-chip solution for a wide range of voice-related applications, and the use of a programmable, optimized DSP core is the key to both its high performance and its versatility.

Using proven DSP algorithms developed by third parties that are acknowledged leaders in their fields, Euterpe can perform speech recognition, text-to-speech voice synthesis, speaker verification functions, noise cancellation, echo cancellation and many other voice processing functions. In addition, the large program memory means that system developers can also add their own proprietary code to achieve product differentiation without losing the benefits of building their system around a cost-effective standard product.

THE ULTIMATE MAN-MACHINE INTERFACE?

The ever-decreasing cost of embedded processors and memories is continually widening the range of appliances and other systems that can be given a high degree of "intelligence". In an ever-increasing number of cases, the problem is not the cost of adding the intelligence but the human interface.

For example, many VCRs allow the user to set up sophisticated recording schedules spread over multiple channels on different days. Few users, however, actually take the time and trouble to step through the command sequences needed to do this. Similarly, many central heating controllers allow the homeowner to program different heating cycles for weekdays and weekends but again, only a minority of users take advantage of this facility.

The problem with these and other intelligent appliances is that each has its own set of buttons to press and sequences to learn. What is needed is a single command language that will work for all appliances.

It has long been recognized that the ideal command language is ordinary human speech, but until very recently this has not been a practical solution.

This has been due partly to technical and economic factors, such as the cost and physical size of the computational resources needed, and partly to the need for sophisticated algorithms that can reliably analyze and interpret spoken words under real-world conditions such as background noise or echoes.

In recent years, there has been enormous progress on both fronts. Leading chip manufacturers such as ST have developed cost-effective system-on-chip technology, while specialists in the field have developed highly effective algorithms for speech recognition, noise cancellation and other voice processing functions.

Euterpe brings both of these developments together to provide a single-chip solution that makes highly sophisticated voice input/output cost-effective and simple to develop. For example, Euterpe can be supplied with proven noise or echo cancellation routines developed by NCT Group, Inc. These routines provide continuous, adaptive removal of background noise from speech, or eliminate the characteristic annoying echo of handsfree phones.

ST has also worked with leading speech recognition experts Lernout & Hauspie, whose state-of-the-art speech recognition software supports both speaker-independent and speaker-dependent modes, has multi-language support and is specially optimized for noisy environments.

Euterpe can be supplied with Lernout & Hauspie's ASR311 automatic speech recognition software. It features speaker-independent recognition of up to 450 words, speaker-dependent recognition via a training phase, quasi-connected digit recognition and multi-language capability. UK English, US English, German, French, Italian, Spanish and Japanese are initially supported. In addition, the software engine is language independent, with all language-dependent data residing in external Flash memory.

Euterpe can also provide reliable, accurate speaker verification and very high quality text-to-speech conversion, again using Lernout & Hauspie software. Speaker verification identifies the speaker by comparing the voice password with a previously analyzed sample, enhancing the security of physical or logical access control. The text-to-speech function converts ASCII text strings into high-quality vocal messages, allowing facilities such as listening to email messages while driving or providing confirmation of spoken commands. The text-to-speech function is shared between the Euterpe chip, which handles the language-independent functions, and the host microcontroller, which handles the language-dependent part.
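The comparison step in speaker verification can be pictured roughly as follows. This is a minimal sketch and not the Lernout & Hauspie algorithm, whose details are not given here. It assumes the voice password has already been reduced to a fixed-length feature vector, and it uses cosine similarity with an arbitrary acceptance threshold. The names `cosine_similarity` and `verify_speaker`, the threshold of 0.9 and the sample vectors are all hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector cannot match anything
    return dot / (norm_a * norm_b)


def verify_speaker(enrolled: Sequence[float],
                   candidate: Sequence[float],
                   threshold: float = 0.9) -> bool:
    """Accept the candidate if it is close enough to the enrolled sample.

    The threshold is a hypothetical value chosen for illustration.
    """
    return cosine_similarity(enrolled, candidate) >= threshold


# Made-up feature vectors standing in for analyzed voice passwords.
enrolled_sample = [0.9, 0.1, 0.4, 0.7]
same_speaker = [0.85, 0.15, 0.42, 0.66]
other_speaker = [0.1, 0.9, 0.8, 0.05]

print(verify_speaker(enrolled_sample, same_speaker))   # True
print(verify_speaker(enrolled_sample, other_speaker))  # False
```

A real system would choose the threshold to balance false acceptances against false rejections, which matters when the result controls physical or logical access.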

The advantages of voice processing are many and varied, but a single example of what will soon be possible should leave no doubt that this is, indeed, the future of the man-machine interface.

Imagine you are driving home, eagerly looking forward to watching a vital football match on your television.

You hit a major traffic jam and realize you will not get home until long after the match starts. No panic! You tell your car phone to send your VCR the message "Record The Football". Your car phone understands that it has to send the specified message to your VCR and does so. Your VCR understands "Record" and "Football", checks its electronic TV guide, identifies the program you want and sends a message back to you to confirm that it will record it.

When you arrive home, the match still has ten minutes to go so you decide to wait until the end and then watch the whole game.

You go to your kitchen and tell your freezer that you would like something Indian. It tells you that there is a Chicken Tikka in the third drawer so you remove it and read the cooking instructions. "Cook on medium heat for eight minutes, leave to stand for two minutes and then cook on full power for one minute", you say to your microwave.

"OK", it replies. "By the way, the telephone says it's got three messages for you. Do you want to hear them now or after the match?"

