AT&T Natural Voices® Speech Engine for Vox Proxy!


Right Seat Software is proud to offer AT&T Labs' Natural Voices Text-To-Speech engines for Vox Proxy.

The AT&T Natural Voices TTS Engine is the most realistic, human-sounding synthetic speech system available on the market today. From the first synthesized speech system in 1939 to today, AT&T has long been the world's speech technology pioneer, introducing the highest quality and most robust speech systems available anywhere. Now, Right Seat Software is bringing Natural Voices to Vox Proxy!

Individual voices are available in U.S. English, U.K. English, German, Latin American Spanish, Parisian French, and Canadian French. Listen to samples of all the available voices by clicking on one of the buttons below.

 

Listen to the standard voices included with Vox Proxy

Paul (Tru-Voice)
Susan (Microsoft female voice)
 

Listen to samples of AT&T Natural Voices

Voices come in both 8KHz and 16KHz versions. The 16KHz voices are much clearer, but their files are much larger (see the system requirements and limitations below).

 

Voice (samples available at 8KHz and 16KHz)

  Mike (U.S. English)
  Rich (U.S. English)
  Mel (U.S. English)
  Ray (U.S. English)
  Crystal (U.S. English)
  Claire (U.S. English)
  Julia (U.S. English)
  Lauren (U.S. English)
  Anjali (British English, India)
  Audrey (U.K. English)
  Charles (U.K. English)
  Rosa (Latin-American Spanish)
  Alberto (Latin-American Spanish)
  Klara (German)
  Reiner (German)
  Alain (Parisian French)
  Juliette (Parisian French)
  Arnaud (Canadian French)

These samples are WAV files that were created using the actual installed speech engines.
 

Ordering Information

Click here to order now. You must be an owner or new purchaser of Vox Proxy to order these speech engines. Be sure to read the limitations below before ordering.

To get started, you must order the Natural Voices Engine, which comes with the Mike and Crystal voices. You may then add any combination of the available voices, in any language. Each additional voice is shipped on a separate installable CD. Because of the large file sizes, these voices are not available as downloads.


Prices:

Natural Voices Engine, with Mike and Crystal 8KHz voices     $42.00
Natural Voices Engine, with Mike and Crystal 16KHz voices    $47.00
Each additional 8KHz voice (discounts for more than one)     $20.00
Each additional 16KHz voice (discounts for more than one)    $25.00
Engine with all 8 U.S. English voices at 8KHz                $136.80  Limited-time offer
Engine with all 8 U.S. English voices at 16KHz               $165.50  Limited-time offer
Engine with all 18 voices at 8KHz                            $232.80  Limited-time offer
Engine with all 18 voices at 16KHz                           $285.50  Limited-time offer
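
To compare the bundles with buying voices one at a time, the arithmetic can be sketched as below. This is only an illustration: the multi-voice discounts are not spelled out in the price list, so the one-at-a-time figure is an upper bound computed from the undiscounted per-voice prices, and the function names are made up for this example.

```python
# Prices copied from the price list above (U.S. dollars).
ENGINE_PRICE = {"8KHz": 42.00, "16KHz": 47.00}       # engine, includes Mike and Crystal
EXTRA_VOICE_PRICE = {"8KHz": 20.00, "16KHz": 25.00}  # each additional voice, undiscounted
BUNDLE_PRICE = {
    ("8KHz", 8): 136.80,
    ("16KHz", 8): 165.50,
    ("8KHz", 18): 232.80,
    ("16KHz", 18): 285.50,
}


def undiscounted_price(rate: str, total_voices: int) -> float:
    """Engine plus extra voices at list price, ignoring multi-voice discounts."""
    if total_voices < 2:
        raise ValueError("the engine always includes two voices (Mike and Crystal)")
    return ENGINE_PRICE[rate] + EXTRA_VOICE_PRICE[rate] * (total_voices - 2)


def bundle_saving(rate: str, total_voices: int) -> float:
    """Upper bound on what a bundle saves compared with undiscounted list prices."""
    return round(undiscounted_price(rate, total_voices) - BUNDLE_PRICE[(rate, total_voices)], 2)


if __name__ == "__main__":
    for (rate, count) in BUNDLE_PRICE:
        print(rate, count, "voices: saves up to", bundle_saving(rate, count))
```

For example, the engine plus the six other U.S. English voices at 8KHz comes to $42.00 + 6 x $20.00 = $162.00 at list price, against $136.80 for the bundle.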

Minimum System Requirements:

  • Windows 98se, Me, 2000, or XP
  • 512MB RAM
  • Pentium-III or better; 500MHz or greater CPU speed
  • 500MB hard disk space for the Speech Engine with Mike and Crystal 8KHz
  • 1.5GB hard disk space for the Speech Engine with Mike and Crystal 16KHz
  • Approximately 250MB hard disk space for EACH ADDITIONAL 8KHz VOICE
  • Approximately 700MB hard disk space for EACH ADDITIONAL 16KHz VOICE
  • Windows XP must be set for Best Performance (Control Panel/System/Advanced/Performance/Settings)
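
To estimate how much disk space a particular order needs, the figures in the list above can be combined as in this rough sketch. It assumes 1.5GB is about 1500MB and treats the per-voice sizes as the approximations they are; the function name is illustrative only.

```python
# Approximate sizes in MB, taken from the Minimum System Requirements list.
BASE_MB = {"8KHz": 500, "16KHz": 1500}       # engine with Mike and Crystal
PER_VOICE_MB = {"8KHz": 250, "16KHz": 700}   # each additional voice


def estimate_disk_mb(rate: str, additional_voices: int) -> int:
    """Return the approximate disk space (MB) for the engine plus extra voices."""
    if rate not in BASE_MB:
        raise ValueError("rate must be '8KHz' or '16KHz'")
    if additional_voices < 0:
        raise ValueError("additional_voices cannot be negative")
    return BASE_MB[rate] + PER_VOICE_MB[rate] * additional_voices


if __name__ == "__main__":
    # All 8 U.S. English voices means 6 voices beyond Mike and Crystal.
    print(estimate_disk_mb("8KHz", 6))    # 2000
    # All 18 voices means 16 voices beyond Mike and Crystal.
    print(estimate_disk_mb("16KHz", 16))  # 12700
```

By this estimate, the full set of 18 voices at 16KHz needs roughly 12.7GB of free disk space, compared with about 4.5GB at 8KHz.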

Using Natural Voices in Vox Proxy

Characters can speak using Natural Voices in any of four ways:

  1. Voice= option on the "Show character" wizard. Select from the list of voices available. This voice applies only until the next time you SHOW the character.
  2. TTSEngine command on the Wizard's "Other Character Commands" list. This command applies only until the next time you SHOW the character.
  3. SayWav command using wav files generated from Natural Voices. See documentation for the SayWav command and for the feature "Convert Speech to WAV Files" on the Script Writer tools menu. PLEASE NOTE that you must purchase the AT&T Natural Voices Engine from us in order to use our automatic wav-file generator. If purchased from another vendor, this feature will not be available.
  4. Character Properties. Specify a Natural Voices TTS engine from the selection list. This becomes the character's default voice.

The names of these voices come from AT&T, but they can be used with any Vox Proxy characters.


Limitations

The AT&T Natural Voices TTS engine has several limitations that you should read carefully and understand.

  1. You may not redistribute these TTS engines. That means you are not permitted to burn disks containing the Natural Voices TTS engine. You may, however, create audio (WAV) files using the TTS engines and embed those on the CD, using Vox Proxy's "SayWav" command. PLEASE NOTE that you must purchase the AT&T Natural Voices Engine from us in order to use our automatic wav-file generator. If purchased from another vendor, this feature will not be available.
  2. The files for Natural Voices are BIG (see system requirements above). Make sure you have space on your hard drive for them. 16KHz voices can be 3 to 4 times the size of 8KHz voices. Each 8KHz voice requires about 1/4 GB of disk space. Each 16KHz voice requires 3/4 to 1 GB.
  3. This speech engine is demanding of performance. Be sure you have at least 256MB of RAM. The engine does not work well on slides with videos, animated GIFs, or other CPU-consuming tasks. Be sure to set Windows for maximum performance (Control Panel/System/Advanced/Performance/Settings "Adjust for best performance").
  4. The first time you use the TTS engine in a given session, it can take as much as 30 seconds to load. This is unavoidable, but for a live presentation, you can preview a slide prior to starting the actual slide show so that the extra load time is completed before going live.
  5. Natural Voices does not support many of the "speech tags" used in Vox Proxy to modify the pronunciation of text. Specifically, the emphasis, pitch, whisper, and monotone tags are not supported. In addition, the engine does not generally modify its pronunciation according to closing punctuation. Speech tags that ARE supported include: address context, email context, pause, reset, and volume. The result is that you do not have as much flexibility in modifying the pronunciation of sentences as you do using Vox Proxy's default (Tru-Voice) speech engine.
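
As a quick reference for item 5, the tag support described there can be written as a small lookup. This sketch uses plain descriptive tag names rather than actual Vox Proxy tag syntax, which is not shown here; the function name is hypothetical, and any tag not mentioned above is reported as "unknown" rather than guessed.

```python
# Tag names as listed in limitation 5 (descriptive names, not script syntax).
SUPPORTED = frozenset({"address context", "email context", "pause", "reset", "volume"})
UNSUPPORTED = frozenset({"emphasis", "pitch", "whisper", "monotone"})


def tag_status(tag_name: str) -> str:
    """Classify a speech tag for Natural Voices based on the lists above."""
    name = tag_name.strip().lower()
    if name in SUPPORTED:
        return "supported"
    if name in UNSUPPORTED:
        return "unsupported"
    return "unknown"  # not mentioned in the limitation, so no claim is made


if __name__ == "__main__":
    for tag in ("Pause", "Whisper", "Speed"):
        print(tag, "->", tag_status(tag))
```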