This program demonstrates using Microsoft's SAPI 5.1 Speech SDK, along with
TeleTools, to create an application that converts Text To Speech for use in
Telephony applications. Useful applications of this technology include
IVRs, voicemail and situations where a wave message cannot be pre-recorded or
constructed reasonably by concatenating wave files using etPlay. In addition,
this sample program demonstrates using XML tags in the text to control how the
speech is sent, and responds to DTMF digits pressed on the telephone.
To make this program compile on your machine, be sure to have the etLine, etPlay
and etRecord components on your form and to click the "AboutLoadSerialnumber"
property to install your serial number into the components. Then click on the
enabled property to set it to true. You must compile this program with a
purchased serial number or it will operate with reminder screens.
You also need to have the Microsoft Speech SDK version 5.1 installed according
to the instructions on our website. We are using the SpVoice,
SpObjectToken and SpMMAudioOut objects from the SDK.
When the device selected in the ComboBoxDevice list is not active, you can
scroll through the "audio output" devices to find your sound card and test
speech without having to make a phone call. Choose a wave format from its
combobox and a voice you would like to use from the Voice combobox, and then
use the play, stop and pause controls at the bottom of the form to control the
spoken text.
Next, find your telephony device in the device list and check the "active"
checkbox. You should see the device selected and activated in the "call progress"
window. The Audio Output combobox will gray out, since we automatically select
the proper wave device that is associated with your telephony line device.
You must select the proper native wave file format for your device in the wave
format combobox. A message box and an entry in the log will notify you if the
device cannot handle that format. For most devices, 8kHz16BitMono will work;
however, some devices support only another format, or support more than one
format. Dialogic cards, for example, will only support 11kHz8BitMono, which has
a slightly better sampling rate. Hi-Phones support 8kHz16BitMono,
11kHz8BitMono, and 11kHz16BitMono.
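
The format names above encode the sampling rate, the sample size and the channel
count, which together determine how much PCM data the device must be fed each
second. The sketch below is illustrative only and is not part of the sample
program. It assumes that "8kHz" and "11kHz" mean the conventional rates of
8000 Hz and 11025 Hz, and the function names are hypothetical.

```python
import re

# Assumption: "8kHz" and "11kHz" name the conventional PCM rates 8000 Hz and 11025 Hz.
NOMINAL_RATES_HZ = {8: 8000, 11: 11025}


def parse_wave_format(name):
    """Split a name such as '8kHz16BitMono' into (rate_hz, bits, channels)."""
    match = re.fullmatch(r"(\d+)kHz(\d+)Bit(Mono|Stereo)", name)
    if match is None:
        raise ValueError(f"unrecognized wave format name: {name!r}")
    khz, bits, layout = match.groups()
    # Fall back to khz * 1000 for rates that are not in the table above.
    rate_hz = NOMINAL_RATES_HZ.get(int(khz), int(khz) * 1000)
    channels = 1 if layout == "Mono" else 2
    return rate_hz, int(bits), channels


def bytes_per_second(name):
    """Uncompressed PCM data rate for the named format."""
    rate_hz, bits, channels = parse_wave_format(name)
    return rate_hz * (bits // 8) * channels


for fmt in ("8kHz16BitMono", "11kHz8BitMono", "11kHz16BitMono"):
    print(fmt, bytes_per_second(fmt))
```

Under these assumptions, 11kHz8BitMono carries fewer bytes per second than
8kHz16BitMono even though its sampling rate is higher, because each sample is
one byte instead of two.
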
You may even be able to play other wave formats because of audio codecs
installed on your machine. While this can be a nice feature, you must be careful
because you cannot depend on that codec being installed on another person's
machine. In addition, it is much better to play in the native format of the
device, using its hardware and firmware to decode the wave file, instead of
depending on the computer's resources to handle something as processor-intensive
as wave streams. Relying on a software codec can lead to dropouts if the CPU
gets busy or hardware interrupts interfere with the codec process.
Once you have the wave format selected and the device activated, you can place
a number into the phone number window and click "Dial". All of the call progress
events and helpful info about TeleTools methods and properties being accessed
will appear in the call progress log window.
As soon as you answer the call (and, for devices like Dialogic cards and Hi-Phones
that give positive voice detection, say "hello"), you will see the "OnConnected"
event trigger, and the speech code in its event handler will immediately
start playing the text message typed into the speech text window. You can
use the control buttons at the bottom of the window to pause, stop, re-start
and play the message again. You can also press digits on your phone and watch
the log and listen to the phone call to see and hear DTMF digit events processed
and messages played telling you which digit you pressed.
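
As a rough illustration of the digit messages described above, the sketch below
builds the text that would be handed to the speech engine for one DTMF key. It
is not the sample's code: the function name, the wording of the message and the
spoken names for "*" and "#" are all hypothetical.

```python
# Hypothetical helper; the wording and the names below are illustrative only.
SPOKEN_NAMES = {"*": "star", "#": "pound"}
VALID_KEYS = "0123456789*#ABCD"


def digit_message(digit):
    """Return the text to speak for a single DTMF key."""
    key = digit.upper()
    if len(key) != 1 or key not in VALID_KEYS:
        raise ValueError(f"not a DTMF key: {digit!r}")
    # Digits and letters are spoken as-is; '*' and '#' get a spoken name.
    return f"You pressed {SPOKEN_NAMES.get(key, key)}."


print(digit_message("5"))
print(digit_message("#"))
```
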
You may place incoming calls to the line attached to your device. You will see
the "OnOffering" event fire, signifying an incoming call, get ring events and
CallerID events, and the Hangup and Answer buttons will become active.
You can now click the Answer button to answer the call and the caller will
immediately hear the greeting.
When you are finished with a call, hang up the phone or press the Hangup button.
If the remote caller hangs up, we automatically issue the hangup command on
your end.
We recommend that for learning, testing and debugging, you click on the
"TeleScope" button to see how powerful a tool it is for viewing almost
everything going on in your program. Please read the help file sections about
TeleScope; it may be the most valuable first advice we can provide to you.
You may also place the line, "etLine1.TeleScopeActive = True" anywhere in your
code where you want TeleScope to pop up. It's a great idea to include a secret
hotkey combination in your code to enable the TeleScope for your components so
that you or your clients can email informative logs.
Place outgoing calls and watch which events fire and in what order. Place
incoming calls and watch the OnRing and CallerID events fire. Then all you
have to do is place your code in the correct event handler. Not all devices
behave exactly the same. A modem is different from a voice modem, which is
different from a true TAPI-compliant device. Please see the article titled
"Working With Modems" on our web page and in the help file for examples of
things you may need to consider when programming your application.
Some devices can handle multiple calls, so good programming practice suggests
that you reference the call handle in your routines by using the line:
etLine1.CallHandle = CallHandle. If you need explicit control of multiple calls,
you would create a global variable to reference the correct call where necessary.
In addition, some devices, and even different versions of devices from the
same manufacturer, may have issues related to speech and/or wave files. For
example, some Dialogic cards must have a buffer setting changed to minimize
a stuttering effect that can happen when they can't process the wave data
properly.
NOTE: We have written this program to show you some of the things TeleTools can
do and to show many of our methods and properties. We have provided help on how
you can implement speech using Microsoft SAPI yourself. We cannot support
Microsoft products, but can provide consulting to help you with questions
regarding your telephony projects if you need it.
OTHER THINGS TO TRY
Use the rate and volume sliders to see how they affect the spoken text, and look
over the source code to see which features can help you in your application.
Type your own text in the text window and look over the XML tags shown in our
sample to see how you can alter volume, rate, voice, pitch and more to
have more control over how your application sounds to users.
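
To give a feel for what such marked-up text looks like, the sketch below wraps
plain text in the SAPI 5 XML tags for volume, rate and pitch. The tag and
attribute names (volume level, rate absspeed, pitch absmiddle) come from the
SAPI 5 XML TTS format; the helper name sapi_markup and its parameters are
hypothetical and are not taken from the sample's source. It only builds the
string, so it runs without the Speech SDK installed.

```python
def sapi_markup(text, volume=None, rate=None, pitch=None):
    """Wrap text in SAPI 5 <volume>, <rate> and <pitch> tags.

    Assumption: text is plain and contains no '<' or '>' characters.
    """
    marked = text
    if pitch is not None:
        if not -10 <= pitch <= 10:
            raise ValueError("pitch must be between -10 and 10")
        marked = f'<pitch absmiddle="{pitch}">{marked}</pitch>'
    if rate is not None:
        if not -10 <= rate <= 10:
            raise ValueError("rate must be between -10 and 10")
        marked = f'<rate absspeed="{rate}">{marked}</rate>'
    if volume is not None:
        if not 0 <= volume <= 100:
            raise ValueError("volume must be between 0 and 100")
        marked = f'<volume level="{volume}">{marked}</volume>'
    return marked


print(sapi_markup("Thank you for calling.", volume=80, rate=-2))
```

The resulting string can be typed or pasted into the speech text window in
place of plain text.
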
Click on the Line Config button to see whether your device supports calling up
its TSP configuration screen. Likewise, click on the TeleScope
button to see what is happening behind the scenes and to use other methods
and properties not written into this sample program. TeleScope allows you to
take control over your device.
WHAT TO DO NEXT
Read the Microsoft SAPI SDK help file to gain the most from Speech. You can
implement Speech Recognition in the same way you implement TTS. View our web
tutorial on Speech and follow the links to learn more about other speech
engines you can use in addition to the free ones included by Microsoft. Some
of the 3rd party speech engines, such as the one from AT&T, sound like a real
person.
Try our other sample EXE programs, available on our web site, and then download
the sample source code if you want to find out how easy it is to use TeleTools
to greatly simplify your telephony programming experience. We have other
Speech sample programs that can give you more help if you need it. Use our
"Premium Trial" to test our tools and gain the most from your evaluation. And
remember to read the help section on using TeleScope and use it to learn,
test, prototype and debug almost anything telephony related you are trying to do.
DISCLAIMER
ExceleTel only provides the SAPI samples as a guide to help our clients
start the learning process to incorporate speech into their own applications.
We cannot provide free support for Microsoft's SAPI, nor warrant that it will
work in any particular application or with any particular hardware.
NOTE ABOUT VOLUME (especially Dialogic users)
Some devices have issues with volume. In particular, Dialogic has a bug in
their wave driver which does not report the volume level properly; it is
always zero (0). Putting the following lines into your application makes sure
that a default volume is not saved and reset, which can wind up making your
audio so low you can't hear it. In addition, you may need to use the little
trick below to set the volume if it gets set too low:
etPlay1.VolumeEnabled = True
etPlay1.VolumeDefault = False
etPlay1.VolumeReset = False
etPlay1.VolumePosition = 75
Then, in order to actually set the volume, since you aren't using etPlay
to play a wave file but instead using SAPI to play text-to-speech, you
must create a very small wave file containing silence. We've included one
on this page for you to use. Then, on a connected call, use the following
code to send this wave file to your hardware to reset its default volume:
etPlay1.SourceFileName = "10msof11kHzSilence.wav"
etPlay1.DeviceActive = True
Silence10ms11k8bitMono - 10ms of silence at 11kHz, 8-bit, Mono (Dialogic)
Silence10ms8k16bitMono - 10ms of silence at 8kHz, 16-bit, Mono (voicemodems)
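
If you would rather generate such a silence file yourself, for example in a
format not listed above, the following sketch shows one way to do it with
Python's standard wave module. It is not how the supplied files were made. It
assumes that "11kHz" means 11025 Hz, and the function name write_silence is
hypothetical.

```python
import wave


def write_silence(path, rate_hz, bits, duration_ms=10):
    """Write a mono PCM wave file containing only silence; return the frame count."""
    sample_width = bits // 8
    frames = rate_hz * duration_ms // 1000
    # 8-bit PCM is unsigned, so silence is the midpoint 0x80.
    # 16-bit PCM is signed, so silence is zero.
    silent_sample = b"\x80" if sample_width == 1 else b"\x00" * sample_width
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(sample_width)
        out.setframerate(rate_hz)
        out.writeframes(silent_sample * frames)
    return frames
```

Under these assumptions, write_silence("Silence10ms11k8bitMono.wav", 11025, 8)
would produce a 10ms file for Dialogic cards, and
write_silence("Silence10ms8k16bitMono.wav", 8000, 16) one for voice modems. The
.wav extension on these names is also an assumption.
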