Introduction

This document describes the resource management questions you must consider when designing speech applications. As you will see below, the speech channels in the Macintosh implementation can consume as much as several megabytes of memory and use up to half a dozen threads each. Clearly, you must think about resource management when dealing with these objects, whether you program to the Java Speech API or to the JSpeechLib API.

Multiple Channels

The speech synthesiser used is the Macintosh Speech Manager (also known as MacinTalk). As with most synthesisers that have JSAPI interfaces, this is a native implementation; not all of the resources it allocates are under the direct control of Java's garbage-collected memory management model. Accordingly, the first piece of advice must be:

  • Always clean up after yourself. Understand how to call Synthesizer.deallocate() [JSAPI] or SpeechChannel.dispose() [JSpeechLib] (see the sketch below).
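As a minimal sketch of the JSAPI cleanup pattern (these are the standard JSAPI 1.0 calls; the JSpeechLib equivalent wraps dispose() in a finally block in just the same way):

    import javax.speech.Central;
    import javax.speech.synthesis.Synthesizer;
    import javax.speech.synthesis.SynthesizerModeDesc;

    public class CleanupExample {
        public static void main(String[] args) throws Exception {
            Synthesizer synth =
                    Central.createSynthesizer(new SynthesizerModeDesc());
            try {
                synth.allocate();
                synth.resume();
                synth.speakPlainText("Hello from JSAPI.", null);
                synth.waitEngineState(Synthesizer.QUEUE_EMPTY);
            } finally {
                synth.deallocate();   // always release the native resources
            }
        }
    }

The try/finally arrangement guarantees that deallocate() runs even if speaking fails partway through.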

The next point is very similar: avoid allocating multiple SpeechChannel [JSpeechLib] or Synthesizer [JSAPI] objects at the same time. Your pattern should be to clean up (dispose/deallocate) one channel or synthesizer before creating another. Most of the time this is relatively simple to do: since the user can usually only listen to one voice at a time, most applications can survive with one or two "main" voices plus a second or third voice that chimes in from time to time. If you must use more voices than this, you will either need to set higher system requirements, or follow the pattern suggested above, where one synthesizer or channel is deallocated or disposed of before another is created.

JSpeechLib programmers should be aware of one feature of SpeechChannel.setVoice() that may have an impact on how many channels are allocated. In short, calling setVoice() may cause two channels to be allocated at the same time, for a short period while the voice is changed over. You should factor this into your resource calculations.

To understand why setVoice() works this way, you need to appreciate that the modern Macintosh Speech Manager amalgamates several synthesisers: MacinTalk 3, MacinTalk Pro, and, on systems that have them installed, the Spanish and two Chinese synthesisers. Be careful not to confuse these synthesisers with JSAPI Synthesizer objects; these synthesisers are a Macintosh concept and exist below the level of the code that application programmers usually deal with. However, any given Macintosh Speech Manager channel (and by extension JSpeechLib SpeechChannel) might currently be using a voice from MacinTalk Pro or MacinTalk 3 (or from the Spanish or Chinese synthesisers). When JSpeechLib's SpeechChannel.setVoice() method is called, it first attempts to change the voice on the channel directly, which works only if both voices are from the same Macintosh synthesiser. For example, a change from Victoria to Bruce would work, as these are both MacinTalk Pro voices, but a change from Fred to Bruce would not, as Fred is a MacinTalk 3 voice. In this circumstance JSpeechLib will still make the change for you, but to do so it must use a more complex approach: it creates a new native speech channel and switches it for the existing one behind the scenes. The details of the implementation require that both the old and the new channel be allocated at the same time, temporarily.

Therefore your application has three strategies open to it:

  • Allow enough resources to make sure that setVoice() calls can succeed even if two channels must temporarily exist.
  • Make sure setVoice() is only ever used to change the voice between two voices which are both from the same Macintosh Synthesiser.
  • Do not use setVoice(); instead call dispose() on the channel you no longer need, and create a new one for the voice you want to switch to (sketched after this list).
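As a sketch of the third strategy (the Voice type and the SpeechChannel(Voice) constructor shown here are assumptions made for illustration; check the JSpeechLib documentation for the exact way to create a channel for a given voice):

    // Hypothetical voice-switch helper. The SpeechChannel(Voice)
    // constructor is an assumption; consult the JSpeechLib
    // documentation for the real creation call.
    SpeechChannel switchVoice(SpeechChannel oldChannel, Voice newVoice)
            throws Exception {
        oldChannel.dispose();               // free the native channel first...
        return new SpeechChannel(newVoice); // ...so only one exists at a time
    }

Because the old channel is disposed of before the new one is created, only one native channel ever exists at a time, regardless of which Macintosh synthesiser each voice belongs to.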

If this sounds horribly complex, take a moment to look at a simple JSpeechLib example. It shows how to speak a phrase with the speakString method using only one line of code; you do not even have to create a SpeechChannel object. This method has some limitations: speakString can only speak 255 characters, and it uses the voice the user selects in the Speech Control Panel. Nevertheless, it is a very easy way to add simple speech capabilities to your application. The second, slightly more complex example creates a SpeechChannel and uses speakText; this method has neither of the two limitations that speakString displays.
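The two examples might look roughly like this (a sketch only: the static speakString call and the no-argument SpeechChannel constructor are assumptions about JSpeechLib's exact signatures, and the import for JSpeechLib's SpeechChannel class is omitted because the package name is not shown here):

    public class HelloSpeech {
        public static void main(String[] args) throws Exception {
            // One-liner: speaks with the voice chosen in the Speech
            // Control Panel; limited to 255 characters. (speakString is
            // assumed to be a static method, as the one-line usage
            // suggests.)
            SpeechChannel.speakString("Hello from JSpeechLib!");

            // Slightly more complex: create a channel and use speakText,
            // which has neither of speakString's two limitations.
            // (The no-argument constructor is assumed for illustration.)
            SpeechChannel channel = new SpeechChannel();
            try {
                channel.speakText("This longer utterance may exceed "
                        + "255 characters and may use any voice.");
            } finally {
                channel.dispose();   // free the native channel when done
            }
        }
    }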

Memory Issues

This section discusses various technical issues relating to allocating memory when speaking.

Memory Management in Java

Although the JSpeechLib library attempts to simplify managing the memory used by speech channels, it is still a very good idea to call dispose() on a channel as soon as you are done with it. JSAPI Synthesizers are similar (in fact each uses one or more JSpeechLib SpeechChannels); anything said of SpeechChannel and dispose() applies equally to Synthesizer objects and deallocate(). The SpeechChannel class has a finalizer that calls dispose() for you, so the resources will eventually be freed by the garbage collector, but this is likely to happen much later than you might like.

Each native SpeechChannel can take up to several megabytes of memory if you use the high-quality MacinTalk Pro voices, so you will be doing yourself and your users a favour if you deallocate speech channels when you are finished with them, rather than letting the Java garbage collector clean up after you.

Native Mac OS Memory Issues

The main speech output routines, speakText and speakBuffer, allocate memory on the native heap of the Mac OS application hosting the Java Virtual Machine (typically the application that you create with JBindery or MRJAppBuilder). This should not be confused with the Java Virtual Machine's heap, where ordinary Java objects and data are stored; that heap is managed by Java's garbage collector and on Mac OS 9 is allocated out of the separate temporary memory heap. The size of the native heap is controlled from the application's 'Get Info' dialog box settings, so you need to use the settings in JBindery or MRJAppBuilder to set your application's native memory partition when you first create it. If you want to allocate a huge string for speaking, you will need to make sure that there is enough memory available in the 'Get Info' partition. (Of course, you should not force the user to listen to huge chunks of uninterrupted speech.)
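If you do need to speak very long text, one approach is to queue it in smaller pieces instead of passing one huge string to the native layer in a single call. The sketch below uses JSAPI's output queue to do this; that each queued item only needs native buffers while it is actually being spoken is an assumption about the implementation, so treat this as a starting point rather than a guarantee:

    import java.util.StringTokenizer;
    import javax.speech.synthesis.Synthesizer;

    public class ChunkedSpeaker {
        // Sketch: queue a long document in sentence-sized pieces rather
        // than handing one huge string to the native layer at once.
        public static void speakInChunks(Synthesizer synth, String longText)
                throws Exception {
            StringTokenizer sentences = new StringTokenizer(longText, ".!?");
            while (sentences.hasMoreTokens()) {
                // Each queued item is spoken in turn; ideally only the
                // piece currently being spoken needs native buffers.
                synth.speakPlainText(sentences.nextToken(), null);
            }
            synth.waitEngineState(Synthesizer.QUEUE_EMPTY);
        }
    }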

You should definitely take note that the memory requirement for speech channels discussed above also comes out of the native heap. If you open multiple speech channels simultaneously you will soon run up against the limits of available memory in the Mac OS application partition set in 'Get Info'. Fortunately, dispose()ing of a channel frees up the memory, and you seldom need more than two channels to exist simultaneously: typically you can work with one main speaking voice, and another voice that chimes in with status messages and the like. For example, I have an application that uses one MacinTalk Pro voice and one MacinTalk 3 voice, and it runs quite happily with a native partition of 3 MB (this is over-generous; 2 to 2.5 MB would probably be enough). If you run out of native heap memory you will see a series of nasty bus and address errors (types 1 and 2) that will crash your application. If this happens, use JBindery or MRJAppBuilder to make the application's native partition larger. [Mac OS 9]

Thread Issues

Many events associated with speech output (e.g. notification of the end of a word) are asynchronous: the notification can arrive at any time as the text is spoken aloud to the user. To handle this, JSpeechLib (and hence JSAPI) must create multiple threads to manage events. Any JSpeechLib speech channel has at least one thread associated with it, and JSAPI Synthesizers have one of their own as well, so each JSAPI synthesizer introduces a minimum of two threads into your application. Each time you register to receive another kind of event, you cause a new thread to be created to deliver those events. These threads spend almost all of their time blocked and only wake for short periods to deliver events to your code.

As with other listener-pattern or callback-based APIs, it is important that your event handlers be quick and efficient, to avoid blocking any of the JSpeechLib library's event-delivery threads.
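For example, a JSAPI handler might hand any real work off to another thread so that the event-delivery thread is blocked only briefly (a sketch using the standard JSAPI 1.0 SpeakableAdapter; the Swing hand-off is just one way to get off the delivery thread):

    import javax.speech.synthesis.SpeakableAdapter;
    import javax.speech.synthesis.SpeakableEvent;
    import javax.swing.SwingUtilities;

    // Sketch: do the bare minimum on the delivery thread, then hand
    // the real work (e.g. highlighting the spoken word) elsewhere.
    public class WordListener extends SpeakableAdapter {
        public void wordStarted(final SpeakableEvent e) {
            // This method runs on an event-delivery thread; keep it short.
            SwingUtilities.invokeLater(new Runnable() {
                public void run() {
                    // Inspect e and update the user interface here.
                }
            });
        }
    }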

Copyright and License

The Java™ Speech API was developed by Sun Microsystems.

The original JSAPI for Macintosh implementation is Copyright ©2000 Brendan Burns, except changes made to produce version 0.5 which are Copyright ©2000 Andrew Thompson.

JSpeechLib and associated documentation is Copyright ©1997-2000 Andrew Thompson.

Your rights with respect to this library, both binary and source code, are defined by the GNU Lesser General Public License. A copy has been included with this distribution; you should read the license carefully, as it governs your rights with respect to this library.