Vox Occulta

The Software ITC Application Vox Occulta

Vox Occulta actually stands for a whole family of ITC applications. The basic principle is voice synthesis based on the emulation of a virtual human larynx and its resonance cavities. Vox Occulta I was still an electromechanical solution in which a chaos generator and mechanical components such as springs and membranes manipulated voice production. Vox Occulta II was already a pure software solution, and the later revisions were mixtures of both.

In Rev. 5, I have now written software that provides good results without any additional hardware. As usual with my applications, the software can be used directly online in the browser, but I also offer it for download. The software is my intellectual property, but I make it available free of charge.

Operating the Vox Occulta V Software

The software has a lot of controls that influence the sound, the speed of speech and the perception of the experimenter. The logical flow of settings starts at the top left. The following tables explain the functions of the controls, which are divided into function groups.

The first function group shows two pulse frequencies. These correspond to the vibration frequency of the human vocal cords and thus define the pitch of the voice. The software uses two pulse sources instead of one because this results in a richer sound. The two frequencies should differ by roughly 3 to 50 Hz.

Function Group Basic Frequency Settings
Pulse Frequency 1 Frequency setting of first impulse source
Pulse Frequency 2 Frequency setting of second impulse source
Volume The volume of the resulting signal
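
The idea of two slightly detuned pulse sources can be illustrated with a short Python sketch. It is not the program's actual code: the sample rate, the fixed 50 % pulse width, the equal-weight summing and all function names are assumptions made for illustration only.

```python
SAMPLE_RATE = 44100  # assumed sample rate in Hz


def pulse_train(freq_hz, duty, n_samples, sample_rate=SAMPLE_RATE):
    """Rectangular pulse train: 1.0 during the first `duty` fraction of each period."""
    out = []
    for n in range(n_samples):
        phase = (n * freq_hz / sample_rate) % 1.0  # position inside the period, 0..1
        out.append(1.0 if phase < duty else 0.0)
    return out


def two_source_signal(freq1_hz, freq2_hz, volume, n_samples):
    """Sum two pulse sources with equal weight and scale by the volume (0..1)."""
    a = pulse_train(freq1_hz, 0.5, n_samples)
    b = pulse_train(freq2_hz, 0.5, n_samples)
    return [volume * 0.5 * (x + y) for x, y in zip(a, b)]
```

For example, `two_source_signal(110, 120, 0.8, SAMPLE_RATE)` yields one second of signal whose two sources are 10 Hz apart, which lies inside the 3 to 50 Hz range recommended above.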

The second function group modulates both pulse sources in frequency via a random generator. The exact modulation process is complex and was worked out by me after long research.

Function Group Frequency Modulation Settings
FM Angle Preset Specifies the bias of the frequency modulation
FM-Range Specifies the modulation depth
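
Since the exact modulation process is not disclosed, the following Python sketch only shows one simple way a random generator can modulate a pulse frequency. Treating "FM Angle Preset" as a constant offset in Hz and "FM-Range" as the maximum random deviation in Hz is an assumption, as are the step-wise update of the frequency and all names used here.

```python
import random


def random_fm_frequencies(base_hz, fm_bias_hz, fm_range_hz, n_steps, seed=0):
    """One random target frequency per step: base + bias + range * r, with r in [-1, 1]."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [base_hz + fm_bias_hz + fm_range_hz * rng.uniform(-1.0, 1.0)
            for _ in range(n_steps)]


def fm_pulse_train(freqs_hz, samples_per_step, sample_rate=44100):
    """Pulse train whose frequency changes per step; the phase is accumulated
    so that frequency jumps do not cause phase discontinuities."""
    phase = 0.0
    out = []
    for f in freqs_hz:
        for _ in range(samples_per_step):
            out.append(1.0 if phase < 0.5 else 0.0)
            phase = (phase + f / sample_rate) % 1.0
    return out
```

With `random_fm_frequencies(110, 5, 10, 50)` every step frequency stays between 105 and 125 Hz, so the bias shifts the centre and the range sets the depth.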

The third function group additionally generates a pulse width modulation at the pulse frequencies. This does not have a massive influence on the final result, but rounds off the sound.

Function Group PWM Settings
PWM Duty Cycle This value is the operating point of the pulse width around which the modulation takes place
PWM Range The modulation depth
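
How an operating point and a modulation depth interact in pulse width modulation can be sketched as follows. The source of the modulation is not stated in the text, so the slow sine oscillator (`lfo_hz`) used here is purely an assumption, as are the function and parameter names.

```python
import math


def pwm_pulse_train(freq_hz, duty_center, pwm_range, lfo_hz, n_samples,
                    sample_rate=44100):
    """Pulse train whose pulse width swings around duty_center by +/- pwm_range."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        # Assumed modulator: a slow sine wave around the operating point.
        duty = duty_center + pwm_range * math.sin(2.0 * math.pi * lfo_hz * t)
        duty = min(max(duty, 0.0), 1.0)  # keep the duty cycle inside 0..1
        phase = (freq_hz * t) % 1.0
        out.append(1.0 if phase < duty else 0.0)
    return out
```

Setting `pwm_range` to 0 gives a fixed pulse width equal to the operating point; larger values let the width wander further from it.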

The fourth function group controls the rhythm of speech production.

Function Group Randomizer
Speech Speed This parameter controls the speed of the speech

The fifth function group contains various parameters for the spectral enhancement of the resulting speech.

Function Group Spectral Processing
Filter Bandwidth This software uses controllable band filters for speech shaping. The bandwidth of these filters can be changed here
Noise Envelope To achieve more spectral richness of sound, a noise envelope can be placed around the signal. The strength can be adjusted with this control.
Consonant Injection Consonants are always a problem in speech synthesis. This control adds further components to the signal that lead to the formation of consonants in the manifested voice
Pulse Harm. Mix This control can be used to adjust the ratio between the pure pulse signal and the spectral processing. The control has a very large influence on the sound image.
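
The roles of a filter bandwidth and of a mix control can be illustrated with a generic second-order band-pass filter and a linear crossfade. This is a sketch under assumptions: the text does not say which filter type, centre frequencies or mixing law the program uses, so the standard biquad band-pass with unity peak gain, the names `bandpass` and `mix`, and the direction of the mix control are all illustrative choices.

```python
import math


def bandpass(signal, center_hz, bandwidth_hz, sample_rate=44100):
    """Generic biquad band-pass with unity gain at center_hz (assumed filter type)."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    q = center_hz / bandwidth_hz          # narrower bandwidth means higher Q
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0      # b1 is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def mix(pulse, processed, amount):
    """Linear crossfade: amount 0.0 gives the pure pulse signal, 1.0 the processed one."""
    return [(1.0 - amount) * p + amount * s for p, s in zip(pulse, processed)]
```

A narrow bandwidth makes the filter ring at its centre frequency and gives a more vowel-like colour, while a wide bandwidth lets more of the raw pulse spectrum through.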

The sixth function group contains various parameters for configuring reverb.

Function Group Reverb Settings
Delay The delay of the resulting signal is set here
Decay The time the reverb signal is sustained is set here
Wet/Dry Mix The ratio of unaffected to reverb signal is set here
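
The three reverb controls can be related to each other with a minimal feedback delay line. The program's real reverb algorithm is not described, so this sketch assumes a single feedback comb filter in which "Decay" is the time the echoes need to fall by 60 dB; the function name and the sample rate are hypothetical as well.

```python
def comb_reverb(signal, delay_s, decay_s, wet, sample_rate=44100):
    """Single feedback delay line: delay sets the echo spacing, decay the
    60 dB fall time, wet the share of echo signal in the output (0..1)."""
    d = max(1, int(round(delay_s * sample_rate)))  # delay length in samples
    gain = 10.0 ** (-3.0 * delay_s / decay_s)      # per-echo gain for a 60 dB decay
    buf = [0.0] * d                                # circular delay buffer
    idx = 0
    out = []
    for x in signal:
        echo = gain * buf[idx]       # signal from d samples ago, attenuated
        buf[idx] = x + echo          # feed input plus echo back into the line
        idx = (idx + 1) % d
        out.append((1.0 - wet) * x + wet * echo)
    return out
```

With `wet` at 0 the output equals the unaffected signal, and with `wet` at 1 only the echoes remain.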

There is also a series of buttons that trigger various functions and control the program sequence.

Voice
Start Starts voice synthesis with configured parameters
Stop Stops the voice.

Settings
Save Setting All parameters are saved locally as a data set. No file name needs to be assigned.
Load Setting If a previously saved parameter data set is available, it is loaded by clicking on this button.

Recording
The program provides an option to record the generated voice signal as a WAVE file, so that no recording with the microphone is necessary.
Symbol "REC" Clicking on this symbol starts the recording process. The display in the panel changes to "RECORDING" and a tape counter shows the recording time. A countdown timer is also displayed, which automatically ends the recording after 600 s. This ensures that the recording file does not become too large.
Symbol "STOP" Stops the recording. The display message changes to "STOPPED" and the tape counter stops
Symbol "DISKETTE" Saves the recording under the standard filename "Vox-Occulta5.wav" in the download area of your browser
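
The effect of the 600 s limit on file size can be illustrated with Python's standard `wave` module. This is only a sketch: the program itself runs in the browser, and the mono 16-bit format, the 44100 Hz sample rate and the function name `save_recording` are assumptions. Under these assumptions, 600 s of audio amount to about 53 MB.

```python
import struct
import wave

MAX_RECORD_SECONDS = 600  # recording limit named in the text


def save_recording(samples, target, sample_rate=44100,
                   max_seconds=MAX_RECORD_SECONDS):
    """Write float samples (-1..1) as mono 16-bit WAVE, cut off at max_seconds.

    `target` is a file path or a writable binary file object.
    Returns the number of samples actually written.
    """
    limit = int(max_seconds * sample_rate)
    kept = samples[:limit]  # everything beyond the time limit is dropped
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in kept
    )
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return len(kept)
```

A call such as `save_recording(samples, "Vox-Occulta5.wav")` would write the file under the standard filename mentioned above.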

The app offers the option of playing the synthesized voice through the PC speakers or capturing it with the built-in recording function. In my opinion, the former method produces slightly better results. The optimal setting of all parameters is quite complex, and everyone hears voices differently. You therefore have to find the optimum parameters yourself. The program starts with the parameters that I myself consider to be optimal.

Important: The program has some peculiarities that I do not yet fully understand. Therefore, after starting the voice, you should adjust the delay control once. The voices then become louder and clearer!

Start the app here