Vox Occulta

The Software ITC Application Vox Occulta

Vox Occulta actually stands for a whole family of ITC applications. The basic principle is voice synthesis based on the emulation of a virtual human larynx and its resonance cavities. Vox Occulta I was still an electromechanical solution, with a chaos generator and mechanical components such as springs and membranes to manipulate voice production. Vox Occulta II was already a pure software solution, and the further revisions were mixtures of both.

With Rev. 5, I had already written software that provides good results without any additional hardware. Rev. 6 is a variant of Rev. 5 that gives a more natural-sounding synthesized voice signal. However, since perception is always subjective, I decided to provide both variants on my website. I hope that everyone will find the flavour they can work with best. As usual with my applications, the software can be used directly online in the browser. However, I also offer it for download. The software is my intellectual property, but I make it available free of charge.

Operating the Vox Occulta VI Software

The software has a lot of controls that influence the sound, the speed of speech and the perception of the experimenter. The logical flow of settings starts at the top left. The following table explains the functions of the controls. These are divided into function groups.

The first function group shows three pulse frequencies. This is one of the main differences compared to Rev. 5, which used only two of them, and it makes the result sound more natural. These frequencies correspond to the vibration frequency of the human vocal cords and thus define the pitch of the voice. In my perception, the signal sounds good if the frequencies deviate from each other by only around 10 Hz. But you may find another setting more suitable for yourself.

Function Group: Basic Frequency Settings
Pulse Frequency 1..3: Frequency setting of the impulse source
Volume: The volume of the resulting signal
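As a rough illustration of this function group (not the actual Vox Occulta code), the following Python sketch mixes three pulse trains whose frequencies lie about 10 Hz apart. The sample rate, the use of single-sample unit impulses, the equal-weight mix and the example frequencies are all assumptions made for the sketch.

```python
SAMPLE_RATE = 44100  # assumed sample rate in Hz; the text does not state one


def pulse_train(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Return samples containing one unit impulse per period of freq_hz."""
    n = int(duration_s * sample_rate)
    samples = [0.0] * n
    phase = 0.0
    step = freq_hz / sample_rate  # fraction of a period per sample
    for i in range(n):
        phase += step
        if phase >= 1.0:  # one full period has elapsed: emit an impulse
            phase -= 1.0
            samples[i] = 1.0
    return samples


def mix_pulse_sources(freqs_hz, volume, duration_s, sample_rate=SAMPLE_RATE):
    """Mix several pulse trains with equal weight and scale by volume (0..1)."""
    trains = [pulse_train(f, duration_s, sample_rate) for f in freqs_hz]
    scale = volume / len(trains)
    return [scale * sum(values) for values in zip(*trains)]


# Hypothetical setting: three sources spaced 10 Hz apart
signal = mix_pulse_sources([110.0, 120.0, 130.0], volume=0.8, duration_s=1.0)
```

Because the three frequencies differ slightly, the impulses drift against each other over time, which is one plausible reason why closely spaced sources sound less static than a single one.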

The second function group modulates both pulse sources in frequency via a random generator. The exact modulation process is complex; I worked it out after long research.

Function Group: Frequency Modulation Settings
FM-Range: Specifies the modulation depth
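The actual modulation process is not documented here, so the following Python sketch only shows the general idea of random frequency modulation: after each impulse, the frequency for the next period is drawn at random within plus or minus the FM range around the base frequency. The function name, the uniform distribution and the per-period update are assumptions, not the method used in Vox Occulta.

```python
import random


def jittered_pulse_train(base_hz, fm_range_hz, duration_s,
                         sample_rate=44100, seed=None):
    """Pulse train whose period-to-period frequency is randomly jittered.

    fm_range_hz plays the role of the modulation depth: each period uses a
    frequency drawn uniformly from base_hz +/- fm_range_hz (an assumption).
    """
    if not 0.0 <= fm_range_hz < base_hz:
        raise ValueError("fm_range_hz must be >= 0 and smaller than base_hz")
    rng = random.Random(seed)
    n = int(duration_s * sample_rate)
    samples = [0.0] * n
    phase = 0.0
    freq = base_hz
    for i in range(n):
        phase += freq / sample_rate
        if phase >= 1.0:
            phase -= 1.0
            samples[i] = 1.0
            # Pick a new random frequency for the next period
            freq = base_hz + rng.uniform(-fm_range_hz, fm_range_hz)
    return samples


# Hypothetical setting: 120 Hz base frequency, modulation depth of 20 Hz
voice_source = jittered_pulse_train(120.0, 20.0, duration_s=1.0, seed=42)
```

With a modulation depth of 0 the function produces a strictly periodic pulse train; larger values make the pitch wander more strongly.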

The third function group controls the rhythm of speech production.

Function Group: Randomizer
Speech Speed: Controls the speed of the speech

The fourth function group contains various parameters for the spectral enhancement of the resulting speech. A key feature is a bank of band filters working in parallel to shape the signal from the impulse sources.

Function Group: Spectral Processing
Filter Bandwidth: This software uses controllable band filters for speech shaping. The bandwidth of these filters can be changed here.
Amplitude BF1...BF4: The output strength of each band filter can be adjusted here. These settings are crucial for the resulting sound of the voice.
Consonant Injection: Consonants are always a problem in speech synthesis. This control adds more components to the signal, which lead to the formation of consonants in the resulting voice manifestation.
Pulse Harm. Mix: This control adjusts the ratio between the pure pulse signal and the spectrally processed signal. It has a very large influence on the sound.
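The idea of a parallel filter bank can be sketched in Python as follows. This is not the app's code: it assumes second-order band-pass filters in the style of the well-known Audio EQ Cookbook, hypothetical center frequencies, and a simple crossfade for the Pulse Harm. Mix control. The relation Q = center / bandwidth is the usual approximation.

```python
import math


def bandpass_coeffs(center_hz, bandwidth_hz, sample_rate):
    """Normalized biquad band-pass coefficients (unity gain at center_hz)."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    q = center_hz / bandwidth_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def biquad(samples, b, a):
    """Run samples through one biquad section (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def filter_bank(samples, centers_hz, amplitudes, bandwidth_hz, harm_mix,
                sample_rate=44100):
    """Sum parallel band-pass outputs, then crossfade with the raw pulses.

    harm_mix = 0 returns only the pure pulse signal, harm_mix = 1 only the
    filtered signal (an assumed reading of the Pulse Harm. Mix control).
    """
    shaped = [0.0] * len(samples)
    for center, amp in zip(centers_hz, amplitudes):
        b, a = bandpass_coeffs(center, bandwidth_hz, sample_rate)
        band = biquad(samples, b, a)
        shaped = [s + amp * v for s, v in zip(shaped, band)]
    return [(1.0 - harm_mix) * x + harm_mix * s
            for x, s in zip(samples, shaped)]


# Hypothetical usage: four bands (BF1...BF4) applied to a 100 Hz impulse train
pulses = [1.0 if i % 441 == 0 else 0.0 for i in range(4410)]
voice = filter_bank(pulses, [500.0, 1500.0, 2500.0, 3500.0],
                    [1.0, 0.8, 0.6, 0.4], bandwidth_hz=200.0, harm_mix=0.7)
```

Each band passes only the harmonics of the pulse signal near its center frequency, so the four amplitude settings determine which regions of the spectrum dominate the resulting sound.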

The fifth function group contains various parameters for configuring reverb.

Function Group: Reverb Settings
Delay: Sets the delay of the resulting signal
Decay: Sets how long the reverb signal is sustained
Wet/Dry Mix: Sets the ratio of the unaffected (dry) signal to the reverb (wet) signal
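How these three controls could interact is sketched below in Python using a single feedback echo (a comb filter). This is an assumption for illustration only; the reverb actually used by the app is not described here. The feedback gain is chosen so that the echoes fall by 60 dB within the decay time, which is the usual definition of reverberation time.

```python
def comb_reverb(samples, delay_s, decay_s, wet_mix, sample_rate=44100):
    """Simple feedback-echo reverb (illustrative sketch only).

    delay_s : time between successive echoes
    decay_s : time after which the echoes have dropped by 60 dB
    wet_mix : 0.0 = only the unaffected signal, 1.0 = only the reverb signal
    """
    d = max(1, int(round(delay_s * sample_rate)))  # delay in samples
    g = 10.0 ** (-3.0 * delay_s / decay_s)         # gain per echo
    wet = [0.0] * len(samples)
    for n in range(d, len(samples)):
        # Each wet sample is the delayed input plus the attenuated earlier echo
        wet[n] = samples[n - d] + g * wet[n - d]
    return [(1.0 - wet_mix) * x + wet_mix * w for x, w in zip(samples, wet)]


# Hypothetical usage: a single click with 100 ms delay and 300 ms decay
click = [1.0] + [0.0] * 999
echoed = comb_reverb(click, delay_s=0.1, decay_s=0.3, wet_mix=0.5,
                     sample_rate=1000)
```

In this example each echo is ten times weaker than the previous one, because three echo intervals fit into the decay time and the total attenuation over that time is a factor of 1000 (60 dB).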

There is also a series of buttons that trigger various functions and control the program sequence.

Voice
Start: Starts voice synthesis with the configured parameters.
Stop: Stops the voice.

Settings
Save Setting: All parameters are saved locally as a data set. No file name needs to be assigned.
Load Setting: If a previously saved parameter data set is available, it is loaded by clicking on this button.

Recording
The program provides an option to record the generated voice signal as a WAVE file, so that no recording with the microphone is necessary.
Symbol "REC": Clicking on this symbol starts the recording process. The display in the panel changes to "RECORDING" and a tape counter shows the recording time. A countdown timer is also displayed, which automatically ends the recording after 600 s. This ensures that the recording file does not become too large.
Symbol "STOP": Stops the recording. The display message changes to "STOPPED" and the tape counter stops.
Symbol "FLOPPY DISK": Saves the recording under the standard filename "Vox-Occulta6.wav" in the download area of your browser.
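To give a feeling for why the recording is limited to 600 s, the following Python sketch writes samples to a WAVE file with the standard library and cuts off everything beyond the limit. The sample rate of 44.1 kHz, the 16-bit mono format and the function names are assumptions; the app's actual recording format is not stated here.

```python
import struct
import wave

MAX_RECORDING_S = 600  # recording limit mentioned in the text


def max_audio_bytes(sample_rate=44100, sample_width=2, channels=1):
    """Size of the raw audio data for a recording of maximum length."""
    return MAX_RECORDING_S * sample_rate * sample_width * channels


def write_wav(samples, target, sample_rate=44100):
    """Write float samples (-1..1) as 16-bit mono WAVE, capped at 600 s.

    target may be a file name or a binary file-like object.
    Returns the number of frames actually written.
    """
    frames = samples[:MAX_RECORDING_S * sample_rate]
    pcm = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in frames
    )
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return len(frames)


print(max_audio_bytes())  # 52920000 bytes, roughly 53 MB under these assumptions
```

Under these assumptions a full 600 s recording already takes about 53 MB, so an unlimited recording could quickly become unwieldy to download and edit.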

The app offers the option of playing the synthesized voice via the PC speakers or capturing it via the built-in recording function. In my opinion, the former method produces slightly better results. The optimal setting of all parameters is quite complex, and everyone hears voices differently. You therefore have to find the optimum parameters yourself. The program starts with the parameters that I myself consider to be optimal.

The resulting voice manifestations have to be post-processed, although not much. I use the audio editor Audacity with a process chain of Paulstretch (stretch factor 1.2 and 0.01 s time resolution) followed by the Equalizer to enhance the higher frequencies. Then I play back the signal in small loops of 1 s to 3 s length and identify the EVPs.

Important: The program has some peculiarities that I do not yet fully understand. Therefore, after starting the voice, you should adjust the delay control once. The voices then become louder and clearer!

Start the app here