VoiceSauce - A program for voice analysis

VoiceSauce is an application, implemented in Matlab, which provides automated voice measurements over time from audio recordings. Inputs are standard wave (*.wav) files and the measures currently computed are:

where (*) indicates that the harmonic/spectral amplitudes are reported with and without corrections for formant frequencies and bandwidths. More parameters will be added soon.

Requirements:

VoiceSauce requires Matlab 2007a or later. It has been run successfully under Windows (XP, Vista, 7) and Mac (Leopard). Other operating systems may also work but have not been tested. If you are attempting to run VoiceSauce on a system other than Windows or Mac, you may need to install Tcl/Tk first; this can be obtained from ActiveState's website.

Limitations:

Since many of the parameters estimated by VoiceSauce depend on F0, results are meaningful only for voiced speech. Noisy speech may reduce the accuracy of the F0 estimates and hence of the voice measurements.

The correction formula for the effects of the formant frequencies works best when there are accurate estimates of the first formant, i.e. when F0 and F1 do not come too close to each other. For example, high vowels produced by a high-pitched voice may return inaccurate results. It is recommended to inspect the formant frequency estimates to verify their validity.
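As a rough aid for that inspection, the following Python sketch flags frames in which F1 lies close to F0. It is not part of VoiceSauce. The function name flag_close_f0_f1 and the min_ratio threshold are hypothetical choices made for illustration, and the sketch assumes the per-frame F0 and F1 values (in Hz) have already been read from a VoiceSauce output file into two lists of equal length.

```python
def flag_close_f0_f1(f0, f1, min_ratio=1.5):
    """Return the indices of frames where F1 is less than min_ratio times F0.

    min_ratio is a hypothetical threshold, not a value used by VoiceSauce;
    tune it for your own data.
    """
    flagged = []
    for index, (pitch, formant) in enumerate(zip(f0, f1)):
        # Skip frames with no usable estimate (assumed to be stored as 0 or less).
        if pitch <= 0 or formant <= 0:
            continue
        if formant < min_ratio * pitch:
            flagged.append(index)
    return flagged


if __name__ == "__main__":
    f0_track = [100.0, 300.0, 0.0, 250.0]
    f1_track = [500.0, 350.0, 400.0, 400.0]
    print(flag_close_f0_f1(f0_track, f1_track))
```

Flagged frames are only candidates for a closer look; whether the corrected measures are usable there is still a judgement to make from the formant tracks themselves.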

It has been reported that wav files stored in folders whose names contain non-English characters may cause the formant estimator to fail. Similarly, textgrid files from Praat encoded as "UCS-2 Big Endian" cannot be read by Matlab and will cause it to crash. Such textgrid files need to be re-saved as ANSI or UTF-8 before they can be used with VoiceSauce; this can be done in e.g. Notepad (Open -> Save As, then select ANSI under Encoding).
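For a large number of files, the re-saving can also be scripted. The Python sketch below is one possible approach and is not part of VoiceSauce; the function names has_non_ascii and reencode_textgrid are made up for this example. It assumes that a problematic textgrid file starts with a UTF-16 byte-order mark and that any file without one is already UTF-8 or plain ASCII.

```python
from pathlib import Path


def has_non_ascii(path):
    """Return True if the path contains any character outside plain ASCII."""
    return not str(path).isascii()


def reencode_textgrid(src, dst):
    """Write a UTF-8 copy of the textgrid file src to dst and return its text."""
    raw = Path(src).read_bytes()
    if raw.startswith(b"\xfe\xff") or raw.startswith(b"\xff\xfe"):
        # UTF-16 byte-order mark found: the "utf-16" codec uses it to pick
        # the byte order and removes it from the decoded text.
        text = raw.decode("utf-16")
    else:
        # Assumption: no UTF-16 mark means the file is UTF-8 or ASCII.
        # "utf-8-sig" also drops a UTF-8 byte-order mark if one is present.
        text = raw.decode("utf-8-sig")
    # Write bytes directly so that the original line endings are kept.
    Path(dst).write_bytes(text.encode("utf-8"))
    return text


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(has_non_ascii("C:/recordings/speaker1/a.wav"))
    print(has_non_ascii("C:/données/speaker1/a.wav"))
```

Writing to a separate destination file keeps the original untouched, which is useful if the result needs to be checked in Praat before the files are passed to VoiceSauce.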

Computer memory can be an issue. Very long files for which all parameters are to be estimated may cause VoiceSauce to hang or to give an Insufficient Memory message. Computing fewer parameters at once, or dividing the files into smaller files, should help.
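Dividing a long recording can be done in any audio editor. As a minimal sketch of a scripted alternative (not part of VoiceSauce), the Python function below cuts a wav file into consecutive fixed-length pieces using the standard wave module. The function name split_wav and the output naming scheme are assumptions made for this example, and the wave module only handles uncompressed PCM files.

```python
import wave


def split_wav(src, chunk_seconds, prefix):
    """Cut src into pieces of chunk_seconds each and return the new file paths.

    Pieces are written as prefix_000.wav, prefix_001.wav, and so on; the last
    piece holds whatever audio remains and may be shorter.
    """
    paths = []
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = int(chunk_seconds * reader.getframerate())
        if frames_per_chunk <= 0:
            raise ValueError("chunk_seconds is too short for this sample rate")
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # end of the input file
            path = f"{prefix}_{index:03d}.wav"
            with wave.open(path, "wb") as writer:
                # Copy channels, sample width and sample rate from the source;
                # the frame count in the header is corrected on close.
                writer.setparams(params)
                writer.writeframes(frames)
            paths.append(path)
            index += 1
    return paths
```

For example, split_wav("long.wav", 60, "long_part") would write one-minute pieces next to the script. Fixed-length cuts can fall in the middle of a voiced segment, and any textgrid made for the original file will no longer line up with the pieces, so cutting at pauses by hand may be preferable.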


Download:

Distribution is currently in two forms: (1) m-code for systems with Matlab, and (2) compiled executables for systems without Matlab. Note that the compiled executables require the Matlab Component Runtime, which only needs to be installed once.

Compiled executables are currently available only for Windows systems. We welcome assistance from anyone who would like to provide a legal compiled executable for Macs.

Version changelog is available here. Note: the 2K measure in versions prior to 1.16 was calculated incorrectly; please download the latest version.

Matlab m-code
(v1.18 - Jul 24, 2014)

VoiceSauce.zip (10.2MB)

Instructions: Unzip and run VoiceSauce.m from Matlab.

Note: Requires Matlab 2007a or later.

Compiled Matlab executables - Windows XP/Vista/7
(v1.18 - Jul 24, 2014)

Matlab Component Runtime - MCRInstaller.exe (179MB)
VoiceSauce_bin.zip (15.8MB)

Instructions: Run MCRInstaller.exe (only needs to be done once). Unzip VoiceSauce_bin.zip and run VoiceSauce.exe.

Note: The first time VoiceSauce.exe is run, it may take a few minutes to load.


Documentation:

Documentation (thanks to Chad Vicenik and Spencer Lin) is available here. Note that this may, and likely will, evolve over time.

Note about running VoiceSauce: 

VoiceSauce's Matlab console displays various run-time messages about how it is handling individual input files. These are not necessarily error messages! Unless VoiceSauce actually crashes while running, you should be able to find .mat output files in the folder you specified, and you should be able to produce a .txt output file from these. Most notably, "Multicue failed: switching to exstraightsource" is not an error message and does not mean that VoiceSauce has crashed. See the documentation for more information about this message.

Companion software:

EggWorks: A free program by Henry Tehrani, created for the NSF Voice project to analyze EGG signals (closing quotients, peak increase in contact) in batch mode; also includes utilities for splitting .pmf files into separate .wav files, for inverting .wav files, and for converting .wav files from 32- to 16-bit.
EggWorks can be found here (download link is at the bottom of the page).

 

Acknowledgements:

This work was supported in part by the NSF.


Questions, bug reports, and comments to yshue@ucla.edu.