Software

Lattice Spoken Term Detection toolkit (LatticeSTD)

A toolkit for experiments with lattice-based spoken term detection. It lets you define a set of terms and search for them in lattices.

Specifically, it:
* Searches for a defined sequence of links in a lattice and outputs the label assigned to that sequence.
* Calculates the confidence (posterior probability) of each sequence found.
* Filters overlapping detections.
* Handles substitutions, deletions, and insertions (useful for phone lattices).
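
The confidence computation listed above can be illustrated with a minimal Python sketch. It is not LatticeSTD's code: the link tuple layout, the function names, and the toy lattice are all assumptions made for illustration. It assumes an acyclic lattice whose nodes are numbered in topological order and computes the posterior of a link sequence with the standard forward-backward recursion in the log domain.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_backward(links, start, end):
    """Return log forward (alpha) and log backward (beta) scores per node.

    links: list of (src, dst, label, log_likelihood) tuples (hypothetical layout).
    Nodes are assumed to be integers numbered in topological order.
    """
    alpha = defaultdict(lambda: -math.inf)
    beta = defaultdict(lambda: -math.inf)
    alpha[start] = 0.0
    beta[end] = 0.0
    # Forward pass: links sorted by source node, so alpha[src] is already final.
    for src, dst, _label, ll in sorted(links, key=lambda l: l[0]):
        alpha[dst] = logsumexp([alpha[dst], alpha[src] + ll])
    # Backward pass: links sorted by destination node, highest first.
    for src, dst, _label, ll in sorted(links, key=lambda l: l[1], reverse=True):
        beta[src] = logsumexp([beta[src], ll + beta[dst]])
    return alpha, beta


def sequence_posterior(links, path, start, end):
    """Posterior probability of the consecutive links whose indices are in path."""
    alpha, beta = forward_backward(links, start, end)
    inside = sum(links[i][3] for i in path)
    first_src = links[path[0]][0]
    last_dst = links[path[-1]][1]
    return math.exp(alpha[first_src] + inside + beta[last_dst] - alpha[end])


# Toy lattice (made-up likelihoods): two competing links per segment.
LINKS = [
    (0, 1, "a", math.log(0.6)),
    (0, 1, "b", math.log(0.2)),
    (1, 2, "c", math.log(0.5)),
    (1, 2, "d", math.log(0.5)),
    (2, 3, "e", math.log(1.0)),
]

posterior = sequence_posterior(LINKS, [0, 2], start=0, end=3)
print(round(posterior, 3))
```

The sequence score is the forward score at the first link's source node, plus the link likelihoods along the sequence, plus the backward score at the last link's destination node, normalized by the total lattice score. Any scaling of acoustic or language model scores is left out of this sketch.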

KWSViewer - Interactive viewer for Keyword spotting output

Abstract

This tool loads the output of a keyword-spotting (KWS) system and a reference file in HTK-MLF format and shows the detections in a tabular view. You can also use it to replay detections and to tune and visualize scores, hits, misses, and false alarms using the sliders on the right-side panel.
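
To make the relation between a score threshold and the hit, miss, and false-alarm counts concrete, here is a small Python sketch. It does not reproduce KWSViewer's scoring: the tuple layouts, the function name, and the matching rule (a detection counts as a hit when its midpoint falls inside an unmatched reference occurrence of the same term) are assumptions chosen for illustration.

```python
def score_at_threshold(detections, references, threshold):
    """Count hits, misses and false alarms at a given score threshold.

    detections: list of (term, start, end, score) tuples (hypothetical layout).
    references: list of (term, start, end) tuples (hypothetical layout).
    """
    kept = [d for d in detections if d[3] >= threshold]
    matched = set()  # indices of references already claimed by a detection
    hits = 0
    false_alarms = 0
    # Let the highest-scoring detections claim references first.
    for term, start, end, _score in sorted(kept, key=lambda d: d[3], reverse=True):
        midpoint = 0.5 * (start + end)
        match = next(
            (i for i, (r_term, r_start, r_end) in enumerate(references)
             if i not in matched and r_term == term and r_start <= midpoint <= r_end),
            None,
        )
        if match is None:
            false_alarms += 1
        else:
            matched.add(match)
            hits += 1
    misses = len(references) - hits
    return hits, misses, false_alarms


# Made-up example data.
REFERENCES = [("hello", 1.0, 1.5), ("world", 3.0, 3.6)]
DETECTIONS = [
    ("hello", 1.05, 1.45, 0.9),
    ("hello", 5.00, 5.40, 0.4),
    ("world", 3.10, 3.50, 0.3),
]

for threshold in (0.5, 0.35, 0.2):
    print(threshold, score_at_threshold(DETECTIONS, REFERENCES, threshold))
```

Lowering the threshold in this example turns a miss into a hit but also admits a false alarm, which is the trade-off a threshold slider lets you explore.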

Installation - Windows

Download kwsviewer_v1.5_win32.zip (6 MB). No installation is needed; you can run kwsviewer.exe directly.

Joint Factor Analysis Matlab Demo

This set of Matlab functions and data by Ondrej Glembek (glembek@fit.vutbr.cz) is a simple tutorial on Joint Factor Analysis (JFA) as it was investigated at the JHU 2008 workshop: http://www.clsp.jhu.edu/workshops/ws08/groups/rsrovc/

The tutorial is based on Patrick Kenny's paper:

Kenny, P., "Joint factor analysis of speaker and session variability: Theory and algorithms", Technical report CRIM-06/08-13, CRIM, Montreal, 2005, http://www.crim.ca/perso/patrick.kenny/

especially on the simplified version of the training in:

Web-based demo for Language Identification

A web-based demonstration of our phonotactic language identification system. Try it out at http://speech.fit.vutbr.cz/lid-demo/
It can detect Arabic, English, Farsi, French, German, Hindi, Japanese, Korean, Mandarin, Spanish, Tamil, Vietnamese, Czech, Polish, and Russian.

Software from the Speech Processing Group

HMM toolkit STK

This distribution includes SERest, a tool for embedded training of HMMs, together with supporting scripts. Key features of SERest include re-estimation of linear transformations (MLLT, LDA, HLDA) within the training process and the use of recognition networks for training. More info here.

Phoneme recognizer based on long temporal context

The phoneme recognizer was developed at Brno University of Technology, Faculty of Information Technology, and has been successfully applied to tasks including language identification [4], indexing and search of audio records, and keyword spotting [5]. The main purpose of this distribution is research. Outputs from this phoneme recognizer can be used as a baseline for subsequent processing, for example phonotactic language modeling.

Lattice Search Engine (LSE)

This package contains several tools. The three main ones handle:
- indexing HTK lattices
- sorting the index
- searching in the sorted index for single words or phrases

Some features of these tools have not been used for a long time and may contain bugs.
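
The index, sort, and search workflow can be sketched in Python as follows. This is only an illustration under stated assumptions and says nothing about LSE's actual index format or command-line tools: the `Entry` fields, the `SortedIndex` class, and the rule that phrase words must be time-adjacent within the same utterance are all hypothetical.

```python
import bisect
from collections import namedtuple

# One index entry per lattice link (hypothetical layout).
Entry = namedtuple("Entry", "word utt start end conf")


class SortedIndex:
    """Word index kept sorted so that lookups can use binary search."""

    def __init__(self, entries):
        # Sorting the named tuples orders them by word first, then utterance and time.
        self.entries = sorted(entries)
        self.keys = [e.word for e in self.entries]

    def search_word(self, word):
        """Return all entries for a single word."""
        lo = bisect.bisect_left(self.keys, word)
        hi = bisect.bisect_right(self.keys, word)
        return self.entries[lo:hi]

    def search_phrase(self, words, tol=1e-6):
        """Return chains of entries in which each word starts where the previous one ends."""
        chains = [[e] for e in self.search_word(words[0])]
        for word in words[1:]:
            candidates = self.search_word(word)
            chains = [
                chain + [c]
                for chain in chains
                for c in candidates
                if c.utt == chain[-1].utt and abs(c.start - chain[-1].end) <= tol
            ]
        return chains


# Made-up entries standing in for links read from HTK lattices.
index = SortedIndex([
    Entry("world", "utt1", 0.4, 0.9, 0.8),
    Entry("hello", "utt1", 0.0, 0.4, 0.9),
    Entry("word", "utt1", 0.4, 0.9, 0.1),
    Entry("hello", "utt2", 1.0, 1.3, 0.5),
    Entry("world", "utt2", 2.0, 2.5, 0.7),
])

for chain in index.search_phrase(["hello", "world"]):
    print(chain[0].utt, chain[0].start, chain[-1].end)
```

Sorting once up front is what makes each word lookup a binary search rather than a scan over the whole index; a production index would typically live on disk rather than in a Python list.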