Neural Network Trainer TNet

Thanks for your interest in TNet! Please consider it a dead project for now, as I have fully switched my efforts to the 'nnet1' recipe in Kaldi. You can still use TNet, but it will not be extended any further. Thanks!


TNet is a tool for parallel training of neural networks for classification. It contains two independent sets of tools: the CPU tools and the GPU tools. The CPU training is based on multithreaded data parallelization, while the GPU training is implemented in CUDA. Both implement mini-batch Stochastic Gradient Descent, optimizing the per-frame cross-entropy.
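To make the training criterion concrete, here is a minimal sketch of mini-batch Stochastic Gradient Descent on per-frame cross-entropy. It is not TNet code: it assumes a single softmax layer, plain Python lists instead of the multithreaded or CUDA implementation, and all function names and default values (`train`, `sgd_minibatch_step`, `lr`, `batch_size`) are made up for illustration.

```python
import math
import random

def softmax(scores):
    """Turn one frame's class scores into posterior probabilities."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(W, b, frame):
    """Single softmax layer: W is [n_classes][n_inputs], b is [n_classes]."""
    return softmax([sum(w * x for w, x in zip(row, frame)) + bias
                    for row, bias in zip(W, b)])

def mean_cross_entropy(W, b, frames, labels):
    """Average per-frame cross-entropy of the correct class."""
    return -sum(math.log(forward(W, b, f)[y])
                for f, y in zip(frames, labels)) / len(frames)

def sgd_minibatch_step(W, b, frames, labels, lr):
    """Update W and b in place using the gradient averaged over one mini-batch."""
    n_classes, n_inputs = len(W), len(W[0])
    grad_W = [[0.0] * n_inputs for _ in range(n_classes)]
    grad_b = [0.0] * n_classes
    for frame, label in zip(frames, labels):
        probs = forward(W, b, frame)
        for k in range(n_classes):
            err = probs[k] - (1.0 if k == label else 0.0)  # d(loss)/d(score_k)
            grad_b[k] += err
            for j in range(n_inputs):
                grad_W[k][j] += err * frame[j]
    scale = lr / len(frames)
    for k in range(n_classes):
        b[k] -= scale * grad_b[k]
        for j in range(n_inputs):
            W[k][j] -= scale * grad_W[k][j]

def train(frames, labels, n_classes, lr=0.5, batch_size=4, epochs=50, seed=0):
    """Shuffle the frames every epoch and apply one SGD step per mini-batch."""
    rng = random.Random(seed)
    W = [[0.0] * len(frames[0]) for _ in range(n_classes)]
    b = [0.0] * n_classes
    order = list(range(len(frames)))
    for _ in range(epochs):
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            sgd_minibatch_step(W, b, [frames[i] for i in idx],
                               [labels[i] for i in idx], lr)
    return W, b
```

In a data-parallel setup, the per-frame gradient loop inside `sgd_minibatch_step` is the part that would be split across threads or GPU kernels; the update itself stays the same.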

The toolkit contains an example of NN training on TIMIT, which can easily be transferred to your own data. You may also be interested in the hierarchical "Universal Context Network", a.k.a. "Stacked bottleneck network", which can be built using the one-touch script: tools/train/

Lattice Spoken Term Detection toolkit (LatticeSTD)

A toolkit for experiments with lattice-based spoken term detection. It allows you to define a set of terms and search for them in lattices.

* Searches for a defined sequence of links in a lattice and outputs the label assigned to this sequence.
* Calculates the confidence (posterior probability) of the found sequence.
* Filters overlapping detections.
* Allows handling of substitutions, deletions and insertions (useful for phone lattices).
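The overlap-filtering step can be illustrated with a short sketch. This is not the LatticeSTD implementation: it assumes a greedy rule (among overlapping detections of the same term, keep the one with the highest score), and the `Detection` record and function names are hypothetical.

```python
from typing import List, NamedTuple

class Detection(NamedTuple):
    term: str      # label of the detected term
    start: float   # start time in seconds
    end: float     # end time in seconds
    score: float   # confidence, e.g. posterior probability of the link sequence

def overlaps(a: Detection, b: Detection) -> bool:
    """Two detections overlap if they share a term and their time spans intersect."""
    return a.term == b.term and a.start < b.end and b.start < a.end

def filter_overlapping(detections: List[Detection]) -> List[Detection]:
    """Greedily keep the best-scoring detection in each group of overlaps."""
    kept: List[Detection] = []
    for det in sorted(detections, key=lambda d: d.score, reverse=True):
        if not any(overlaps(det, k) for k in kept):
            kept.append(det)
    return sorted(kept, key=lambda d: d.start)
```

Detections of different terms are never merged here, even if they cover the same time span; whether that matches a given setup is a design choice.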

KWSViewer - Interactive viewer for Keyword spotting output


This tool can load the output of a keyword-spotting (KWS) system and a reference file in HTK-MLF format and show the detections in a tabular view. You can also use it to replay detections, and to tune and visualize scores, hits, misses and false alarms using sliders in the right-side panel.
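What moving a score slider does to the counts can be sketched as follows. This is only an illustration, not KWSViewer's code: the tuple layouts, the function name `score_detections`, and the matching rule (a detection is a hit if its midpoint falls inside an unmatched reference occurrence of the same keyword) are assumptions.

```python
def score_detections(detections, references, threshold):
    """Count hits, misses and false alarms at a given score threshold.

    detections: list of (keyword, start, end, score)
    references: list of (keyword, start, end)
    """
    matched = set()  # indices of reference occurrences already hit
    hits = 0
    false_alarms = 0
    for keyword, start, end, score in detections:
        if score < threshold:
            continue  # below the threshold: the detection is discarded
        midpoint = (start + end) / 2.0
        match = next(
            (i for i, (ref_kw, ref_start, ref_end) in enumerate(references)
             if i not in matched and ref_kw == keyword
             and ref_start <= midpoint <= ref_end),
            None,
        )
        if match is None:
            false_alarms += 1
        else:
            matched.add(match)
            hits += 1
    return {"hits": hits,
            "misses": len(references) - hits,
            "false_alarms": false_alarms}
```

Raising the threshold trades false alarms for misses, which is the trade-off the sliders let you explore interactively.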

Installation - Windows

Download here (6MB). No installation is needed; you can run kwsviewer.exe directly.

Joint Factor Analysis Matlab Demo

This set of Matlab functions and data by Ondrej Glembek is a simple tutorial on Joint Factor Analysis (JFA), as it was investigated at the JHU 2008 workshop.

The tutorial is based on Patrick Kenny's paper:

Kenny, P.: "Joint factor analysis of speaker and session variability: Theory and algorithms", Technical report CRIM-06/08-13, Montreal, CRIM, 2005,

especially on the simplified version of the training in:

Web-based demo for Language Identification

A web-based demonstration of our phonotactic language identification. Try it out at
Arabic, English, Farsi, French, German, Hindi, Japanese, Korean, Mandarin, Spanish, Tamil, Vietnamese, Czech, Polish and Russian can be detected.

Software from the Speech Processing Group

HMM toolkit STK

This distribution includes SERest, a tool for embedded training of HMMs, with supporting scripts. Key features of SERest include re-estimation of linear transformations (MLLT, LDA, HLDA) within the training process, and the use of recognition networks for training. More info here.

Phoneme recognizer based on long temporal context

The phoneme recognizer was developed at Brno University of Technology, Faculty of Information Technology, and has been successfully applied to tasks including language identification [4], indexing and search of audio records, and keyword spotting [5]. The main purpose of this distribution is research. Outputs from this phoneme recognizer can be used as a baseline for subsequent processing, for example phonotactic language modeling.

Lattice Search Engine (LSE)

This package contains several tools. The three main ones perform:
- indexing HTK lattices
- sorting the index
- searching in the sorted index for single words or phrases

Some features of these tools have not been used for a long time and may contain bugs.
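The index, sort and search workflow listed above can be sketched in a few lines. This is a toy illustration, not the LSE tools or their file formats: lattices are assumed to be already parsed into `(word, start, end, score)` links, and every function name and the adjacency tolerance `tol` are hypothetical.

```python
from bisect import bisect_left, bisect_right

def build_index(lattices):
    """Flatten {lattice_id: [(word, start, end, score), ...]} into index entries."""
    return [(word, lattice_id, start, end, score)
            for lattice_id, links in lattices.items()
            for word, start, end, score in links]

def sort_index(entries):
    """Sort entries by word (then lattice and time) so lookups can use bisection."""
    return sorted(entries)

def search_word(index, word):
    """Return all entries for one word from a sorted index via binary search."""
    words = [entry[0] for entry in index]
    return index[bisect_left(words, word):bisect_right(words, word)]

def search_phrase(index, words, tol=0.01):
    """Find word sequences where each word starts where the previous one ends."""
    paths = [[hit] for hit in search_word(index, words[0])]
    for word in words[1:]:
        candidates = search_word(index, word)
        paths = [path + [hit]
                 for path in paths
                 for hit in candidates
                 if hit[1] == path[-1][1]                 # same lattice
                 and abs(hit[2] - path[-1][3]) <= tol]    # adjacent in time
    return paths
```

Sorting once up front is what makes each single-word lookup logarithmic in the index size; the phrase search then only joins the short hit lists of the individual words.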