The system, developed at the Faculty of Information Technology, Brno University of Technology, is designed mainly for searching for keywords in long speech recordings. Once a list of speech audio files and a list of keywords to be detected are specified, the system automatically selects the parts of the speech where the keywords are pronounced.
Figure 1. The list of files and the list of keywords

Figure 2. The selected speech parts where keywords were detected

Solution description:

  • Keywords are modeled with triphone Hidden Markov Models
  • The Czech SpeechDat-E speech database is used for model training
  • A modified Viterbi algorithm [1] is used for keyword detection
  • An optimal threshold for each keyword is precalculated
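To illustrate how an HMM score can be compared against a per-keyword threshold, here is a minimal sketch of the standard (unmodified) Viterbi recursion for a left-to-right HMM in the log domain. It is not the modified algorithm from [1]. The function names, the transition layout (each state may only repeat or advance to the next state) and the per-frame score normalization in `is_detected` are assumptions made for this example.

```python
import math

NEG_INF = float("-inf")


def viterbi_log_score(log_emissions, log_self, log_next):
    """Best-path log score through a left-to-right HMM (standard Viterbi).

    log_emissions[t][s]: log-likelihood of frame t under state s.
    log_self[s]: log probability of staying in state s.
    log_next[s]: log probability of moving from state s to state s + 1.
    The path is forced to start in state 0 and to end in the last state.
    """
    n_states = len(log_self)
    # Initialisation: only state 0 is reachable at the first frame.
    prev = [NEG_INF] * n_states
    prev[0] = log_emissions[0][0]
    for frame in log_emissions[1:]:
        cur = [NEG_INF] * n_states
        for s in range(n_states):
            stay = prev[s] + log_self[s]
            move = prev[s - 1] + log_next[s - 1] if s > 0 else NEG_INF
            cur[s] = max(stay, move) + frame[s]
        prev = cur
    return prev[-1]


def is_detected(log_score, n_frames, threshold):
    """Assumed decision rule: per-frame log score must reach the threshold."""
    return log_score / n_frames >= threshold


# Toy usage: a 2-state keyword model scored on 3 frames (made-up values).
emissions = [[0.0, -1.0], [-1.0, 0.0], [-1.0, 0.0]]
half = math.log(0.5)
score = viterbi_log_score(emissions, [half, half], [half, half])
print(score, is_detected(score, len(emissions), threshold=-1.0))
```

In a real detector the emission scores would come from the trained triphone models, and the threshold would be the precalculated value for the given keyword.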
Threshold estimation:
 Each threshold depends linearly on the states contained in the keyword. The contribution of each state was obtained by minimizing a global criterion (false alarms and false acceptances) and solving a system of linear equations.
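The linear dependence described above can be sketched as follows. The exact equations are not given here, so this is only one plausible reading: it assumes that the threshold of a keyword is the sum of per-state contributions, weighted by how often each state occurs in the keyword, and that the contributions are fitted by least squares to thresholds already optimized for a set of training keywords. The state names, the example values and all function names are hypothetical.

```python
def count_states(keyword_states, inventory):
    """Return how often each state of the inventory occurs in the keyword."""
    return [float(keyword_states.count(s)) for s in inventory]


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs stay untouched.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: not enough distinct keywords")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_contributions(keywords, thresholds, inventory):
    """Least-squares fit of one contribution per state (normal equations)."""
    rows = [count_states(k, inventory) for k in keywords]
    n = len(inventory)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * t for r, t in zip(rows, thresholds)) for i in range(n)]
    return solve_linear_system(ata, atb)


def predict_threshold(keyword_states, contributions, inventory):
    """Threshold of a keyword as the weighted sum of its state contributions."""
    counts = count_states(keyword_states, inventory)
    return sum(n * c for n, c in zip(counts, contributions))


# Toy usage with made-up states and thresholds.
inventory = ["s1", "s2", "s3"]
training = [["s1", "s2"], ["s2", "s3"], ["s1", "s3", "s3"], ["s1", "s2", "s3"]]
optimal = [-1.0, -1.5, 2.0, -0.5]
contrib = fit_contributions(training, optimal, inventory)
print(predict_threshold(["s1", "s1", "s2"], contrib, inventory))
```

Under this assumption, a threshold for a keyword that was never seen during tuning can be computed directly from the states of its model, which is what makes precalculating the contributions useful.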


A public evaluation version is available at . Compared to the full version, it has the following limitations:
  • Only three audio recordings can be loaded.
  • The maximum length of one recording is one minute.


