BUT recognizer for noisy and reverberant environments among the world's best

Speech recognition is "relatively easy" for close-talk microphones in clean environments. In noise and with distant microphones, performance degrades rapidly. The funding agencies are aware of this, and the U.S. IARPA organized the ASpIRE (Automatic Speech recognition In Reverberant Environments) challenge, in which recognizers for these difficult environments were compared. The BUT team (Martin Karafiát, Lukáš Burget, Igor Szöke, František Grézl), together with colleagues from Raytheon BBN and Johns Hopkins University, came up with a system that scored among the best in the “Single Microphone” category. Because the BUT team contributed the lion's share of the whole system, BUT folks are grabbing 2/3 of the US$30,000 prize. To look under the hood of the system, see The crucial thing is to make the data as messy as possible!

Tetsuji Ogawa, Harish Mallidi and Ruizhi Li in Brno

Deep neural net classifiers, which are currently revolutionizing machine learning, can be extremely powerful. However, sometimes they act like fools: they are wrong without knowing that they are wrong. Prof. Tetsuji Ogawa from Waseda University in Tokyo, and Harish Mallidi and Ruizhi Li from the Johns Hopkins University in Baltimore, joined our group to work this summer with Prof. Hermansky on a new generation of machines that “know when they do not know”. Tetsuji will stay with us until mid-August, while Harish and Ruizhi have internships until the end of 2015.

Alicia Lozano in Brno for 4 months

Alicia Lozano (from the Biometric Recognition Group - ATVS at Universidad Autonoma de Madrid, Spain) came to Brno for 4 months on a Spanish Government PhD travel grant. Alicia works on language recognition using neural nets, and in Brno she will investigate speaker recognition using NN-derived bottleneck features. See Alicia's page and diploma thesis.

Hynek in Brno for a year!

Prof. Hynek Hermansky was, is and will always be a member and guru of the BUT Speech@FIT group. Most of the time, he is a distant guru, being busy with his position as Director of the Johns Hopkins Center for Language and Speech Processing (CLSP). 2015, however, will be marked by his sabbatical in Brno. Welcome back, Hynku! (For those not familiar with Czech, "Hynku" is the vocative of "Hynek". Yes, in Czech, we have 7 grammatical cases and yes, we can decline anyone's name.)

BISON started

It's tough to get European funding, and we are very happy that we got it for BISON (BIg Speech data analytics for cONtact centers). We are even happier that the coordinator is Phonexia, founded by 6 BUT Speech@FIT members in 2006. It is actually the only H2020 ICT project that has a Czech coordinator! For 3 years, we'll be playing with data from contact centers, working in a consortium of 8 partners from 5 EU countries. The kick-off meeting took place at BUT in mid-January.

  • 50-100k people in the Czech Republic work in the contact center business. This is up to about 1% of its population.
  • the consortium plans to cooperate with the Brno Municipal Police
  • the name of the BISON coordinator at Phonexia, Milan Schwarz, might sound familiar ... yes, Milan is the brother of Petr, the developer of BUT's famous phone recognition software (and now Phonexia CEO)

Santosh Kesiraju from IIIT Hyderabad in Brno

Santosh Kesiraju from IIIT Hyderabad joined the BUT group for 6 months to work on diarization and topic detection problems while taking into account different inputs (acoustics, phoneme, sub-word or word recognition). Santosh comes from a well-known Indian speech group headed by Prof. Yegnanarayana.

TACR TRPIT is over

December 2014 marked the end of the 4-year project "Technologie zpracování řeči pro efektivní komunikaci člověk-počítač" (Technologies of speech processing for efficient human-machine communication) sponsored by the Technology Agency of the Czech Republic. BUT led the consortium of 4 partners: BUT, Phonexia, Lingea and Optimsys.

While BUT was responsible for core research, each industrial partner was responsible for one application sector: Phonexia for security/defence, Lingea for interfaces to electronic dictionaries and translation systems, and Optimsys for interactive voice response (IVR) systems. BUT's system for lecture browsing, now available commercially as SuperLectures, was also partly supported by the project.

Dr. Olda and Dr. Misko

On 17.12.2014, two successful PhD defenses took place: Oldrich ("Olda") Plchot defended his thesis "Extensions to Probabilistic Linear Discriminant Analysis for Speaker Recognition" and Michal ("Misko") Fapso defended his work "Query-by-Example Spoken Term Detection". Olda continues to work at BUT as a post-doc (and is actually crucial in many efforts concerning speaker and language ID, such as DARPA RATS), while Misko is working with Escape Motions, a Slovak company developing computer animation software.