AMI project

Augmented Multi-party Interaction

Research leader: Heřmanský Hynek
Team leaders: Burget Lukáš, Černocký Jan, Grézl František, Kadlec Jaroslav, Karafiát Martin, Matějka Pavel, Motlíček Petr, Pečiva Jan, Potúček Igor, Schwarz Petr, Sumec Stanislav, Španěl Michal, Zemčík Pavel
Agency: EU-6FP-IST
Code: 506811-AMI
Start: 2004
End: 2006
Keywords: multi-modal interaction, speech recognition, video processing, multi-modal recognition, meeting data collection, meeting data annotation
Annotation:
Jointly managed by Prof. Hervé Bourlard (IDIAP, http://www.idiap.ch) and Prof. Steve Renals (University of Edinburgh, http://www.iccs.informatics.ed.ac.uk), AMI targets computer-enhanced multi-modal interaction in the context of meetings. The project aims to substantially advance the state of the art in important underpinning technologies, such as human-human communication modeling, speech recognition, computer vision, and multimedia indexing and retrieval. It will also produce tools for off-line and on-line browsing of multi-modal meeting data, including meeting structure analysis and summarization functions. The project also makes recorded and annotated multi-modal meeting data widely available to the European research community, thereby contributing to the research infrastructure in the field.

Products

Publications

2009 Karafiát, M.: Study of linear transformations applied to training of cross-domain adapted large vocabulary continuous speech recognition systems, Brno, CZ, 2009, p. 73
2008 Grézl, F., Fousek, P.: Optimizing bottle-neck features for LVCSR, In: 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing, Las Vegas, Nevada, US, IEEESP, 2008, p. 4729-4732, ISBN 1-4244-1484-9
2007 Kadlec, J.: Code Characterization for Automatic User Interface Generation, In: Innovations and Advanced Techniques in Computer and Information Sciences and Engineering, Dordrecht, NL, Springer, 2007, p. 255-260, ISBN 978-1-4020-6267-4
 Karafiát, M., Burget, L., Černocký, J., Hain, T.: Real-Time ASR from Meetings, In: Proc. INTERSPEECH 2007, Antwerpen, BE, ISCA, 2007, p. 4, ISSN 1990-9772
2006 Al-Hames, M., Hain, T., Černocký, J., Schreiber, S., Poel, M., Müller, R., Marcel, S., van, L., D., Odobez, J., Ba, S., Bourlard, H., Cardinaux, F., Gatica-Perez, D., Janin, A., Motlíček, P., Reiter, S., Renals, S., van, R., J., Rienks, R., Rigoll, G., Smith, K., Thean, A., Zemčík, P.: Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers, In: Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006), Washington D.C., US, 2006, p. 12
 Beran, V., Gatica-Perez, D., Potúček, I., Rigoll, G., Schreiber, S., Smith, K.: Multi-Person Tracking in Meetings: A Comparative Study, In: Machine Learning for Multimodal Interaction, Washington DC, US, 2006, p. 12
 Burget, L., Černocký, J., Fapšo, M., Karafiát, M., Matějka, P., Schwarz, P., Smrž, P., Szőke, I.: Indexing and search methods for spoken documents, In: Proceedings of the Ninth International Conference on Text, Speech and Dialogue, TSD 2006, Berlin, DE, Springer, 2006, p. 351-358, ISSN 0302-9743
 Burget, L., Matějka, P., Černocký, J.: Discriminative Training Techniques for Acoustic Language Identification, In: Proceedings of ICASSP 2006, Toulouse, FR, 2006, p. 209-212
 Černocký, J., Matějka, P., Burget, L., Schwarz, P.: Automatic Language Identification System, In: Sborník příspěvků z odborného semináře "Nové technologie v radiokomunikacích", Brno, CZ, UNOB, 2006, p. 1-6
 Černocký, J., Potúček, I., Sumec, S., Zemčík, P. et al: AMI Mobile Meeting Capture and Analysis System, Washington, US, 2006, p. 1
 Fapšo, M., Schwarz, P., Szőke, I., Smrž, P., Schwarz, M., Černocký, J., Karafiát, M., Burget, L.: Search Engine for Information Retrieval from Speech Records, In: Proceedings of the Third International Seminar on Computer Treatment of Slavic and East European Languages, Bratislava, SK, 2006, p. 100-101
 Fapšo, M., Smrž, P., Schwarz, P., Szőke, I., Schwarz, M., Černocký, J., Karafiát, M., Burget, L.: Information Retrieval from Spoken Documents, In: Proceedings of the Seventh International Conference on Intelligent Text Processing and Computational Linguistics (CICLING 2006), Mexico City, MX, Springer, 2006, p. 410-416, ISBN 3-540-32205-1
 Gatica-Perez, D., Rigoll, G., Schreiber, S., Smith, K., Potúček, I., Beran, V.: 2D Multi-Person Tracking: A Comparative Study in AMI Meetings, In: Lecture Notes in Computer Science, ..., GB, Springer Science+Business Media, 2006, p. 12, ISBN 978-3-540-69567-7
 Hain, T., Burget, L., Dines, J., Garau, G., Karafiát, M., Lincoln, M., Wan, V.: The AMI Meeting Transcription System, In: Proc. NIST Rich Transcription 2006 Spring Meeting Recognition Evaluation Workshop, Washington D.C., US, NIST, 2006, p. 12
 Hradiš, M., Juránek, R.: Head Tracking in Meeting Video, In: Proceedings of the 12th Conference STUDENT EEICT 2006 Volume 2, Brno, CZ, VUT v Brně, 2006, p. 203-205, ISBN 80-214-3161-X
 Hradiš, M., Juránek, R.: Real-time Tracking of Participants in Meeting Video, In: Proceedings of The 10th Central European Seminar on Computer Graphics, Budměřice, SK, 2006, p. 5
 Karafiát, M., Grézl, F., Schwarz, P., Burget, L., Černocký, J.: Robust heteroscedastic linear discriminant analysis and LCRC posterior features in large vocabulary continuous speech recognition, In: Proc. Fifth Slovenian and First International Language Technologies Conference, Ljubljana, SI, 2006, p. 1-4
 Karafiát, M., Grézl, F., Schwarz, P., Burget, L., Černocký, J.: Robust heteroscedastic linear discriminant analysis and LCRC posterior features in meeting data recognition, In: Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006), Berlin, DE, Springer, 2006, p. 275-284, ISBN 3-540-69267-3
 Kontár, S.: Parallel training of neural networks for speech recognition, In: Proc. 12th International Conference on Soft Computing MENDEL'06, Brno, CZ, VUT v Brně, 2006, p. 6, ISBN 80-214-3195-4
 Kopecký, J., Szőke, I., Fapšo, M., Karafiát, M., Burget, L., Oparin, I., Schwarz, P., Matějka, P., Černocký, J., Glembek, O.: BUT System for NIST STD 2006 - Arabic, In: Proc. NIST Spoken Term Detection Evaluation workshop (STD 2006), Washington D.C., US, NIST, 2006, p. 15
 Matějka, P., Burget, L., Schwarz, P., Černocký, J.: Brno University of Technology System for NIST 2005 Language Recognition Evaluation, In: Proceedings of Odyssey 2006: The Speaker and Language Recognition Workshop, San Juan, PR, 2006, p. 57-64, ISBN 1-4244-0472-X
 Matějka, P., Burget, L., Schwarz, P., Černocký, J.: NIST Speaker Recognition Evaluation 2006, In: Proceedings of NIST Speaker Recognition Evaluation 2006, San Juan, PR, NIST, 2006, p. 1-40
 Matějka, P., Burget, L., Schwarz, P., Černocký, J.: NIST 2005 Language Recognition Evaluation, In: Proceedings of NIST LRE 2005, Washington DC, US, NIST, 2006, p. 1-37
 Matějka, P., Schwarz, P., Burget, L., Černocký, J.: Use of anti-models to further improve state-of-the-art PRLM language recognition system, In: Proceedings of ICASSP 2006, Toulouse, FR, 2006, p. 197-200
 Pečiva, J.: Active Transaction Approach for Collaborative Virtual Environments, In: ACM International Conference on Virtual Reality Continuum and its Applications (VRCIA), Chinese University of Hong Kong, HK, ACM, 2006, p. 171-178, ISBN 1-59593-324-7
 Schwarz, P., Matějka, P., Černocký, J.: Hierarchical structures of neural networks for phoneme recognition, In: Proceedings of ICASSP 2006, Toulouse, FR, 2006, p. 325-328
 Stolcke, A., Grézl, F., Hwang, M., Lei, X., Morgan, N., Vergyri, D.: Cross-Domain and Cross-Language Portability of Acoustic Features Estimated by Multilayer Perceptrons, In: 2006 IEEE International Conference on Acoustic, Speech, and Signal Processing, Toulouse, FR, IEEESP, 2006, p. 321-324, ISBN 978-3-540-74627-0
 Szőke, I., Fapšo, M., Karafiát, M., Burget, L., Grézl, F., Schwarz, P., Glembek, O., Matějka, P., Kontár, S., Černocký, J.: BUT System for NIST STD 2006 - English, In: Proc. NIST Spoken Term Detection Evaluation workshop (STD 2006), Washington D.C., US, NIST, 2006, p. 26
 Szőke, I.: Keyword Spotting in Meeting Data, In: Proceedings of the 12th Conference Student EEICT 2006 Volume 4, Brno, CZ, FEKT VUT, 2006, p. 440-444, ISBN 80-214-3163-6
 Zemčík, P., Herout, A., Beran, V., Sumec, S., Potúček, I.: Real-Time Visual Processing Using "Views", In: Poster, MLMI Conference, Washington, DC, US, 2006, p. 1
2005 Ashby, S., Bourban, S., Carletta, J., Flynn, M., Guillemot, M., Hain, T., Kadlec, J., Karaiskos, V., Kraaij, W., Kronenthal, M., Lathoud, G., Lincoln, M., Lisowska, A., McCowan, I., Post, W., Reidsma, D., Wellner, P.: The AMI Meeting Corpus: A Pre-Announcement, In: Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), Edinburgh, GB, 2005, p. 4
 Ashby, S., Bourban, S., Carletta, J., Flynn, M., Guillemot, M., Hain, T., Kadlec, J., Karaiskos, V., Kraaij, W., Kronenthal, M., Lathoud, G., Lincoln, M., Lisowska, A., McCowan, I., Post, W., Reidsma, D., Wellner, P.: The AMI Meeting Corpus, In: Measuring Behavior 2005 Proceedings Book, Wageningen, NL, 2005, p. 4
 Fapšo, M., Schwarz, P., Szőke, I., Černocký, J., Smrž, P., Burget, L., Karafiát, M.: Search Engine for Information Retrieval from Multi-modal Records, Edinburgh, GB, 2005, p. 1
 Fapšo, M., Smrž, P., Schwarz, P., Szőke, I., Burget, L., Karafiát, M., Černocký, J.: Systém pre efektívne vyhľadávanie v rečových databázach, In: Sborník databázové konference DATAKON 2005, Brno, CZ, MUNI, 2005, p. 323-333, ISBN 80-210-3813-6
 Grézl, F.: Spectral plane investigation for probabilistic features for ASR, Edinburgh, GB, 2005, p. 5
 Hain, T., Burget, L., Dines, J., Garau, G., Karafiát, M., Lincoln, M., McCowan, I., Moore, D., Wan, V., Ordelman, R., Renals, S.: The 2005 AMI System for the Transcription of Speech in Meetings, In: Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers, Edinburgh, GB, UEDIN, 2005, p. 450-462, ISBN 978-3-540-32549-9
 Hain, T., Karafiát, M., Dines, J., McCowan, I., Lincoln, M., Garau, G., Wan, V., Ordelman, R., Renals, S.: The Development of the AMI System for the Transcription of Speech in Meetings, In: Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers, Edinburgh, GB, UEDIN, 2005, p. 344-356, ISBN 978-3-540-32549-9
 Hain, T., Karafiát, M., Garau, G., Moore, D., Wan, V., Ordelman, R., Renals, S.: Transcription of Conference Room Meetings: an Investigation, In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology, Lisbon, PT, ISCA, 2005, p. 4, ISSN 1018-4074
 Kadlec, J., Potúček, I., Sumec, S., Zemčík, P.: Evaluation of Tracking and Recognition Methods, In: Proceedings of the 11th conference EEICT, Brno, CZ, 2005, p. 617-622, ISBN 80-214-2890-2
 Karafiát, M., Burget, L., Černocký, J.: Using Smoothed Heteroscedastic Linear Discriminant Analysis in Large Vocabulary Continuous Speech Recognition System, In: 2nd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms, Edinburgh, Scotland, GB, UEDIN, 2005, p. 8
 Matějka, P., Schwarz, P., Černocký, J., Chytil, P.: Phonotactic Language Identification using High Quality Phoneme Recognition, In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology, Lisbon, PT, ISCA, 2005, p. 2237-2240, ISSN 1018-4074
 Matějka, P., Schwarz, P., Černocký, J., Chytil, P.: Phonotactic Language Identification, In: Proceedings of Radioelektronika 2005, Brno, CZ, FEKT VUT, 2005, p. 140-143, ISBN 80-214-2904-6
 Matějka, P.: Phoneme Recognition Tuning for Language Identification System, In: Proceedings of the 11th conference STUDENT EEICT 2005, Brno, CZ, FEKT VUT, 2005, p. 658-653, ISBN 80-214-2890-2
 Motlíček, P., Burget, L., Černocký, J.: Non-parametric Speaker Turn Segmentation of Meeting Data, In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology, Lisbon, PT, ISCA, 2005, p. 657-660, ISSN 1018-4074
 Motlíček, P., Burget, L., Černocký, J.: VISUAL FEATURES FOR MULTIMODAL SPEECH RECOGNITION, In: Radioelektronika 2005, Brno, CZ, FEKT VUT, 2005, p. 187-190, ISBN 80-214-2904-6
 Nijholt, A., Zwiers, J., Pečiva, J.: The Distributed Virtual Meeting Room Exercise, In: Proceedings ICMI 2005 Workshop on Multimodal multiparty meeting processing, Trento, IT, 2005, p. 93-99
 Pečiva, J.: Omnipresent Collaborative Virtual Environments for Open Inventor Applications, In: INTETAIN 2005, Madonna di Campiglio, IT, Springer, 2005, p. 272-276, ISBN 3-540-30509-2
 Potúček, I.: Automatic Image Stabilization for Omni-Directional Systems, In: Proceedings of the Fifth IASTED International Conference on VISUALIZATION, IMAGING, AND IMAGE PROCESSING, Benidorm, ES, ACTA Press, 2005, p. 338-342
 Smrž, P., Fapšo, M.: Vyhledávání v záznamech přednášek, In: Sborník semináře Technologie pro e-vzdělávání, Praha, CZ, ČVUT, 2005, p. 21-26, ISBN 80-01-03274-4
 Smrž, P.: Parallel Metagrammar for Closely Related Languages - A Case Study of Czech and Russian, In: Research on Language & Computation, Vol. 3, No. 2, 2005, DE, p. 101-128, ISSN 1570-7075
 Stolcke, A., Anguera, X., Boakye, K., Cetin, Ö., Grézl, F., Janin, A., Mandal, A., Peskin, B., Wooters, C., Zheng, J.: Further Progress in Meeting Recognition: The ICSI-SRI Spring 2005 Speech-to-Text Evaluation System, In: Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers, Edinburgh, Scotland, GB, UEDIN, 2005, p. 463-475, ISBN 978-3-540-32549-9
 Sumec, S., Kadlec, J.: Event Editor - The Multi-Modal Annotation Tool, In: Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), Edinburgh, GB, 2005, p. 1
 Sumec, S., Potúček, I., Zemčík, P.: AUTOMATIC MOBILE MEETING ROOM, In: Proceedings of 3IA'2005 International Conference in Computer Graphics and Artificial Intelligence, Limoges, FR, 2005, p. 171-177, ISBN 2-914256-07-8
 Szőke, I., Schwarz, P., Burget, L., Fapšo, M., Karafiát, M., Černocký, J., Matějka, P.: Comparison of Keyword Spotting Approaches for Informal Continuous Speech, In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology, Lisbon, PT, 2005, p. 633-636, ISSN 1018-4074
 Szőke, I., Schwarz, P., Burget, L., Karafiát, M., Černocký, J.: Phoneme based acoustics keyword spotting in informal continuous speech, In: Radioelektronika 2005, Brno, CZ, FEKT VUT, 2005, p. 195-198, ISBN 80-214-2904-6
 Szőke, I., Schwarz, P., Burget, L., Karafiát, M., Matějka, P., Černocký, J.: Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech, In: Lecture Notes in Computer Science, Vol. 2005, No. 3658, DE, p. 8, ISSN 0302-9743
 Szőke, I., Schwarz, P., Matějka, P., Burget, L., Fapšo, M., Karafiát, M., Černocký, J.: Comparison of Keyword Spotting Approaches for Informal Continuous Speech, In: 2nd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms, Edinburgh, GB, 2005, p. 12
 Zhu, Q., Chen, B., Grézl, F., Morgan, N.: Improved MLP Structures for Data-Driven Feature Extraction for ASR, In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology, Lisbon, PT, 2005, p. 4, ISSN 1018-4074
2004 Beran, V., Potúček, I.: REAL-TIME RECONSTRUCTION OF INCOMPLETE HUMAN MODEL USING COMPUTER VISION, In: Proceedings of the 10th Conference and Competition STUDENT EEICT 2004, Volume 2, Brno, CZ, 2004, p. 298-302, ISBN 80-214-2635-7
 Beran, V.: Augmented Multi-User Communication System, In: Proceedings of the working conference on Advanced visual interfaces, Gallipoli, IT, ACM, 2004, p. 257-260, ISBN 1-58113-867-9
 Burget, L.: Combination of Speech Features Using Smoothed Heteroscedastic Linear Discriminant Analysis, In: Proc. 8th International Conference on Spoken Language Processing, Jeju island, KR, Sunjin, 2004, p. 2549-2552
 Burget, L.: Complementarity of Speech Recognition Systems and System Combination, Brno, CZ, FIT VUT, 2004, p. 145
 Fousek, P., Svojanovský, P., Grézl, F., Heřmanský, H.: New Nonsense Syllables Database - Analyses and Preliminary ASR Experiments, In: Proc. 8th International Conference on Spoken Language Processing, Jeju Island, KR, Sunjin, 2004, p. 348-351, ISSN 1225-4111
 Fučík, O., Zemčík, P., Tupec, P., Bryan, L., Herout, A.: The Networked Photo-Enforcement and Traffic Monitoring System, In: Proceedings of Engineering of Computer-Based Systems, Los Alamitos, US, IEEE CS, 2004, p. 423-428, ISBN 0-7695-2125-8
 Herout, A., Zemčík, P., Beran, V., Kadlec, J.: Image and Video Processing Software Framework for Fast Application Development, In: Joint AMI/PASCAL/IM2/M4 workshop, Martigny, CH, IDIAP, 2004, p. 1
 Herout, A., Zemčík, P.: Animated Particle Rendering in DSP and FPGA, In: SCCG 2004 Proceedings, Bratislava, SK, STUBA, 2004, p. 237-242, ISBN 80-223-1918-X
 Karafiát, M., Grézl, F., Burget, L.: Combination of MFCC and TRAP features for LVCSR of meeting data, Martigny, CH, 2004, p. 1
 Karafiát, M., Grézl, F., Černocký, J.: TRAP based features for LVCSR of meeting data, In: Proc. 8th International Conference on Spoken Language Processing, Jeju Island, KR, Sunjin, 2004, p. 437-440, ISSN 1225-4111
 Motlíček, P., Burget, L., Černocký, J.: PHONEME RECOGNITION OF MEETINGS USING AUDIO-VISUAL DATA, AMI Workshop, Martigny, CH, 2004, p. 6
 Motlíček, P., Černocký, J.: Multimodal Phoneme Recognition of Meeting Data, In: 7th International Conference, TSD 2004 Brno, Czech Republic, September 2004 Proceedings, Brno, CZ, Springer, 2004, p. 379-384, ISBN 3-540-23049-1, ISSN 0302-9743
 Motlíček, P., Černocký, J.: Multimodal Phoneme Recognition of Meeting Data, In: Lecture Notes in Computer Science, Vol. 2004, No. 3206, DE, p. 6, ISSN 0302-9743
 Motlíček, P.: Segmentace nahrávek živých jednání podle mluvčího, In: Sborník příspěvků a prezentací akce Odborné semináře 2004, Brno, CZ, UREL FEKT VUT, 2004, p. 28
 Motlíček, P.: Visual Feature Extraction for Phoneme Recognition of Meetings, Brno, CZ, UPGM FIT VUT, 2004, p. 14
 Pečiva, J.: Collaborative Virtual Environments, In: Poster at MLMI'04 workshop, Martigny, CH, IDIAP, 2004, p. 1-1
 Schwarz, P., Matějka, P., Černocký, J.: Phoneme Recognition from a Long Temporal Context, In: poster at JOINT AMI/PASCAL/IM2/M4 Workshop on Multimodal Interaction and Related Machine Learning Algorithms, Martigny, CH, IDIAP, 2004, p. 1-1
Official website: