M4

Multi Modal Meeting Manager

Research leader:Heřmanský Hynek
Team leaders:Černocký Jan, Zemčík Pavel
Agency:EU-HLT
Code:IST-2001-34485
Start:2002
End:2005
Keywords:speech processing, video processing, information merging, meeting summarization
Annotation:
The M4 project started in March 2002 and runs for three years. Its overall objective is to build a demonstration system for structuring, browsing, and querying an archive of automatically analysed meetings. The archived meetings will have taken place in a room equipped with multimodal sensors. For each meeting, audio, video, textual, and (possibly) interaction information will be available. Audio information will come from close-talking and distant microphones, as well as binaural recordings. Video information will come from multiple cameras. While the video and audio information will form several streams of data generated during the meeting, the textual information (the agenda, discussion papers, and text of slides) will be pre-generated and can be used to guide the automatic structuring of the meeting. The interaction stream consists of any information that can help in analysing events within the meeting, for example mouse tracking from a PC-based presentation or laser-pointer information.

Products

Publications

2009 Karafiát, M.: Study of linear transformations applied to training of cross-domain adapted large vocabulary continuous speech recognition systems, Brno, CZ, 2009, p. 73
2005 Motlíček, P., Burget, L., Černocký, J.: Visual Features for Multimodal Speech Recognition, In: Radioelektronika 2005, Brno, CZ, FEKT VUT, 2005, p. 187-190, ISBN 80-214-2904-6
 Szőke, I., Schwarz, P., Burget, L., Karafiát, M., Černocký, J.: Phoneme based acoustics keyword spotting in informal continuous speech, In: Radioelektronika 2005, Brno, CZ, FEKT VUT, 2005, p. 195-198, ISBN 80-214-2904-6
2004 Burget, L.: Combination of Speech Features Using Smoothed Heteroscedastic Linear Discriminant Analysis, In: Proc. 8th International Conference on Spoken Language Processing, Jeju Island, KR, Sunjin, 2004, p. 2549-2552
 Burget, L.: Complementarity of Speech Recognition Systems and System Combination, Brno, CZ, FIT VUT, 2004, p. 145
 Burget, L.: Measurement of Complementarity of Recognition Systems, In: Proc. Seventh International Conference on Text, Speech and Dialogue, Brno, CZ, Springer, 2004, p. 283-290, ISBN 3-540-23049-1
 Fousek, P., Svojanovský, P., Grézl, F., Heřmanský, H.: New Nonsense Syllables Database - Analyses and Preliminary ASR Experiments, In: Proc. 8th International Conference on Spoken Language Processing, Jeju Island, KR, Sunjin, 2004, p. 348-351, ISSN 1225-4111
 Grézl, F.: Combinations of TRAP-based systems, In: Proc. Seventh International Conference on Text, Speech and Dialogue, Brno, CZ, FI MUNI, 2004, p. 323-330, ISBN 3-540-23049-1
 Jenderka, P., Potúček, I., Sumec, S.: Meeting recordings at Brno University of Technology, In: AMI/PASCAL/IM2/M4 workshop, Martigny, CH, 2004, p. 3
 Kadlec, J.: Lip detection in low resolution images, In: Proceedings of the 10th Conference and Competition STUDENT EEICT 2004, Volume 2, Brno, CZ, 2004, p. 303-306, ISBN 80-214-2635-7
 Karafiát, M., Grézl, F., Burget, L.: Combination of MFCC and TRAP features for LVCSR of meeting data, Martigny, CH, 2004, p. 1
 Motlíček, P., Černocký, J.: Multimodal Phoneme Recognition of Meeting Data, In: 7th International Conference, TSD 2004 Brno, Czech Republic, September 2004 Proceedings, Brno, CZ, Springer, 2004, p. 379-384, ISBN 3-540-23049-1, ISSN 0302-9743
 Motlíček, P., Černocký, J.: Multimodal Phoneme Recognition of Meeting Data, In: Lecture Notes in Computer Science, Vol. 2004, No. 3206, DE, p. 6, ISSN 0302-9743
 Motlíček, P.: Visual Feature Extraction for Phoneme Recognition of Meetings, Brno, CZ, UPGM FIT VUT, 2004, p. 14
 Potúček, I., Rigoll, G., Wallhoff, F., Zobl, M.: Dynamic Tracking in Meeting Room Scenarios Using Omnidirectional View, In: 17th International Conference on Pattern Recognition (ICPR 2004), Cambridge, GB, IEEE CS, 2004, p. 933-936, ISBN 0-7695-2128-2
 Potúček, I., Sumec, S., Španěl, M.: Participant activity detection by hands and face movement tracking in the meeting room, In: 2004 Computer Graphics International (CGI 2004), Los Alamitos, US, IEEE CS, 2004, p. 632-635, ISBN 0-7695-2717-1
 Potúček, I., Španěl, M.: Face Detection in Meeting Room Using Omni-directional View, In: AMI/PASCAL/IM2/M4 workshop, Martigny, CH, IDIAP, 2004, p. 1-1
 Schwarz, P., Matějka, P., Černocký, J.: Phoneme Recognition from a Long Temporal Context, In: poster at JOINT AMI/PASCAL/IM2/M4 Workshop on Multimodal Interaction and Related Machine Learning Algorithms, Martigny, CH, IDIAP, 2004, p. 1-1
 Schwarz, P., Matějka, P., Černocký, J.: Towards Lower Error Rates in Phoneme Recognition, In: Proceedings of 7th International Conference Text, Speech and Dialogue 2004, Brno, CZ, Springer, 2004, p. 8, ISBN 3-540-23049-1
 Schwarz, P., Matějka, P., Černocký, J.: Towards Lower Error Rates in Phoneme Recognition, In: Lecture Notes in Computer Science, Vol. 2004, No. 3206, DE, p. 8, ISSN 0302-9743
 Sumec, S.: Multi Camera Automatic Video Editing, In: Proceedings of ICCVG 2004, Warsaw, PL, Kluwer, 2004, p. 935-945, ISBN 1-4020-1503-8
 Sumec, S.: Multi View Person Localization, In: Proceedings of the 10th Conference and Competition STUDENT EEICT 2004, Brno, CZ, VUT v Brně, 2004, p. 5, ISBN 80-214-2635-7
 Sumec, S.: Simulation of Parallel Ray Tracing, In: Proceedings of 38th International Conference MOSIS'04, Ostrava, CZ, MARQ, 2004, p. 6, ISBN 80-85988-98
 Szőke, I.: Speech units automatically generated by ergodic hidden Markov model, In: Proceedings of 10th Conference and Competition STUDENT EEICT 2004, Brno, CZ, FEKT VUT, 2004, p. 5
 Zemčík, P., Herout, A., Bryan, L., Tupec, P., Fučík, O.: Particle rendering pipeline in DSP and FPGA, In: Proceedings of Engineering of Computer-Based Systems, Los Alamitos, US, IEEE CS, 2004, p. 361-368, ISBN 0-7695-2125-8
 Zemčík, P., Sumec, S., Potúček, I., Španěl, M., Herout, A., Pečiva, J.: Summary of Image/Video Processing for AMI Project in Brno, In: Poster at MLMI'04 workshop, Martigny, CH, IDIAP, 2004, p. 1-1
2003 Burget, L., Černocký, J.: Recognition of Speech with Non-random Attributes, In: 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings, České Budějovice, CZ, Springer, 2003, p. 6, ISBN 3-540-20024-X, ISSN 0302-9743
 Černocký, J.: Temporal processing for feature extraction in speech recognition, Vědecké spisy VUT, Brno, CZ, VUTIUM, 2003, p. 1-30, ISBN 80-214-2395-1
 Grézl, F.: Effect of normalization on TRAP based systems in ASR, In: Proc. 13th International scientific conference Radioelektronika 2003, Brno, CZ, UREL FEKT VUT, 2003, p. 128-131, ISBN 80-214-2383-8
 Grézl, F.: Local time-frequency operators in TRAPs for speech recognition, In: 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings, České Budějovice, CZ, ZČU v Plzni, 2003, p. 269-274, ISBN 3-540-20024-X, ISSN 0302-9743
 Heřmanský, H., Matějka, P., Schwarz, P.: On Use of Temporal Dynamics of Speech for Language Identification, In: Proceedings of Language Recognition Workshop 2003, NIST Gaithersburg, MD USA, US, 2003, p. 56-62
 Jenderka, P., Vícha, T.: Voice Activity Detection in Multimodal Meeting Manager, In: Proceedings of 9th Conference and Competition STUDENT EEICT 2003 Volume 3, Brno, CZ, FEKT VUT, 2003, p. 588-592, ISBN 80-214-2379-X
 Karafiát, M., Grézl, F.: Using MATLAB for Analysis of TRAP system, In: Radioengineering, Vol. 2003, No. 4, CZ, p. 38-41, ISSN 1210-2512
 Matějka, P., Schwarz, P., Grézl, F., Černocký, J.: Phoneme Classification using Temporal Patterns, In: Proc. 13th International scientific conference Radioelektronika 2003, Brno, CZ, FEKT VUT, 2003, p. 1-4, ISBN 80-214-2383-8
 Matějka, P., Schwarz, P., Heřmanský, H., Černocký, J.: Phoneme Recognition using Temporal Patterns, In: Proc. 6th International Conference Text, Speech and Dialogue, TSD2003, České Budějovice, CZ, Springer, 2003, p. 465-472, ISBN 3-540-20024-X
 Motlíček, P., Černocký, J.: All-Pole Modeling for Definition of Speech Features in Aurora3 DSR Task, In: 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings, České Budějovice, CZ, ZČU v Plzni, 2003, p. 295-300, ISBN 3-540-20024-X, ISSN 0302-9743
 Motlíček, P., Černocký, J.: Autoregressive Modeling based Feature Extraction for Aurora3 DSR Task, In: Proc. EUROSPEECH 2003, Geneva, CH, IDIAP, 2003, p. 1801-1804, ISSN 1018-4074
 Motlíček, P., Černocký, J.: Time-domain based Temporal Processing with Application of, In: Proc. EUROSPEECH 2003, Geneva, CH, IDIAP, 2003, p. 821-824, ISSN 1018-4074
 Motlíček, P.: Derivation of TRAPs in Auditory Domain, In: Proceedings of 9th Conference and Competition STUDENT EEICT 2003, Brno, CZ, Děkanát FEKT VUT, 2003, p. 598-602, ISBN 80-214-2379-X
 Motlíček, P.: Derivation of TRAPs in Auditory Domain, In: Proceedings of the International Conference and Competition, Brno, CZ, FEKT VUT, 2003, p. 315-319, ISBN 80-214-2401-X
 Motlíček, P.: Modeling of Spectra and Temporal Trajectories in Speech Processing, PhD thesis, Brno, CZ, FIT VUT, 2003, p. 1-138
 Motlíček, P.: Modeling of Spectra and Temporal Trajectories in Speech Processing, In: Sborník příspěvků a prezentací akce Odborné semináře 2003, Brno, CZ, UREL FEKT VUT, 2003, p. 28
 Potúček, I.: Person Tracking Using Omnidirectional View, In: Proceedings of the 9th conference STUDENT EEICT 2003, Brno, CZ, VUT v Brně, 2003, p. 603-607, ISBN 80-214-2379, ISSN 0572-3043
 Potúček, I.: Tracking movement objects in sequence pictures, In: ElectronicsLetters.com, Vol. 2003, No. 2, Brno, CZ, p. 1-15, ISSN 1213-161X
 Schwarz, P., Matějka, P., Černocký, J.: Recognition of Phoneme Strings using TRAP Technique, In: Proceedings of 8th International Conference Eurospeech, Geneva, CH, ISCA, 2003, p. 1-4, ISSN 1018-4074
 Schwarz, P.: Would You Like To Make Your Programs Understand Human Voice?, In: Proceedings of 9th Conference STUDENT EEICT 2003, Brno, CZ, FEKT VUT, 2003, p. 231-235, ISBN 80-214-2379-X
2002 Černocký, J.: Temporal processing for feature extraction in speech recognition, habilitation thesis, Brno, CZ, 2002, p. 80
Official website: