Next: Shape Modeling Laboratory Up: Department of Computer Previous: Multimedia Systems Laboratory

Human Interface Laboratory


Masahide Sugiyama / Professor
Susantha Herath / Associate Professor
Michael Cohen / Assistant Professor
Minoru Ueda / Assistant Professor

Using our communication channels (sense organs: ears, mouth, eyes, nose, skin, etc.), we communicate with each other: human to human, human to machine, and human to any information source. When these channels are impaired, whether in software or in hardware, communication can become difficult. The research area of the Human Interface Laboratory covers the enhancement and generation of various human interface channels.

In order to advance this research on human interface, we adopt the following research principles:

  1. Theoretical: Because our target is the human interface, our studies risk becoming trial-and-error, heuristic, and overly practical. Based on our experimental results and experience, we try to establish theory, unified insight, generalization, and analytical viewpoints.
  2. Practical: Our target is not theory for theory's sake. We extract concepts and theory in order to clarify them from experimental and quantitative viewpoints.

We continued the following two main research topics in 1996:

  1. Study on Communication with Handicapped
  2. Study on Analysis and Generation of Acoustic Scenes

We organized the second workshop, IWHIT97 (International Workshop on Human Interface Technology 1997), on May 12th, 14th, and 15th, sponsored by the International Affairs Committee of the University of Aizu. The workshop had 5 sessions (1. Virtual Acoustics, 2. Speech and Multimedia, 3. Speech and Welfare, 4. Natural Language I & II, 5. Visual Languages), comprising 5 keynote lectures and 18 lectures, plus a poster session by graduation research students.

We promoted 5 SCCPs for students (``Social Hyper Networking", ``Visual Language for Office Processing Software", ``Speech Dialogue System", ``Computer Music", ``Non-Verbal Communication"), 4 Joint Projects (``Study of Machine Processing of Signs Generated by Hand Movements", ``Study on Speech Recognition under Noisy Environment", ``Audio Windows: Spatialization of Synthesized Speech", ``Spatialization of Music and Hierarchical Organization of Spatial Sound Sources"), and 1 Courseware project (``Speech Processing and Speech Recognition"). One of us received commissioned research funds from NTT Human Interface Labs. on ``Audio Window" and from NTT DATA on ``Study on Speech Processing Technology and Human Interface".

We exhibited our research activities at the open campus during the University Festival (Oct. 18th and 19th), and held a Lab Open House for freshmen on April 9th, 10th, and 11th.

In our research activity, we presented 5 refereed papers at international conferences and in academic journals.

One of our members organized a working group on ``Blind and Computer", which about 30 people attended (Apr. 14th, July 14th, Oct. 13th, Nov. 10th, and Feb. 2nd, 1997). The topics were ``TeN-yaku Hiroba and Information Network", ``The Visually Handicapped and Computer Network Access", ``Computer Training for Beginners", ``Joining ``Amedia Fair96''", and ``Hearing/Visual Aids in New NTT Technology".

We maintain the homepage of the Human Interface Lab to open our research and education activities to the world: http://www.u-aizu.ac.jp/labs/sw-hi/.


Refereed Journal Papers

  1. J. Murakami, M. Sugiyama, H. Watanabe, Unknown Multiple Signal Source Clustering Problem using Ergodic HMM and Applied to Speaker Classification. Proc. of ICSLP96, Oct. 1996.

    In this paper, we consider signals originated from a sequence of sources. More specifically, the problems of segmenting such signals and relating the segments to their sources are addressed. This issue has wide applications in many fields. This report describes a resolution method that is based on an Ergodic Hidden Markov Model (HMM), in which each HMM state corresponds to a signal source. The signal source sequence can be determined by using a decoding procedure (Viterbi algorithm or Forward algorithm) over the observed sequence. Baum-Welch training is used to estimate HMM parameters from the training material. As an example of the multiple signal source classification problem, an experiment is performed on unknown speaker classification. The results show a classification rate of 79% for 4 male speakers. The results also indicate that the model is sensitive to the initial values of the Ergodic HMM and that employing the long-distance LPC cepstrum is effective for signal preprocessing.
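    The decoding step summarized above can be sketched in a few lines. The following is a minimal, hypothetical illustration of Viterbi decoding over a fully connected (ergodic) HMM whose states stand for signal sources; it is not the authors' implementation, and the states, transition probabilities, and emission probabilities are toy values chosen for the example:

    ```python
    import math

    def viterbi(obs, states, log_init, log_trans, log_emit):
        """Most likely state (source) sequence for an observation sequence.

        log_init[s]: log prior of starting in state s
        log_trans[s][t]: log transition prob s -> t (ergodic: all pairs allowed)
        log_emit[s][o]: log prob of emitting observation o in state s
        """
        # delta[s] = best log score of any state path ending in state s
        delta = {s: log_init[s] + log_emit[s][obs[0]] for s in states}
        back = []  # backpointers, one dict per time step after the first
        for o in obs[1:]:
            ptr, new_delta = {}, {}
            for t in states:
                best_s = max(states, key=lambda s: delta[s] + log_trans[s][t])
                ptr[t] = best_s
                new_delta[t] = delta[best_s] + log_trans[best_s][t] + log_emit[t][o]
            back.append(ptr)
            delta = new_delta
        # backtrack from the best final state
        last = max(states, key=lambda s: delta[s])
        path = [last]
        for ptr in reversed(back):
            path.append(ptr[path[-1]])
        return list(reversed(path))

    # Toy example: two "speakers" emitting 'a'/'b' with different preferences.
    lp = math.log
    states = ["spk1", "spk2"]
    log_init = {"spk1": lp(0.5), "spk2": lp(0.5)}
    log_trans = {"spk1": {"spk1": lp(0.9), "spk2": lp(0.1)},
                 "spk2": {"spk1": lp(0.1), "spk2": lp(0.9)}}
    log_emit = {"spk1": {"a": lp(0.8), "b": lp(0.2)},
                "spk2": {"a": lp(0.2), "b": lp(0.8)}}
    print(viterbi(list("aaabbb"), states, log_init, log_trans, log_emit))
    # -> ['spk1', 'spk1', 'spk1', 'spk2', 'spk2', 'spk2']
    ```

    With sticky self-transitions, the decoded state sequence segments the observations by their most likely source, which is the segmentation-and-labeling step the abstract describes.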

  2. Herath, A., Hyodo, Y., Kunieda, Y., Ikeda, T., Herath, S., Bunsetsu-Based Japanese Sinhalese Translation System. Information Sciences, No. 90, pp. 303--319, 1996.

    This paper presents the design and implementation techniques employed in a Japanese-to-Sinhalese machine translation (MT) system. The main result of this work is the successful application of Bunsetsu in generating meaningful translations for a flexible-grammar language. The system has been developed considering the similarities between Japanese Bunsetsu and Sinhalese units. These efforts focus on determining the minimum reasonable grammatical knowledge necessary for machine translation. The principal characteristics of the system, the translation process, problems encountered during the development stages, the present status, and future plans are discussed.

  3. Woodrow Barfield, Michael Cohen and Craig Rosenberg, Localization as a Function of Azimuth and Elevation. Int. J. of Aviation Psychology, Vol. 7, No. 2, pp. 123--138, 1997.

    This study was performed to investigate the accuracy of performing a localization task as a function of the use of three display formats: an auditory display, a perspective display, and a perspective-auditory display. The experimental task for the perspective and perspective-auditory displays was to judge the relative azimuth and elevation which separated a computer-generated target object from a reference object. The experimental task for the auditory display was to determine the azimuth and elevation of a sound source with respect to the listener. For azimuth estimates, there was a significant effect for type of display, with worse performance resulting from the purely auditory format. Further, azimuth judgements were better for target objects that were aligned close to the major meridian orthogonal to the viewing vector. For elevation errors, there was a main effect for the type of display, with worst performance for the purely auditory condition; elevation judgements were worse for larger elevation separations independent of display condition. Finally, elevation performance was superior when target images were aligned close to the major meridian orthogonal to the viewing vector. Implications of the results for the design of spatial instruments are discussed.

Refereed Proceeding Papers

  1. Herath, A., Ikeda, T., Herath, S., Kaikhah, K., Generating Number in Japanese to Modern Sinhalese Machine Translation. Perspectives of System Informatics, pp. 259--262, PSI, A. P. Ershov Institute of Information Systems, June 1996.

    As the Japanese to modern Sinhalese language pair is virtually unexplored from a machine translation perspective, these efforts focus on determining the reasonable minimum of grammatical knowledge of Japanese necessary for obtaining intelligible modern Sinhalese output. This paper discusses the problem of countability in machine translation (MT) from Japanese to modern Sinhalese, and a method that extracts information relevant to countability from the Japanese text and combines it with knowledge about countability in Sinhalese.

  2. Jens Herder and Michael Cohen. Project Report: Design of a Helical Keyboard. Int. Conf. on Auditory Display, pp. 139--142, Nov. 1996.

    Inspired by the cyclical nature of octaves and the helical structure of a scale, we prepared a model of a piano-style keyboard (prototyped in Mathematica), which was then geometrically warped into a left-handed helical configuration, one octave/revolution, with pitch mapped to height. The natural orientation of upper-frequency keys higher on the helix suggests a parsimonious left-handed chirality, so that ascending notes cross in front of a typical listener left$\rightarrow$right. Our model is being imported (via the dxf file format) into (Open Inventor/) vrml, where it can be driven by midi events, realtime or sequenced; this stream is both synthesized (by a Roland Sound Module) and spatialized by a heterogeneous spatial sound backend (including the CRE Acoustetron II and the Pioneer Sound Field Control speaker-array System), so that the sound of the respective notes is directionalized with respect to sinks, avatars of the human user, by default in the tube of the helix.
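    The geometric warp described above can be illustrated with a small sketch. This is our hypothetical reconstruction, not code from the paper: the function name, radius, and octave height are assumed constants, and the sign of the angle stands in for the left-handed chirality. It maps a MIDI note number onto a helix with one octave per revolution and pitch mapped to height:

    ```python
    import math

    def helix_position(midi_note, radius=1.0, octave_height=0.3):
        """Place a key on a left-handed helix: one octave per revolution,
        with pitch mapped to height (higher notes sit higher on the helix).
        radius and octave_height are illustrative constants, not from the paper."""
        octaves = (midi_note - 60) / 12.0  # revolutions from middle C (MIDI 60)
        theta = -2.0 * math.pi * octaves   # negative sign: left-handed chirality
        x = radius * math.cos(theta)
        y = radius * math.sin(theta)
        z = octave_height * octaves        # height rises linearly with pitch
        return (x, y, z)

    # Notes an octave apart share the same angle on the helix
    # but differ in height by one full turn.
    print(helix_position(60))  # middle C, at angle 0 and height 0
    print(helix_position(72))  # C one octave up: same angle, higher z
    ```

    Keys an octave apart thus line up vertically, which is the visual analogue of octave equivalence that motivated the helical warp.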

  3. Katsumi Amano, Fumio Matsushita, Hirofumi Yanagawa, Michael Cohen, Jens Herder, Yoshiharu Koba and Mikio Tohyama. Psfc: the Pioneer Sound Field Control System at the University of Aizu Multimedia Center. Ro-Man: Proc. 5th Int. Workshop on Robot and Human Communication, IEEE, pp. 495--499, Nov. 1996.

    The psfc, or Pioneer Sound Field Control System, is a dsp-driven hemispherical 14-loudspeaker array installed at the University of Aizu Multimedia Center. Collocated with a large-screen rear-projection stereographic display, the psfc features realtime control of virtual room characteristics and the direction of two separate sound channels, smoothly steering them around a configurable soundscape. The psfc controls an entire sound field, including sound direction, virtual distance, and simulated environment (reverb level, room size, and liveness) for each source. It can also configure a dry (dsp-less) switching matrix for direct directionalization. The psfc speaker dome is about 14 m in diameter, allowing about twenty users at once to comfortably stand or sit near its sweet spot.

Unrefereed Papers

  1. M. Sugiyama, On Decomposition of Superposed Voices using Speech Models. Proc. of ASJ Spring Meeting, pp. 67--68, ASJ, March 1996.

Technical Reports

  1. Michael Cohen, Multimedia is for Everyone: Virtual Reality and Telecommunication Conferences, Concerts, and Cocktail Parties, 27 pages, December 20, 1996.

Grants

  1. Susantha Herath, Fukushima Prefectural Foundation for the Advancement of Science and Education, Environment computer activity project, 1,000,000 Yen, May 1996.

  2. Michael Cohen, NTT Human Interface Labs, donation: 600,000 Yen. May 1996.

  3. Michael Cohen, Hewlett-Packard Labs/Japan, donation: 600,000 Yen, May 1996.

  4. Michael Cohen, Fukushima Prefectural Foundation for the Advancement of Science and Education, project, 3,380,000 Yen, May 1996.

Academic Activities

  1. Masahide Sugiyama, Speech Processing Committee in IEICE and ASJ. Member. May 1996.

  2. Masahide Sugiyama, Tohoku Regional Board of IEICE, ASJ and IPSJ. Member. May 1996.

  3. Masahide Sugiyama, Committee of Sign Linguistic Technology Research. Member. May 1996.

  4. Masahide Sugiyama, Referee for IEICE and ASJ (Acoustical Society of Japan). May 1996.

  5. Masahide Sugiyama, Chairman of sessions in ASJ and IEICE conferences. May 1996.

  6. Susantha Herath, IEEE coordinator (1993.4 -). May 1996.

  7. Susantha Herath, Member of the Review Board for the International Journal of Applied Intelligence (1992.5 - ), May 1996.

  8. Susantha Herath, Financial Chair, Third International Workshop on Human Interface Technology '97 (IWHIT'97), March 12--14, 1997.

  9. Susantha Herath, Co-editor of the Proceedings of the Third International Workshop on Human Interface Technology '97 (IWHIT'97), March 12--14, 1997.

  10. Susantha Herath, Session Chair, Natural Language Processing, Third International Workshop on Human Interface Technology '97 (IWHIT'97), March 12--14, 1997.

  11. Susantha Herath, General Chair, ECA Symposium on EMAS, University of Aizu, Nov. 2nd, 1996.

Others

  1. Masahide Sugiyama. Received a donation from NTT Human Interface Laboratories, May 1996.

  2. Masahide Sugiyama. Received a donation from NTT, May 1996.

  3. Kenji Suganami, Bachelor Title: Sign Language Dictionary Development. The University of Aizu, 1996. Thesis Advisor: S. Herath.

  4. Tadashi Watanabe, Bachelor Title: CAI in Sign Language Learning. The University of Aizu, 1996. Thesis Advisor: S. Herath.

  5. Naito Kenichi, Bachelor Title: Abstract Generation for Newspaper Articles. The University of Aizu, 1996. Thesis Advisor: S. Herath.

  6. Bachelor Title: Virtual Reality Audio SCCP. The U. of Aizu, 1996. Thesis Advisor: Michael Cohen (with Jens Herder).

  7. Bachelor Title: Computer Music SCCP. The U. of Aizu, 1996. Thesis Advisor: Michael Cohen (with James Goodwin).





www@u-aizu.ac.jp
October 1997