The research field covered by this thesis relates to the recognition and modelling of human behavior in Human-Computer Interaction environments, using face analysis as the input modality. The research focuses on cases where no specific knowledge of the setup, and no specialized equipment, is available beyond simple hardware such as a common web camera. Systems of this kind are normally built on assumptions about user position, camera parameters, or specific hardware. Beyond the effort to avoid such assumptions, one of the basic principles of this thesis was research on a series of components that are not statically positioned in the architecture but dynamically emphasized throughout each process. The architecture of each component was developed independently, targeting non-intrusive environments that encourage spontaneous movement, as well as unconstrained lighting conditions and backgrounds. A major challenge in this direction, on a second level, was the modelling of the extracted facial data in order to train systems that attempt to imitate human perception of engagement in human-computer interaction scenarios. Experimental results on demanding datasets, annotated and developed for the purposes of this research, highlight the prospect of employing non-intrusive mechanisms for inferring engagement from non-verbal communication, using face analysis.
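To make the idea of inferring engagement from extracted facial data more concrete, the sketch below shows one deliberately simple proxy: scoring a session by the fraction of frames in which the head roughly faces the screen. The feature names (`yaw`, `pitch`) and the thresholds are illustrative assumptions for this example only, not the feature set or model developed in the thesis.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class FrameFeatures:
    # Hypothetical per-frame facial features extracted by a face-analysis
    # front end; the names and units are assumptions for illustration.
    yaw: float    # head rotation left/right, in degrees
    pitch: float  # head rotation up/down, in degrees

def engagement_score(frames, yaw_limit=30.0, pitch_limit=20.0):
    """Fraction of frames in which the head roughly faces the screen.

    A frame counts as 'attentive' when both head-pose angles stay
    within the given limits; the limits themselves are placeholders.
    """
    attentive = [abs(f.yaw) <= yaw_limit and abs(f.pitch) <= pitch_limit
                 for f in frames]
    return mean(attentive) if attentive else 0.0

# A user glancing away mid-session: 3 of 5 frames face the screen.
session = [FrameFeatures(5, 2), FrameFeatures(-10, 4),
           FrameFeatures(55, 3), FrameFeatures(60, -1),
           FrameFeatures(0, 0)]
print(engagement_score(session))  # → 0.6
```

In a real system such a hand-set rule would be replaced by a model trained on annotated data, as the thesis describes; the sketch only illustrates the kind of non-intrusive, per-frame inference involved.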