The Future of Us: Insights from Virtually Human

The Face Puzzle: Decoding Human Perception of Digital Agents

Written by Virtually Human | December 11, 2024

Research by Julija Vaitonytė, Tilburg University

Introduction to Virtual Agents

Advances in technology, such as computer graphics and machine learning, have created ample opportunities for developing virtual agents that closely resemble humans in appearance and show increasingly human-like communicative behaviors. These advances raise a wealth of questions, including which factors contribute to the social nature of artificial entities and whether artificial entities are processed in the same way as humans.

Research Focus

This dissertation addresses these questions by focusing on people's perceptions of the appearance and behavior of virtual agents. We also aimed to couple observers' subjective perceptions to measurable characteristics of virtual agents. A more detailed understanding of how humans perceive virtual agents makes it possible to design more intuitive agents that can establish trust and assist humans in healthcare and education.

Distinguishing Virtual and Human Faces

We investigated whether humans could distinguish the faces of virtual agents, created using state-of-the-art methods in computer graphics and machine learning, from photographs of human faces. We also wanted to understand which facial features drive the perception of human-likeness. The results showed that humans were able to tell the human and virtual agent images apart and that two facial features were important for this: the appearance of the skin and the appearance of the eyes. If the skin was smooth and the eyes lacked corneal reflections (the white highlights in the eyes), the face was identified as agent-like rather than human-like.
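To make the discrimination result concrete, here is a minimal signal-detection sketch in Python. Both the counts and the signal-detection framing are illustrative assumptions, not the dissertation's actual analysis:

```python
from statistics import NormalDist

# Hypothetical counts from a "human or agent?" judgment task.
# Human photographs are treated as signal trials, agent renders as noise trials.
hits, misses = 88, 12               # human faces correctly called "human"
false_alarms, rejections = 30, 70   # agent faces incorrectly called "human"

hit_rate = hits / (hits + misses)
fa_rate = false_alarms / (false_alarms + rejections)

# Sensitivity d' = z(hit rate) - z(false-alarm rate); values above zero
# mean observers can tell the two kinds of faces apart.
z = NormalDist().inv_cdf
d_prime = z(hit_rate) - z(fa_rate)
print(f"d' = {d_prime:.2f}")  # ~1.70 with these made-up counts
```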

Memory of Virtual vs. Human Faces

We then examined memory for virtual and human faces, motivated in part by previous research suggesting that virtual faces are remembered less well than natural human faces. We drew on the two previously identified features in the skin and the eyes. The results of a memory study showed that when the eyes lacked corneal reflections, the face was more difficult to remember. This adds to a more fine-grained understanding of how specific facial details affect face processing, not only in perception but also in memory.

We then used the data from the perception and memory experiments to investigate the time course of processing different faces with a novel statistical technique. Applied to response time data, the technique made it possible to see how the skin and eye features affected people's responses over time, even though the measure was behavioral. The results showed that skin appearance is influential early in processing for face perception, whereas for face memory it is the appearance of the eyes that is crucial early on.
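The specific technique is not spelled out in this summary, but the idea of reading a time course out of response times can be illustrated with a simple divergence analysis on cumulative response curves. Everything below, including the data, is a hedged stand-in:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical response times (ms) for two conditions, e.g. faces with
# vs. without corneal reflections. Real data would come from the experiments.
rt_reflections = rng.gamma(shape=9.0, scale=70.0, size=500)     # faster on average
rt_no_reflections = rng.gamma(shape=9.0, scale=80.0, size=500)  # slower on average

time_grid = np.arange(0, 1500, 10)  # 0-1500 ms in 10 ms steps

def cumulative_response_curve(rts, grid):
    """Proportion of trials answered by each time point (an empirical CDF)."""
    return np.array([(rts <= t).mean() for t in grid])

curve_a = cumulative_response_curve(rt_reflections, time_grid)
curve_b = cumulative_response_curve(rt_no_reflections, time_grid)

# The earliest time at which the curves differ by a meaningful margin gives
# a rough estimate of when the manipulated feature starts to affect responses.
divergence = time_grid[np.argmax(np.abs(curve_a - curve_b) > 0.05)]
print(f"Curves diverge around {divergence} ms")
```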

The Uncanny Valley and Brain Studies

We then broadened the scope and examined the perception of virtual agents by reviewing studies that pair the Uncanny Valley (UV) with brain measures. The UV is by far the most influential theory of how humans respond to artificial entities. It states that people's liking for an agent increases as its outward appearance becomes more human-like, but only up to a point: when an agent is almost, but not quite, human-like, liking drops and people experience negative feelings toward it. While the UV has primarily been studied through subjective perceptions collected in surveys, in recent years researchers have started to use neuroscientific techniques to measure the brain's reaction to different artificial beings, both robots and virtual agents. We systematically collected the studies in this emerging line of work that combines the UV with brain measures. The key finding, which ties into our work on how humans perceive and remember virtual agent faces, was that at the level of the brain, the areas responsible for processing faces are less active for faces that are not sufficiently human-like than for real faces.

We also investigated how the brain processes virtual agent and real faces over time, comparing early and late stages of processing. Previous work in the field found that the processing of virtual agent and real faces differs specifically at a later stage, from 400 ms onwards. However, the results have been mixed regarding the direction of this effect: some studies found that the brain reacted more strongly to real faces at a late stage, while others reported that virtual agent faces elicited the stronger response. Our results showed that the brain response to real and virtual faces at the early stage of processing (140–200 ms) was virtually the same, whereas at a later stage (400–600 ms) the brain reacted more strongly to real human faces than to agent faces. This effect was small, however, suggesting that high-quality virtual agent faces are processed similarly to human faces even at a later stage.
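As an illustration of this kind of window-based comparison, the sketch below averages amplitude in the early (140–200 ms) and late (400–600 ms) windows for two conditions. The arrays are random placeholders; a real analysis would use recorded EEG epochs and proper statistics:

```python
import numpy as np

rng = np.random.default_rng(0)
sfreq = 1000  # Hz; one sample per millisecond, epoch starts at stimulus onset

# Hypothetical epochs: (trials, channels, samples). Placeholder noise stands
# in for recorded EEG; a real analysis would load actual data here.
epochs_real = rng.normal(size=(80, 32, 700))
epochs_agent = rng.normal(size=(80, 32, 700))

def window_mean(epochs, start_ms, end_ms):
    """Mean amplitude across trials and channels within a time window."""
    start = int(start_ms * sfreq / 1000)
    end = int(end_ms * sfreq / 1000)
    return epochs[:, :, start:end].mean()

for label, (start, end) in {"early": (140, 200), "late": (400, 600)}.items():
    diff = window_mean(epochs_real, start, end) - window_mean(epochs_agent, start, end)
    print(f"{label} window ({start}-{end} ms): real minus agent = {diff:+.4f}")
```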

Dynamic Faces and Facial Expressions

Finally, we studied dynamic faces, specifically the generation–perception link of facial expressions in human and virtual agent faces. Producing naturalistic facial expressions for agents has been a difficult task, and there is no standard method for doing so. Recently, sophisticated machine learning models trained on large quantities of data have been used to generate facial expressions for agents. We were interested in whether a much simpler model could produce facial expressions from human facial expression data that remain understandable to humans. We proposed a model that generates new facial expressions by rearranging the order of facial states drawn from a set of existing expressions, and we tested this model in experiments with human and virtual agent faces. The results showed that it is indeed possible to use a small amount of data to produce naturalistic facial expressions that can be applied to virtual agents. Moreover, we found that specific Action Units, activations of facial muscle groups, could predict whether observers assigned an expression to a specific category, such as happiness or surprise.
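A minimal sketch of the rearrangement idea follows, under the assumption that an expression can be represented as a per-frame vector of Action Unit intensities. The representation and segment length are illustrative, not the dissertation's exact design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical recorded expression: frames x Action Units, each entry an AU
# intensity. Real input would come from recordings of human facial expressions.
recorded = rng.uniform(0.0, 5.0, size=(60, 17))

def rearrange_expression(frames, segment_len=10):
    """Cut an existing AU sequence into short segments and reorder them,
    producing a new expression built only from observed facial states."""
    segments = [frames[i:i + segment_len] for i in range(0, len(frames), segment_len)]
    order = rng.permutation(len(segments))
    return np.concatenate([segments[i] for i in order])

new_expression = rearrange_expression(recorded)
print(new_expression.shape)  # (60, 17): same facial states, new temporal order
```

The appeal of this kind of model is that every generated frame is a facial state a human actually produced, so no large training set is needed for the output to stay naturalistic.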

Conclusion

Overall, this dissertation advanced novel findings on the perception of high-fidelity, human-like virtual agents. These findings are directly applicable to developing future virtual agents, and the methods employed in this research may help advance work on evaluating virtual agent technology.