Non-Verbal Gestures For Humanoid Robots

The goal of my Bachelor’s thesis was to enable more varied communication with humanoid robots. To this end, I developed a method for dynamically expressing emotions and intentions through non-verbal gestures.

Tools

  • Python
  • Aldebaran Choregraphe

Expression Categories

Observing how gestures are used in everyday communication suggested that many of them serve specific but often similar purposes. This led me to the idea of considering whole categories of articulation rather than single gestures.

Therefore, I classified more than 50 commonly used gestures by their expressive meaning and compared them based on their basic expressivity. This led to the definition of six expression categories: Positive / Acceptance, Negative / Denial, Incomprehension, Neutral / Waiting, Introduction / Greeting, and Ending / Goodbye of a conversation.
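
Just to illustrate the outcome, the six categories could be modelled as a simple Python enum; the identifiers below are my own naming, not taken from the thesis implementation:

    from enum import Enum

    class ExpressionCategory(Enum):
        """The six expression categories identified in the thesis."""
        POSITIVE_ACCEPTANCE = "positive / acceptance"
        NEGATIVE_DENIAL = "negative / denial"
        INCOMPREHENSION = "incomprehension"
        NEUTRAL_WAITING = "neutral / waiting"
        INTRODUCTION_GREETING = "introduction / greeting"
        ENDING_GOODBYE = "ending / goodbye"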

Expressivity

I decided to build the concept around a measure for expressivity that ranges from 0 to 1 for every expression category.

To increase expressivity, it is not enough to simply intensify a specific gesture. A timid wave, for example, gains only limited power from being repeated more often and performed faster, and simple exaggerations may be perceived as unnatural or even preposterous. To amplify the expressivity, one might also use different gestures that convey the same meaning in a more expressive way. In the case of Positive / Acceptance, for example, switching from a simple nod to a high-five or cheering increases the expressivity of the communication.

Below is an example of selected gestures from the expression category Incomprehension. Touching your chin with your hand shows less incomprehension than scratching your head in confusion, or explicitly signalling your lack of understanding by shrugging your shoulders.

The diagram shows the expressivity ranges of four gestures for showing incomprehension.

Besides their order, each gesture covers an individual range on the expressivity spectrum, which is defined by the gesture’s inner parameters.
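
As a rough sketch, such ranges could be stored as a mapping from gesture name to its expressivity interval; the gesture names and values below are illustrative, not the exact numbers from the thesis:

    # Illustrative expressivity ranges for Incomprehension gestures
    # (the actual gestures and values in the thesis differ).
    INCOMPREHENSION_GESTURES = {
        "touch_chin":      (0.0, 0.4),
        "tilt_head":       (0.2, 0.6),
        "scratch_head":    (0.5, 0.9),
        "shrug_shoulders": (0.7, 1.0),
    }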

Selecting A Gesture

The concept and the implemented system allow the user to set only one meta-parameter, the expressivity, to select one or more suitable gestures. For instance, if the user would like to show Incomprehension with an expressivity of 0.8, there are two appropriate options: scratching your head and shrugging. In case of multiple choices, the system has to decide on a gesture either randomly or depending on the actual context, which was not covered in my work.
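
With ranges like the ones sketched above, selecting a gesture boils down to filtering and picking; a minimal, illustrative version in Python could look like this:

    import random

    def select_gesture(expressivity, gesture_ranges):
        """Return a gesture whose expressivity range contains the given value."""
        candidates = [
            name for name, (low, high) in gesture_ranges.items()
            if low <= expressivity <= high
        ]
        if not candidates:
            raise ValueError("No gesture covers expressivity %.2f" % expressivity)
        # Without context information, pick randomly among the candidates.
        return random.choice(candidates)

    # With the illustrative ranges above, expressivity 0.8 matches
    # both "scratch_head" and "shrug_shoulders".
    print(select_gesture(0.8, INCOMPREHENSION_GESTURES))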

Inner Gesture Parameters

In addition to determining a suitable gesture, the expressivity also influences inner gesture parameters such as Frequency, Spatial Extent, Temporal Extent, Acceleration, and Continuity. Going back to the waving example makes this clearer: waving that is faster and takes up more space usually has more expressive power than a very slow and small wave.
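
One simple way to realise this is to interpolate each inner parameter between a gesture’s least and most expressive variant; the linear blend below is my own simplification, not the exact model from the thesis:

    from dataclasses import dataclass

    @dataclass
    class InnerParameters:
        frequency: float        # how often the movement is repeated
        spatial_extent: float   # how much space the movement takes up
        temporal_extent: float  # how long the gesture lasts
        acceleration: float     # how abrupt the movement is
        continuity: float       # how fluid consecutive movements are

    def scale_parameters(low, high, expressivity):
        """Blend inner parameters between the least (low) and
        most (high) expressive variant of a gesture."""
        t = max(0.0, min(1.0, expressivity))
        blend = lambda a, b: a + t * (b - a)
        return InnerParameters(
            frequency=blend(low.frequency, high.frequency),
            spatial_extent=blend(low.spatial_extent, high.spatial_extent),
            temporal_extent=blend(low.temporal_extent, high.temporal_extent),
            acceleration=blend(low.acceleration, high.acceleration),
            continuity=blend(low.continuity, high.continuity),
        )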

I also explained the gesture generation and selection process in the video at the beginning of this article, which I made as part of a research publication.

Implementation

I implemented the system for the humanoid robot Nao by Aldebaran Robotics, now called SoftBank Robotics, using their software Choregraphe. Choregraphe is a graphical programming tool with built-in simulation that allows the user to specify the robot’s behaviour.

To generate a gesture, the user only has to define an expressivity value and one of the aforementioned expression categories. The value is then used to determine one or more suitable gestures. In case of multiple solutions, one is picked randomly.
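
Put together with the earlier sketches, this generation step is a thin wrapper around the category lookup and the selection; again, this is only an illustration of the flow, not the Choregraphe implementation:

    # Hypothetical lookup from category to its gesture ranges.
    GESTURES_BY_CATEGORY = {
        ExpressionCategory.INCOMPREHENSION: INCOMPREHENSION_GESTURES,
        # ... one entry per expression category ...
    }

    def generate_gesture(category, expressivity):
        """Resolve a category and an expressivity value to a concrete gesture."""
        ranges = GESTURES_BY_CATEGORY[category]
        return select_gesture(expressivity, ranges)

    print(generate_gesture(ExpressionCategory.INCOMPREHENSION, 0.8))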

The screenshot shows a window in Choregraphe that lets the user choose an expressivity value and a feedback type.

For each gesture, I defined several expressivity levels, often two or three, determined by the inner gesture parameters. To create the gestures, I used the animation mode, which let me move Nao’s joints to a desired position and store it as a keyframe. The keyframes were then interpolated into a continuous movement. Since I had to create the gestures for all levels by hand, I recorded a total of 53 gestures using this keyframe approach.
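
Outside of Choregraphe, the same keyframe playback could also be scripted with NAOqi’s Python SDK; the sketch below assumes a reachable robot, and the joint angles and timings are made up for illustration (the thesis gestures were recorded in the animation mode instead):

    from naoqi import ALProxy

    # Hypothetical robot address and the default NAOqi port.
    motion = ALProxy("ALMotion", "nao.local", 9559)

    names = ["RShoulderPitch", "RElbowRoll"]
    # One list of keyframe angles (in radians) per joint ...
    angles = [[-1.0, -0.8, -1.0], [1.0, 0.5, 1.0]]
    # ... and the times (in seconds) at which each keyframe is reached.
    times = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]

    # NAOqi interpolates a continuous movement through the keyframes.
    motion.angleInterpolation(names, angles, times, True)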

Lessons Learned And Further Information

To write this article, I’ve looked back at my thesis for the first time in many years. Especially since I now supervise students myself and read their theses, I was very curious to see how I would think about it today. There are obviously some things that I might do differently now, like classifying and rating gestures based on “common knowledge” and using Choregraphe. Choregraphe was great for trying out quick behaviours and gestures, but it was not suitable for a rather complex system that requires some calculation and has a lot of different cases. However, back then I was not as comfortable with programming as I am now, which led me towards Choregraphe instead of pure Python. Overall, I’m still happy with the result of my thesis, and I think I did a good job considering it was my first scientific work, which also resulted in two publications at the ACM/IEEE HRI 2014 conference.

André Viergutz, Tamara Flemisch, and Raimund Dachselt. 2014. Increasing the expressivity of humanoid robots with variable gestural expressions. In Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction (HRI ’14). ACM, New York, NY, USA, 314–315. DOI: https://doi.org/10.1145/2559636.2559840

Tamara Flemisch, André Viergutz, and Raimund Dachselt. 2014. Easy authoring of variable gestural expressions for a humanoid robot. In Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction (HRI ’14). ACM, New York, NY, USA, 328. DOI: https://doi.org/10.1145/2559636.2559786