Emotion Expression Humanoid Robot
WE-4RII (Waseda Eye No.4 Refined II)

  1. Objective
  2. Hardware Overview
  3. Emotional Expression
  4. Mental Modeling
  5. Demonstration Videos
  6. Previous Studies
  7. Acknowledgment

1. Objective

We have been developing Emotion Expression Humanoid Robots since 1995 in order to create new mechanisms and functions for a humanoid robot that can communicate naturally with humans by expressing human-like emotions.

In 2003, 9-DOFs Emotion Expression Humanoid Arms were developed to improve the emotional expression. The arms were integrated with WE-4 (Waseda Eye No.4) to develop the Emotion Expression Humanoid Robot WE-4R (Waseda Eye No.4 Refined), which could express its emotions by using its facial expressions, torso, and arms. In 2004, we developed WE-4RII (Waseda Eye No.4 Refined II) by integrating the anthropomorphic robot hand RCH-1 (RoboCasa Hand No. 1) into WE-4R. RCH-1 has 6-DOFs and the abilities of emotion expression, grasping, and tactile sensing.

2. Hardware Overview

Fig. 1 and Fig. 2 present the hardware overview of the Emotion Expression Humanoid Robot WE-4RII. It has 59-DOFs (Hands: 12, Arms: 18, Waist: 2, Neck: 4, Eyeballs: 3, Eyelids: 6, Eyebrows: 8, Lips: 4, Jaw: 1, Lungs: 1) and numerous sensors that serve as sense organs (visual, auditory, cutaneous, and olfactory sensation) for extrinsic stimuli. Each part is described below.

Fig. 1 WE-4RII (Whole View)

Fig. 2 WE-4RII (Head Part)

2.1 Eyeballs and Eyelids

The eyeballs have 1-DOF for the pitch axis and 2-DOF for the yaw axis. The maximum angular velocity of the eyeballs is similar to a human's, at 600[deg/s]. The eyelids have 6-DOF. WE-4RII can rotate its upper eyelids, which allows it to express emotion with the corners of its eyes. The maximum angular velocity for opening and closing the eyelids is similar to a human's, at 900[deg/s]. Furthermore, the robot can blink within 0.3[s], which is as fast as a human.

For miniaturization of the head part, we newly developed an Eye Unit that integrates the eyeball and eyelid parts. Moreover, in the Eye Unit of WE-4RII, the eyeball pitch-axis motion is mechanically synchronized with the opening and closing motion of the upper eyelids, so coordinated eyeball-eyelid motion is controlled in hardware.

Fig. 3 Eye Unit

2.2 Neck

WE-4RII's neck has 4-DOF: the upper pitch, the lower pitch, the roll, and the yaw axes. WE-4RII can stretch out and pull in its neck like a human by using the upper and lower pitch axes. The maximum angular velocity of each axis is similar to a human's, at 160[deg/s].

2.3 Arms

WE-4RII has 9-DOFs Emotion Expression Humanoid Arms. Each arm consists of a base shoulder part (pitch and yaw axes), a shoulder part (pitch, yaw, and roll axes), an elbow part (pitch axis), and a wrist part (pitch, yaw, and roll axes). By using the 2-DOFs of the base shoulder, the arm can move the whole shoulder up and down and back and forth. This enables WE-4RII to make such movements as squaring its shoulders when angry or shrugging its shoulders when sad. Therefore, WE-4RII can express its emotions effectively by using its arms.

Fig. 4 9-DOFs Arm

2.4 Hands

The hand part was developed at ROBOCASA, the joint laboratory for research on humanoid & personal robotics. The anthropomorphic robot hand RCH-1 (RoboCasa Hand No.1) was designed and developed for WE-4RII at the ARTS Lab of SSSA (Scuola Superiore Sant'Anna) in Italy.

Fig. 5 RCH-1

(1) Finger Mechanism

A tendon drive mechanism is used to drive each finger, as shown in Fig. 6. A wire runs from the fingertip to the root of the mechanism, taking one turn around each pulley. Because the pulleys can rotate freely with respect to one another, the finger automatically wraps around an object when the actuator pulls the wire, as shown in Fig. 7. This mechanism lets the hand conform softly and gently to objects of any shape without complicated control.

Fig. 6 Finger Mechanism

Fig. 7 Grasping Mechanism

(2) Abduction/Adduction Mechanism

To grasp a spherical object, humans abduct the thumb so that it opposes the little finger, and for a cylindrical grasp we adduct the thumb so that it opposes the middle finger. This abduction/adduction mechanism makes it possible to grasp objects of various shapes effectively.

(3) Tactile Sensor

RCH-1 has three kinds of tactile sensors: a distributed on/off contact sensor, FSRs, and 3D force sensors. The distributed on/off contact sensor is a switch made of thin sheets, arranged at 16 places: 15 on the inner surfaces of the fingers and 1 on the palm, as shown in Fig. 8. The FSR is the same type used in the head part of WE-4; by using a two-layered sheet, the hand can recognize not only the magnitude of a force but also the manner of touching, namely "Push", "Stroke", and "Hit". A 3D force sensor that measures the fingertip force is implemented in the fingertips of the thumb and index finger.

Fig. 8 RCH-1 Tactile Sensor

2.5 Trunk

WE-4RII has a 2-DOF waist composed of pitch and yaw axes. By using the waist motion, WE-4RII can produce emotional expression not only with its neck but also with the upper half of its body.

2.6 Facial Expression Mechanisms

WE-4RII expresses its facial expression using its eyebrows, lips, jaw, facial color and voice. The eyebrows consist of flexible sponges, and each eyebrow has 4-DOF.

We used spindle-shaped springs for WE-4RII's lips. The lips change their shape by pulling from 4 directions, and WE-4RII's jaw that has 1-DOF opens and closes the lips.

For facial color, we used red and blue EL (Electroluminescence) sheets applied to the cheeks. WE-4RII can express red and pale facial colors.

For the voice system, we used a small speaker that was set in the jaw. The robot voice is a synthetic voice made by LaLaVoice 2001 (TOSHIBA Corporation).

Fig. 9 Facial Expression Mechanism

2.7 Sensors

(1) Visual Sensation

WE-4RII has two color CCD cameras in its eyes. The images from the cameras are captured to a PC by an image capture board. WE-4RII can recognize any color as a target and can recognize up to eight targets at the same time. After calculating the center of gravity and area of the targets, WE-4RII can follow them with its eyes, neck, and waist. This makes it possible to follow a target of any color in three-dimensional space.

Fig. 10 Visual Sensor
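As an aside, the following is a minimal sketch of this kind of color-based target extraction, assuming an OpenCV pipeline and an arbitrary HSV color range; it is not WE-4RII's actual vision software. The target's center of gravity and area are computed from a binary color mask, and the offset of the center of gravity from the image center would then drive the eye, neck, and waist motion.

```python
# Minimal color-target tracking sketch (illustrative only, not WE-4RII's code).
# Assumes OpenCV is available and an arbitrary HSV range for the target color.
import cv2
import numpy as np

TARGET_HSV_LOW = np.array([100, 120, 80])    # assumed lower HSV bound (blue-ish)
TARGET_HSV_HIGH = np.array([130, 255, 255])  # assumed upper HSV bound

def locate_target(frame_bgr):
    """Return (cx, cy, area) of the target-colored region, or None if absent."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, TARGET_HSV_LOW, TARGET_HSV_HIGH)
    m = cv2.moments(mask, binaryImage=True)
    if m["m00"] == 0:            # no pixels of the target color found
        return None
    cx = m["m10"] / m["m00"]     # center of gravity (x)
    cy = m["m01"] / m["m00"]     # center of gravity (y)
    return cx, cy, m["m00"]      # m00 of a binary mask equals the pixel area

cap = cv2.VideoCapture(0)        # assumed camera index
ok, frame = cap.read()
if ok:
    print(locate_target(frame))
cap.release()
```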

(2) Auditory Sensation

We used two small condenser microphones for the auditory sensation. WE-4RII can localize the direction of a sound from the difference in loudness between the right and left microphones.
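To make the idea concrete, here is a minimal sketch of direction estimation from the loudness difference between the two channels; the RMS measure, the dB formulation, and the threshold are assumptions for illustration, not the robot's actual method.

```python
# Hedged sketch of sound-direction estimation from left/right loudness.
import numpy as np

def rms(signal):
    """Root-mean-square loudness of one microphone buffer."""
    return float(np.sqrt(np.mean(np.square(signal, dtype=np.float64))))

def sound_direction(left, right, threshold_db=3.0):
    """Return 'left', 'right' or 'front' from two microphone buffers."""
    eps = 1e-12
    level_diff_db = 20.0 * np.log10((rms(left) + eps) / (rms(right) + eps))
    if level_diff_db > threshold_db:
        return "left"        # left channel louder -> source on the left
    if level_diff_db < -threshold_db:
        return "right"
    return "front"           # nearly equal loudness -> roughly in front
```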

(3) Cutaneous Sensation

WE-4RII has tactile and temperature sensations among the human cutaneous sensations. We used FSRs (Force Sensing Resistors) for the tactile sensation. The FSR can detect even very weak forces and is a thin, light device. We devised a method for recognizing not only the magnitude of a force but also the manner of touching, namely "Push", "Stroke", and "Hit", by using a two-layer structure of FSRs. For temperature sensation, WE-4RII has a thermistor. FSRs are also installed on the palms to detect whether they have been touched.
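The page does not describe the classification rule itself, so the sketch below only illustrates one plausible way to separate "Push", "Stroke", and "Hit" from a time series of FSR readings (a short, strong contact as a Hit; a sustained contact with little movement as a Push; a sustained contact that travels across the sheet as a Stroke). The rule and thresholds are assumptions, not WE-4RII's method.

```python
# Hedged sketch of classifying touch manner from FSR readings over time.
def classify_touch(samples, dt=0.01,
                   hit_max_duration=0.15, force_threshold=0.2,
                   stroke_min_travel=2):
    """samples: list of (force, position_index) per time step while touched."""
    if not samples:
        return None
    duration = len(samples) * dt
    peak_force = max(f for f, _ in samples)
    positions = [p for _, p in samples]
    travel = max(positions) - min(positions)   # movement across sensor cells
    if peak_force < force_threshold:
        return None                            # too weak to count as contact
    if duration <= hit_max_duration:
        return "Hit"
    if travel >= stroke_min_travel:
        return "Stroke"
    return "Push"
```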

(4) Olfactory Sensation

We used four semiconductor gas sensors for the olfactory sensation and set them in WE-4RII's nose. WE-4RII can recognize the smells of alcohol, ammonia, and cigarette smoke.

Fig. 11 Olfactory Sensor

2.8 System Configuration

Fig. 12 shows the total system configuration of WE-4RII. We use three personal computers (PC/AT compatible) connected to each other by Ethernet.

PC1 (Pentium 4, 2.66[GHz], OS: Windows XP) acquires and analyzes the output signals from the olfactory and cutaneous sensors through a 12-bit A/D acquisition board, and analyzes the sounds from the microphones through a sound board. It determines the mental state from this stimulus information, the hand sensing data received from PC2, and the visual information received from PC3. Moreover, it controls all DC motors except those of the hands and, at the same time, sends the data for controlling the hands to PC2. PC2 (Pentium III, 1.0[GHz], OS: Windows 2000) acquires and analyzes the RCH-1 sensing data through a 12-bit A/D acquisition board and a digital I/O board, sends the analyzed data to PC1, and controls the finger positions of RCH-1 based on the data sent by PC1. PC3 (Pentium 4, 3.0[GHz], OS: Windows XP) captures the images from the CCD cameras, calculates the center of gravity and brightness of the target, and sends them to PC1.

Fig. 12 System Configuration
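For illustration only, the following sketch shows one possible way such data could be exchanged between the PCs over Ethernet, using a simple UDP datagram from PC3 to PC1. The address, port, and message layout are assumptions; the actual protocol used between the PCs is not described here.

```python
# Hedged sketch of one inter-PC data path (PC3 -> PC1) over Ethernet.
import socket
import struct

PC1_ADDRESS = ("192.168.0.1", 5000)    # assumed address and port of PC1

def send_target_data(cx, cy, brightness):
    """PC3 side: send the target's center of gravity and brightness to PC1."""
    payload = struct.pack("<3f", cx, cy, brightness)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, PC1_ADDRESS)

def receive_target_data(port=5000):
    """PC1 side: receive and unpack one target message."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", port))
        payload, _ = sock.recvfrom(1024)
        return struct.unpack("<3f", payload)
```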

3. Emotional Expressions

We use Ekman's Six Basic Facial Expressions in the robot's facial control, and have defined seven facial patterns for the "Happiness", "Anger", "Disgust", "Fear", "Sadness", "Surprise", and "Neutral" emotional expressions. The strength of each emotional expression can be varied in fifty grades by proportional interpolation of the differences in position from the "Neutral" pattern. The speed of the arm movement also changes according to the robot's emotion, so the emotion is expressed by both the posture and the speed of the arms. WE-4RII has the emotional expression patterns shown in Fig. 13.

(a) Happiness  (b) Fear  (c) Surprise  (d) Sadness  (e) Anger  (f) Disgust  (g) Neutral  (h) New Patterns

Fig. 13 WE-4RII Emotion Expression
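To illustrate the fifty-grade interpolation described above, the following sketch blends a target expression pattern with the "Neutral" pattern in proportion to the requested strength. The joint names and angle values are made-up examples, not WE-4RII's actual parameters.

```python
# Sketch of fifty-grade proportional interpolation from the "Neutral" pattern
# toward a target expression pattern (illustrative values only).
import numpy as np

NEUTRAL = np.array([0.0, 0.0, 0.0, 0.0])            # e.g. brow, lid, lip, jaw
PATTERNS = {
    "Happiness": np.array([10.0, -5.0, 8.0, 6.0]),   # assumed example values
    "Anger":     np.array([-12.0, 4.0, -6.0, 2.0]),
}

def expression_pose(emotion, strength):
    """strength is an integer grade from 0 (Neutral) to 50 (full expression)."""
    strength = max(0, min(50, strength))
    target = PATTERNS[emotion]
    return NEUTRAL + (strength / 50.0) * (target - NEUTRAL)

print(expression_pose("Happiness", 25))   # halfway toward full Happiness
```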

4. Mental Modeling

4.1 Approach

The Mental Dynamics, i.e. the mental transition caused by the internal and external environment of the robot, is extremely important for emotional expression. Therefore, in constructing the mental model, we assumed that the human brain has a three-layered structure consisting of reflex, emotion, and intelligence, and we approach the mental model starting from the reflex layer. We also divided the emotion into a "Learning System", "Mood", and "Dynamic Response" according to their working durations.

Moreover, in order to realize bilateral interaction between human and robot, we based our research on A. H. Maslow's hierarchy of needs and introduced a Need Model consisting of "Appetite", the "Need for Security", and the "Need for Exploration". Consequently, the robot can behave according to its needs.

Fig. 14 Brain Dynamics

4.2 Information Flow

WE-4RII changes its mental state according to external and internal stimuli, and expresses its emotion using facial expressions, facial color, and body movement. We introduced the information flow shown in Fig. 15 into the robot. There are two main flows: one is caused by the external environment, and the other by the robot's internal state. Furthermore, we introduced the Robot Personality, because each human has a different personality. The Robot Personality consists of the Sensing Personality and the Expression Personality. The need and the emotion form a two-layered structure, with the need in a lower layer than the emotion, because we consider the need to be nearer to instinct than the emotion. Furthermore, the need and the emotion affect each other through the Sensing Personality.

Fig. 15 Information Flow of the Mental Modeling

4.3 Personality and Learning System

The Robot Personality consists of the Sensing Personality and the Expression Personality. The former determines how a stimulus affects the mental state, and the latter determines how the robot expresses its emotion. We can easily assign these personalities, so it is possible to obtain a wide variety of Robot Personalities. Moreover, we introduced the "Learning System" so that the robot can learn from its experiences and dynamically construct its personality based on them.
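As a rough illustration of the idea, a Sensing Personality can be thought of as a mapping from stimulus channels onto the mental-state axes, so that assigning a different mapping yields a different personality. The matrix form and the numbers below are assumptions for illustration, not the published model.

```python
# Hedged sketch of a Sensing Personality as a simple gain matrix.
import numpy as np

# Rows: pleasantness, activation, certainty; columns: stimulus channels
# (e.g. visual, sound, touch). A "timid" personality weights touch strongly
# negative on pleasantness, an "easygoing" one only weakly (assumed values).
SENSING_TIMID     = np.array([[ 0.2, -0.5, -0.9],
                              [ 0.4,  0.8,  0.9],
                              [ 0.3, -0.2, -0.4]])
SENSING_EASYGOING = np.array([[ 0.4, -0.1, -0.2],
                              [ 0.2,  0.3,  0.3],
                              [ 0.5,  0.1,  0.0]])

def mental_influence(sensing_personality, stimulus):
    """Map a stimulus vector into an influence on the 3D mental state."""
    return sensing_personality @ stimulus

stimulus = np.array([0.1, 0.0, 1.0])     # e.g. a strong touch
print(mental_influence(SENSING_TIMID, stimulus))
print(mental_influence(SENSING_EASYGOING, stimulus))
```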

4.4 Emotion Vector and Mood Vector

We adopted the 3D mental space shown in Fig. 16, which consists of a pleasantness axis, an activation axis, and a certainty axis. The vector E, named the "Emotion Vector", expresses the mental state of WE-4. Furthermore, we newly introduced the "Mood Vector" M, which consists of a pleasantness component and an activation component.

The pleasantness component of the Mood Vector changes with the current mental state. In order to describe the activation component of the Mood Vector, we introduced an internal clock, a kind of autonomic nervous system.

4.5 Equations of Emotion

The Emotion Vector E is described by the Equations of Emotion when the robot senses stimuli. We considered that the mental dynamics, i.e. the transition of a human mental state, might be expressed by equations similar to the equation of motion. Therefore, we expanded the Equations of Emotion into second-order differential equations modeled on the equation of motion. As a result, the robot can express the transient mental state after it senses stimuli from the environment, and we can obtain complex and varied mental trajectories.
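The page does not give the equations themselves, so the following is only a hedged numerical sketch of what a second-order, equation-of-motion-style update of the Emotion Vector could look like. The coefficients, the stimulus term, and the explicit Euler integration are assumptions; the actual Equations of Emotion are defined in the WE-4 papers.

```python
# Hedged sketch of second-order emotion dynamics in the spirit of an equation
# of motion:  M*E'' + D*E' + K*E = F(stimulus).
import numpy as np

M, D, K = 1.0, 0.8, 2.0        # assumed "mass", damping, and stiffness
dt = 0.05                      # integration step [s]

def step(E, E_dot, F):
    """One Euler step of the mental state E (pleasantness, activation, certainty)."""
    E_ddot = (F - D * E_dot - K * E) / M
    E_dot = E_dot + E_ddot * dt
    E = E + E_dot * dt
    return E, E_dot

E = np.zeros(3)                # start at the Neutral origin of the mental space
E_dot = np.zeros(3)
F = np.array([1.0, 0.5, 0.2])  # a constant pleasant, mildly arousing stimulus
for _ in range(100):           # 5 seconds of transient response
    E, E_dot = step(E, E_dot, F)
print(E)                       # approaches the static response F/K
```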

Finally, we mapped 7 emotions onto the 3D mental space as shown in Fig. 17. WE-4 determines its emotion from the region through which the Emotion Vector passes.

Fig. 16 Mental Space

Fig. 17 Emotional Mapping

4.6 Need Model

Bilateral interaction is important for natural communication between humans and robots. We considered that active behavior on the robot's side is necessary to realize bilateral interaction, so we introduced the Need Model into the robot's mental model. The need state of the robot is described by the matrix N, named the "Need Matrix", which is updated by a first-order difference equation. Although the robot's needs consist of "Appetite", the "Need for Security", and the "Need for Exploration" in this study, the Need Matrix can be expanded according to the number of need factors.
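As an illustration of such a first-order difference equation over three need factors, the sketch below updates an assumed need vector from an assumed stimulus vector. The coefficients and the stimulus encoding are assumptions for illustration, not the published model.

```python
# Hedged sketch of a Need update of the form  N[k+1] = A*N[k] + B*s[k].
import numpy as np

NEEDS = ["Appetite", "Need for Security", "Need for Exploration"]
A = np.diag([0.98, 0.95, 0.90])   # assumed persistence of each need
B = np.diag([0.05, 0.20, 0.10])   # assumed gain from the matching stimulus

def update_needs(N, stimulus):
    """One discrete update; stimulus holds (energy use, danger, novelty)."""
    return A @ N + B @ stimulus

N = np.zeros(3)
for _ in range(50):                        # repeated exposure to a novel object
    N = update_needs(N, np.array([0.2, 0.0, 1.0]))
print(dict(zip(NEEDS, np.round(N, 2))))    # exploration need grows the fastest
```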

(1) Appetite

The appetite is based on the total consumed energy, which is described as the sum of the basal metabolic energy and the output energy. We considered that the metabolic energy is determined by the robot's emotional state, and that the output energy of the robot is determined by internal or external stimuli such as the total electric current.

(2) Need for Security

The need for security is a type of defense behavior. The defense reflex of withdrawing from strong stimuli is a similar reaction; however, the need for security generates defense behavior against long-term stimuli. When the robot senses dangerous stimuli from the environment for a long period, it can withdraw from them or express defense behavior even if the stimuli are too weak to trigger the defense reflex. We realized the Need for Security by having the robot learn the position and strength of the stimuli it senses from the environment.

(3) Need for Exploration

When humans and animals encounter a new situation or a new object, their need for exploration becomes high and they express exploratory behavior out of curiosity. We realized the need for exploration by learning the relation between the visual information and the target's properties.

(4) Behavior by Need

The robot can actively generate and express behavior based on its needs in order to satisfy them, and it continues the same behavior until the need is satisfied as a result of that active behavior. We also considered the need to be one of the internal stimuli to the robot: by assigning a Sensing Personality for the need, the need affects the mental state.
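The following sketch illustrates, under assumed behavior names and thresholds, how this kind of need-driven behavior selection with persistence could work: the behavior tied to the largest unmet need is chosen and kept until that need falls back below a satisfaction level.

```python
# Hedged sketch of need-driven behavior selection with hysteresis.
BEHAVIOR_FOR_NEED = {
    "Appetite": "ask for recharge",
    "Need for Security": "withdraw from stimulus",
    "Need for Exploration": "approach and inspect object",
}
ACT_THRESHOLD, SATISFIED_LEVEL = 0.6, 0.2   # assumed hysteresis bounds

current = None          # behavior being carried out right now

def select_behavior(needs):
    """needs: dict of need name -> current value in [0, 1]."""
    global current
    if current is not None:
        # keep the same behavior until its need drops low enough
        active_need = next(n for n, b in BEHAVIOR_FOR_NEED.items() if b == current)
        if needs[active_need] > SATISFIED_LEVEL:
            return current
        current = None
    name, value = max(needs.items(), key=lambda kv: kv[1])
    if value > ACT_THRESHOLD:
        current = BEHAVIOR_FOR_NEED[name]
    return current

print(select_behavior({"Appetite": 0.1, "Need for Security": 0.0,
                       "Need for Exploration": 0.8}))
```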

4.7 Memory Model

Human memory is related to mood through mood state-dependency and mood congruency. Humans retrieve a memory most easily in the same mood in which the memory was stored; this is mood state-dependency. On the other hand, the mood helps in retrieving memories that correspond to the mood itself, which is known as mood congruency: humans tend to retrieve pleasant memories when they are pleasant and, conversely, unpleasant memories when they are unpleasant. Moreover, human performance is related to the activation level. If the activation level becomes too high or too low, active performance becomes impossible; the best performance comes at a medium activation level. We developed an encoding model using a self-organizing map and a retrieval model using chaotic neural networks. The robot can recognize a stimulus according to its mood, activation level, and appetite.
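As a toy illustration of mood-dependent encoding with a self-organizing map (the retrieval side with chaotic neural networks is not sketched here), the example below stores a stimulus together with the mood it was experienced in, so that a later query in a similar mood lands on the same map node, in the spirit of mood state-dependency. The map size, learning rate, and encoding are assumptions.

```python
# Hedged sketch of mood-dependent memory encoding with a tiny SOM.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(5, 5, 4))   # 5x5 map, input = stimulus(2) + mood(2)

def best_node(x):
    """Return the (row, col) of the map node closest to the input x."""
    d = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(np.argmin(d), d.shape)

def train(x, lr=0.3, radius=1.0):
    """One SOM update: pull the winner and its neighbours toward the input."""
    bi, bj = best_node(x)
    for i in range(5):
        for j in range(5):
            h = np.exp(-((i - bi) ** 2 + (j - bj) ** 2) / (2 * radius ** 2))
            weights[i, j] += lr * h * (x - weights[i, j])

# encode a stimulus experienced in a pleasant, calm mood
train(np.array([0.9, 0.1, 0.8, -0.2]))   # [stimulus..., pleasantness, activation]
print(best_node(np.array([0.9, 0.1, 0.8, -0.2])))   # queried in the same mood
```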

5. Demonstration Videos

Click the following pictures to see the demonstration videos.

Emotion Facial Expressions (34 [sec], MPEG format, 5.76 [MB]): The robot expresses the 7 basic emotions using facial expressions.

Emotion Expression by Upper-half Body (57 [sec], MPEG format, 9.67 [MB]): The robot expresses emotions using facial expressions and body, arm, and hand motion.

4 Sensations (72 [sec], MPEG format, 12.0 [MB]): The robot reacts to stimuli.

Various Behaviors (60 [sec], MPEG format, 10.0 [MB]): The robot shows human-like motions; for example, it can do exercises with a dumbbell.

Consciousness (41 [sec], MPEG format, 6.89 [MB]): The robot reacts to the stimulus with the highest consciousness.

If you cannot watch the movies, please save the MPEG files to your computer first.

6. Previous Studies

WE-4R
WE-4R (2003)

We realized more effective emotion expression by using the 9-DOFs emotion expression humanoid arms, which can move the whole shoulder to square the shoulders when angry or shrug them when sad. Moreover, by introducing the needs, WE-4R became able to output active behavior from the robot's side.

Detailed information of WE-4R

WE-4
WE-4 (2002)

We realized miniaturization of the robot system by developing the Eye Unit, which integrates the eye and eyelid mechanisms. We also improved the eyebrow mechanism and added the 2-DOF waist. WE-4 can follow a visual target with its upper body and express its emotion. Moreover, WE-4 can determine its mental state based on the mental model with the Learning System, the Mood, and the second-order Equations of Emotion. The emotional expression by WE-4 was improved.

Detailed information of WE-4

WE-3RV
WE-3RV (2001)

We newly added an eyelid system, a red and pale facial color expression function, a voice system, and skin with a human-like feel in order to improve the emotional expression function. We also introduced the robot personality. WE-3RV can express various robot personalities in response to external stimuli.

Detailed information of WE-3RV

WE-3RIV
WE-3RIV (2000)

We added olfactory sensation and a red facial color expression function to WE-3RIV to improve both its input and output functions. Moreover, we introduced the Equations of Emotion to WE-3RIV, so that it can express more human-like emotional behavior than the previous robots.

WE-3RIII
WE-3RIII (1999)

We added the auditory sensation and the cutaneous sensation, which consists of tactile and temperature sensations, to WE-3RIII. In addition, we improved and rebuilt the psychological model. WE-3RIII can express emotions caused by external stimuli.

WE-3RII
WE-3RII (1998)

By adding the eyebrows, lips, and jaw to the basic mechanism of WE-3R, WE-3RII was further developed so that more human-like expressions could be produced. At the same time, we introduced a facial expression control method for WE-3RII based on the three independent parameters of a simple psychological model.

WE-3R
WE-3R (1997)

WE-3R realized the human-like function of adjusting to the brightness of an object by adding eyelids to WE-3, which was developed in 1996. In addition, by adding ears, WE-3R can detect the direction of a sound and follow the sound source.

WE-3
WE-3 (1996)

WE-3 realized target-following motion in 3D space using coordinated head-eye motion with the V.O.R. (Vestibulo-Ocular Reflex), as well as following motion in the depth direction using the angle of convergence between the two eyes.

WE-2
WE-2 (1995)

WE-2 has head rotation functions consisting of a neck and eyes, with 4-DOF in total. We realized coordinated head-eye motion using the V.O.R. (Vestibulo-Ocular Reflex).

All activity

7. Acknowledgment

Part of this research was conducted at the Humanoid Robotics Institute (HRI), Waseda University. We would like to thank the Italian Ministry of Foreign Affairs, General Direction for Cultural Promotion and Cooperation, for its support of the establishment of the ROBOCASA laboratory and for the realization of the two artificial hands. Part of this work was also supported by a Grant-in-Aid for the WABOT-HOUSE Project by Gifu Prefecture. Finally, we would like to express our thanks to the ARTS Lab, NTT DoCoMo, SolidWorks Corp., the Advanced Research Institute for Science and Engineering of Waseda University, and Prof. Hiroshi Kimura for their support of our research.

Last Update: 2006-11-01
Copyright(C) 2004-2006 Head Robot Team/Takanishi Laboratory
All Rights Reserved.