On the 9th of December, the Police Academy hosted an innovation day where police personnel could learn about current and future technologies of interest to law enforcement. Among these were computer vision applications for surveillance, eagle-drone interaction, and R3D3. Kasper van Zon of VicarVision, Pascale van de Ven from the department of Robotics and Mechatronics, and Jeroen Linssen from the department of Human Media Interaction of the University of Twente gave a demo of the first fully integrated prototype of R3D3.
This prototype consisted of a life-size robot featuring the head of EyePi, previously developed at RAM (a different research group within the University of Twente), a Kinect for computer vision, and a virtual human. Regrettably, we did not get around to attaching a tablet between the robot's hands, as intended. Still, the prototype was able to perform several key functions:
• Detect people and their characteristics, using computer vision
• Capture people's speech and convert it to text using automatic speech recognition (ASR)
• Control the behavior of the robot: head movements, emoting, arm movement
• Control the behavior of the virtual human: speaking, emoting
• Match questions of people to relevant answers
We wrote a small database consisting of question-answer pairs about the R3D3 project and the technologies involved. In this demo, people could interact with R3D3 by talking to it in Dutch, which would be recognized as one of the questions in its database. Then, the virtual human responded with a matching answer; or, if a question was not recognized, she replied by asking the person to rephrase that question.
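To illustrate this kind of question matching, here is a minimal sketch in Python. It assumes a simple word-overlap score between the user's utterance and the stored questions; the demo's actual matching method is not described here, and the question-answer pairs below are illustrative, not entries from the real database.

```python
# Hypothetical sketch of question-answer matching via word overlap.
# The QA pairs and the threshold are illustrative assumptions.

def tokenize(text):
    """Lowercase and split an utterance into a set of words."""
    return set(text.lower().split())

QA_PAIRS = [
    ("what is r3d3",
     "R3D3 is a duo consisting of a robot and a virtual human."),
    ("who built the robot",
     "The robot was developed at the University of Twente."),
]

FALLBACK = "Could you rephrase your question?"

def match_answer(utterance, threshold=0.5):
    """Return the answer whose question best overlaps the utterance,
    or a fallback asking the user to rephrase."""
    words = tokenize(utterance)
    best_score, best_answer = 0.0, FALLBACK
    for question, answer in QA_PAIRS:
        q_words = tokenize(question)
        # Fraction of the stored question's words found in the utterance.
        score = len(words & q_words) / len(q_words)
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score >= threshold else FALLBACK

print(match_answer("what is R3D3"))          # matches the first pair
print(match_answer("how is the weather"))    # falls back to a rephrase request
```

A real system would likely normalize the ASR output further (removing punctuation, handling word variants) and tune the threshold so that near-misses trigger the rephrase request rather than a wrong answer.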
Overall, reactions were positive: visitors were intrigued by all the work that went into creating this combination of technologies. They inspected how the robot head moved, what VicarVision's computer vision told them about their appearance, and how the virtual human responded and tried to understand what they said. This was a challenge at times, especially in the crowded environment where multiple demos were showcased. Still, the virtual human was able to give a correct response on the first attempt about half of the time.
The dialogues we wrote were still limited, leading to short interactions. We will address this in the coming quarter by expanding the dialogue model and diversifying the possible interactions. Furthermore, we will integrate computer vision more deeply into the dialogue model and carry out another design iteration of the robot/virtual human duo.