April 12, 2024

This robot predicts when you will smile and then grins right back

Comedy clubs are my favorite weekend outings. Gather some friends, grab a few drinks and when a joke lands on all of us, there’s a magical moment where our eyes meet and we share a cheeky grin.

Smiling can turn strangers into the dearest friends. It drives feel-good Hollywood plots, mends broken relationships and is inextricably linked to warm, fuzzy feelings of joy.

At least for people. Robots' attempts at genuine smiles often fall into the uncanny valley – close enough to resemble a human, but off in a way that causes a hint of discomfort. Logically, you know what they are trying to do. But your gut tells you something is wrong.

It could be the timing. Robots are trained to mimic the facial expression of a smile, but they don't know when to start grinning. When people connect, we genuinely smile together, without any conscious planning. Robots, by contrast, need time to analyze a person's facial expression and reproduce a grin. Even a split-second delay makes the hairs on the back of your neck stand up – as in a horror movie, something feels manipulative and wrong.

Last week, a team from Columbia University showed off an algorithm that teaches robots to share a smile with their human operators. The AI analyzes subtle facial changes to predict the operator's expression about 800 milliseconds before it happens – just enough time for the robot to grin back.

The team trained a soft robotic humanoid face named Emo to anticipate and match the expressions of its human companion. With its blue silicone face, Emo looks like a science-fiction alien from the '60s. But it happily grinned along with its human partner on the same 'emotional' wavelength.

Humanoid robots are often clumsy and stilted when communicating with humans, wrote Dr. Rachael Jack of the University of Glasgow, who was not involved in the study. ChatGPT and other large language models can already make an AI's speech sound human, but nonverbal communication is difficult to replicate.

Programming social skills — at least for facial expression — into physical robots is a first step toward helping “social robots join the human social world,” she wrote.

Under the hood

From robotaxis to robo-servers that bring you food and drinks, autonomous robots are increasingly entering our lives.

In London, New York, Munich and Seoul, autonomous robots speed through chaotic airports to help customers check in, find a gate or recover lost luggage. In Singapore, several two-meter-tall robots with 360-degree vision roam an airport to identify potential security problems. During the pandemic, robot dogs enforced social distancing.

But robots can do more. For dangerous jobs – such as clearing the wreckage of collapsed homes or bridges – they could lead rescue efforts and keep first responders safer. As the world's population continues to age, they could help nurses support the elderly.

Today's humanoid robots are cartoonishly cute. But the most important ingredient for robots to enter our world is trust. As scientists build robots with increasingly human-like faces, we want their expressions to match our expectations. It's not just about mimicking a facial expression: a genuine shared "yeah, I know" smile over a cringeworthy joke forms a bond.

Nonverbal communication – expressions, hand gestures, body postures – is a set of tools we use to express ourselves. ChatGPT and other generative AI already let machines "communicate in video and verbally," study author Dr. Hod Lipson told Science.

But when it comes to the real world – where a look, a wink and a smile can make all the difference – it's "a channel that's missing right now," Lipson said. "Smiling at the wrong time can be counterproductive. [Even a few milliseconds too late,] it feels insincere."

Say Cheese

To give robots nonverbal skills, the team focused on one aspect: a shared smile. Previous studies have pre-programmed robots to mimic a smile. But because the response is not spontaneous, there is a small yet noticeable delay that makes the grin look fake.

“There are a lot of things involved in nonverbal communication” that are difficult to quantify, Lipson said. “The reason we have to say ‘cheese’ when we take a picture is because smiling on demand is actually quite difficult.”

The new research focused on timing.

The team developed an algorithm that anticipates a person's smile and makes a human-like animatronic face grin at the same moment. The robot face, called Emo, has 26 actuators – think artificial muscles – wrapped in a stretchy silicone 'skin'. Each actuator attaches to the main robot skeleton with magnets to move the eyebrows, eyes, mouth and neck. Emo's eyes have built-in cameras to record the environment and track eyeball movements and blinks.

On its own, Emo can track its own facial expressions. The goal of the new study was to help it interpret the emotions of others. The team used a trick any introverted teen might know: they asked Emo to look in a mirror to learn how to drive its actuators and form a particular facial expression, such as a smile. The robot gradually learned to match its expressions to motor commands – for example, "lift the cheeks." The team then pruned any commands that could stretch the face too far and damage the robot's silicone skin.
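The mirror step is a form of self-modeling: the robot learns, from its own trial movements, which motor commands produce which facial configurations. The study's actual model isn't spelled out here, but the idea can be sketched with a toy linear stand-in for the face mechanics – all names, dimensions and the random "mechanics" below are illustrative assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the real robot: 26 actuator commands move a set
# of tracked facial-landmark coordinates through an unknown (here, random
# linear) mechanical mapping.
N_ACTUATORS = 26
N_LANDMARKS = 10  # e.g. 2D coordinates of 5 tracked facial points
true_mechanics = rng.normal(size=(N_LANDMARKS, N_ACTUATORS))

def mirror_observation(motor_cmd):
    """What the robot 'sees in the mirror' after issuing a motor command."""
    return true_mechanics @ motor_cmd

# Motor babbling: issue random commands and record the resulting expressions.
commands = rng.normal(size=(500, N_ACTUATORS))
observations = np.array([mirror_observation(c) for c in commands])

# Fit an inverse model (landmarks -> motor commands) by least squares.
inverse_model, *_ = np.linalg.lstsq(observations, commands, rcond=None)

# Given a target expression (a reachable pose, e.g. "lift the cheeks"),
# recover motor commands that reproduce it.
target = mirror_observation(rng.normal(size=N_ACTUATORS))
recovered_cmd = target @ inverse_model
achieved = true_mechanics @ recovered_cmd  # close to the target expression
```

The design point is that no one hand-programs the command-to-expression mapping; the robot discovers it from its own "babbled" movements, which is what makes the mirror trick work.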

"Turns out … [making] a robot face that can smile was incredibly challenging from a mechanical point of view. It's harder than making a robot hand," Lipson said. "We are very good at recognizing an inauthentic smile. So we are very sensitive to that."

To combat the uncanny valley, the team trained Emo to predict facial movements using videos of people laughing, acting surprised, frowning, crying and making other expressions. Emotions are fairly universal: when you smile, the corners of your mouth curl into a crescent moon; when you cry, the eyebrows knit together.

The AI analyzed facial movements from each scene frame by frame. By measuring the distances between the eyes, mouth and other 'facial landmarks,' it found telltale signals that correspond to particular emotions. For example, an upturned corner of the mouth suggests a hint of a smile, while a downward turn can signal a descent into a frown.

Once trained, the AI took less than a second to recognize these facial landmarks. Driving Emo, it could anticipate a human partner's smile just under a second in advance, so the robot face grinned along with its participant.
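The landmark-and-timing pipeline described above can be sketched in a few lines. This is a simplified illustration, not the study's model: a real system tracks many landmarks with a learned predictor, whereas here a single hand-picked cue (mouth-corner lift) is linearly extrapolated roughly 0.8 seconds ahead, and all point names and numbers are invented for the example:

```python
import numpy as np

def mouth_corner_lift(landmarks):
    """Signed lift of the mouth corners above the mouth centre.

    `landmarks` is a dict of 2D points (an illustrative format); y grows
    downward as in image coordinates, so a bigger value means a wider grin.
    """
    corners_y = (landmarks["mouth_left"][1] + landmarks["mouth_right"][1]) / 2
    return landmarks["mouth_center"][1] - corners_y  # positive = corners up

def predict_lift(history, dt, horizon=0.8):
    """Linearly extrapolate the lift signal `horizon` seconds ahead."""
    t = np.arange(len(history)) * dt
    slope, intercept = np.polyfit(t, history, 1)
    return slope * (t[-1] + horizon) + intercept

# Toy frame sequence at 30 fps: the mouth corners drift steadily upward.
dt = 1 / 30
frames = [
    {"mouth_left": (40, 100 - k), "mouth_right": (60, 100 - k),
     "mouth_center": (50, 98)}
    for k in range(10)
]
lifts = [mouth_corner_lift(f) for f in frames]
predicted = predict_lift(lifts, dt)  # cue value ~0.8 s into the future
```

Because the predicted cue exceeds the latest measured one, a controller watching this signal could start driving the grin actuators before the human smile fully forms – which is the point of the 800-millisecond head start.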

To be clear: the AI does not 'feel.' Rather, it behaves as a human would when chuckling at a funny stand-up set – with a genuine-looking smile.

Facial expressions aren't the only cues we notice when interacting with people. Subtle head shakes, nods, raised eyebrows and hand gestures all make an impression. Regardless of culture, 'um's, 'ahh's and 'like's (or their equivalents) are woven into everyday interactions. For now, Emo is like a baby that has learned to smile. It doesn't yet understand other contexts.

"There's a lot more to do," Lipson said. We are only at the beginning of nonverbal communication for AI. But "if you think ChatGPT is interesting, just wait until these things become physical, and all bets are off."

Image credits: Yuhang Hu, Columbia Engineering via YouTube
