How IIIT-H researchers used movies to make machines understand emotions.

"The researchers introduced a machine learning model that relies on a transformer-based architecture to understand and label emotions"


Engineers have long been working on teaching computers to measure and comprehend emotions expressed in text.

Researchers from the Centre for Visual Information Technology (CVIT) at the International Institute of Information Technology, Hyderabad (IIIT-H) recently carried out an experiment in which they used movies to teach machines how to interpret emotions.

One of the researchers, Dhruv Srivastava, notes that earlier text-based emotion analysis relied on conversations to determine a character's state of mind.

Dhruv co-authored the study, "How you feelin'? Learning Emotions and Mental States in Movie Scenes," with Aditya Kumar Singh and Makarand Tapaswi. It has been accepted for presentation at the Conference on Computer Vision and Pattern Recognition (CVPR), taking place in Vancouver from June 18 to 23.

Research Specifics

Making Emotional Connections Through Data

"The researchers introduced a machine learning model that relies on a transformer-based architecture to understand and label emotions for not only each movie character in the scene but also for the overall scene itself," an executive of IIIT-H said.

The research team picked movies for their study because they contain a wealth of emotional information that mirrors the intricacies of daily life. Unlike static images, movies are far more difficult for computers to understand.

A character's feelings in a scene cannot be captured by a single label, so it is crucial to gauge their many emotions and mental states. "In a single scene, a character can experience a variety of emotions, from surprise and happiness to rage and even melancholy," according to Dhruv.
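Because several emotions can be active at the same time, this is a multi-label problem rather than a pick-one classification. A common way to express that, sketched below with an invented four-emotion label set, is a multi-hot target vector trained with a per-label binary cross-entropy.

```python
import torch
import torch.nn as nn

# Illustrative label set only; the study's actual vocabulary is larger.
EMOTIONS = ["surprise", "happiness", "anger", "melancholy"]

def to_multi_hot(active):
    """E.g. {'surprise', 'happiness'} -> tensor([1., 1., 0., 0.])"""
    return torch.tensor([float(e in active) for e in EMOTIONS])

target = to_multi_hot({"surprise", "happiness"})
logits = torch.randn(len(EMOTIONS))            # scores from a model
loss = nn.BCEWithLogitsLoss()(logits, target)  # one yes/no per emotion
```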

A representative from IIIT-H stated, "To train their model, the team of researchers used an existing dataset of movie clips collected by Tapaswi for his earlier work, MovieGraphs, which provides detailed graph-based annotations of social situations depicted in movie scenes."

The machine was trained to accurately describe the emotions and mental states of characters in each scene through a three-step procedure: analyzing the video of the full scene, extracting the characters' facial features, and reading the subtitles.
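One plausible reading of that three-step procedure, sketched below with all three encoder networks left as hypothetical stand-ins, is that each step yields its own feature stream, and the streams are concatenated into a single token sequence for the model described earlier.

```python
import torch

def encode_scene(video_clip, face_crops, subtitles,
                 video_net, face_net, text_net):
    """Sketch of three-stream encoding: the scene video, the detected
    faces, and the subtitle text are embedded separately and joined
    into one token sequence. All three encoders are assumptions."""
    video_tokens = video_net(video_clip)   # (n_segments, feat_dim)
    face_tokens = face_net(face_crops)     # (n_faces, feat_dim)
    text_tokens = text_net(subtitles)      # (n_words, feat_dim)
    return torch.cat([video_tokens, face_tokens, text_tokens], dim=0)
```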

"We came to the realization that incorporating multimodal data is crucial to predicting numerous emotions. Aditya added, "We were able to anticipate the corresponding mental states of the characters that are not explicitly stated in the scenes.