Moviemakers are increasingly looking for ways to gauge an audience’s reaction to their films. Caltech and Disney Research use a facial-expression-tracking neural network to learn and predict how different people in an audience react.
The research project, presented at IEEE’s Computer Vision and Pattern Recognition conference in Hawaii, demonstrated a new method for reliably tracking facial expressions in a theater in real time.
The system uses a factorized variational autoencoder, which captures complex facial expressions in motion better than existing methods.
The researchers collected face data by recording audiences watching Disney movies. A high-definition infrared camera captured the motion of every face, yielding around 16 million data points that were fed to the neural network. Once training was finished, they set the system to watch audience footage in real time and predict the expression a given face would make at different points. They found that after about 10 minutes of observing an audience, the system could predict its laughs and smiles.
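The core idea behind a factorized model of audience reactions can be sketched with plain matrix factorization: a (viewers × time) matrix of expression measurements is decomposed into per-viewer factors and shared per-moment patterns, so a viewer's reaction at any moment is predicted from a few learned numbers. This is only an illustrative sketch, not the actual factorized variational autoencoder; all variable names, dimensions, and data here are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

n_viewers, n_frames, n_factors = 20, 50, 3

# Synthetic "audience" data: shared temporal reaction patterns,
# scaled differently per viewer, plus a little noise.
true_viewer = rng.random((n_viewers, n_factors))
true_time = rng.random((n_factors, n_frames))
X = true_viewer @ true_time + 0.01 * rng.standard_normal((n_viewers, n_frames))

# Learn the factorization by gradient descent on squared reconstruction error.
U = rng.random((n_viewers, n_factors))   # per-viewer factors
V = rng.random((n_factors, n_frames))    # shared per-moment factors
lr = 0.01
for _ in range(1000):
    err = U @ V - X
    U -= lr * err @ V.T
    V -= lr * U.T @ err

final_err = np.mean((U @ V - X) ** 2)
print(f"mean squared reconstruction error: {final_err:.4f}")

# Once a viewer's factors are known, their expression trajectory over the
# whole movie is predicted as a combination of the shared temporal patterns.
predicted = U[0] @ V  # predicted trajectory for viewer 0, shape (n_frames,)
```

This also hints at why the system needed a warm-up period: it must observe a viewer for a while before their individual factors are pinned down well enough to extrapolate the rest of their reactions.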
This is just one application of the technology. In time it could be used in other settings, such as crowd monitoring, interpreting complex visual data, or even medicine.
According to Caltech’s Yisong Yue in a news release: “Understanding human behavior is fundamental to developing AI systems that exhibit greater behavioral and social intelligence. For example, developing AI systems to assist in monitoring and caring for the elderly relies on being able to pick up cues from their body language. After all, people don’t always explicitly say that they are unhappy or have some problem.”