A team from MIT Lincoln Laboratory's Bioengineering Systems and Technologies Group was named a first-place subchallenge winner at the 2014 Audio/Visual Emotion Challenge and Workshop (AVEC 2014), the fourth annual competition that invites participants to use multimedia processing and machine learning to analyze subjects’ emotional states or estimate subjects’ level of depression.
Held at the annual Association for Computing Machinery (ACM) International Conference on Multimedia, the challenge gauges the success of entrants’ approaches to automated emotion detection on a set of common benchmarks. In 2014, two subchallenges were presented: continuously distinguishing emotions and estimating the level of subjects’ depression from audio and visual data.
Of the 14 groups competing in the 2014 depression assessment subchallenge, Lincoln Laboratory's team was the most successful in predicting a depression score. Participants in this subchallenge estimate the severity of subjects' depression from vocal characteristics detected in audio recordings, from facial signs identified in video, or from both. Because people with major depressive disorder often exhibit altered motor control that affects the mechanisms governing speech production and facial expression, changes in motor outputs inferred from speech acoustics and facial movements may indicate depression. In the subchallenge, competitors' estimates are compared with previously determined self-reported assessments of the subjects' depressive severity; these scores are based on the Beck Depression Inventory, an evaluation tool used widely by mental-health professionals and researchers. In 2013, the Laboratory's team also took first place in the AVEC depression assessment challenge.
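Scoring an entry against the self-reported Beck scores comes down to standard error metrics; AVEC's depression subchallenge has used mean absolute error and root-mean-squared error for this comparison. The sketch below shows that scoring step on made-up numbers (the scores are illustrative, not actual challenge data):

```python
import numpy as np

def mae(predicted, actual):
    """Mean absolute error between predicted and reference scores."""
    return np.mean(np.abs(np.asarray(predicted) - np.asarray(actual)))

def rmse(predicted, actual):
    """Root-mean-squared error, which penalizes large misses more heavily."""
    return np.sqrt(np.mean((np.asarray(predicted) - np.asarray(actual)) ** 2))

# Illustrative numbers only: predicted vs. self-reported Beck scores
predicted_bdi = [12.0, 5.0, 30.0, 18.0]
reported_bdi  = [10.0, 7.0, 25.0, 20.0]
print(mae(predicted_bdi, reported_bdi))   # 2.75
print(rmse(predicted_bdi, reported_bdi))  # ~3.04
```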
"This year we used both speech and facial expression to determine depression levels," says Thomas Quatieri, a senior technical staff member on the AVEC team. "To exploit speech data, our team used novel biomarkers based on phoneme-dependent speaking rate and timing, and on incoordination of vocal tract articulators, as we did in AVEC 2013. In addition, we introduced vocal features that reflect the timing and coordination between articulators and the speech-production source at the vocal folds. In 2014, we also introduced biomarkers based on the timing and coordination of facial features that reflect muscle groups underlying facial expression during speech production. Our vocal and facial biomarkers together formed the basis for predicting depression scores."
The suite of algorithms that Lincoln Laboratory researchers used to predict Beck Depression Inventory ratings combines complementary features in Gaussian-mixture model and extreme learning-machine classifiers. "We were given training data (with known Beck scores) from which to build a prediction model," says Quatieri. "At the challenge, we used this model with new test data to demonstrate our technique's capability in predicting Beck scores. Although the speech samples were in German and our biomarkers were designed using English, the biomarkers were applied effectively, indicating some independence across languages." Quatieri noted that this year's challenge was more difficult than the one in 2013 because much less training data were provided.
"It was exciting to extend our previous voice-only feature approaches used in AVEC 2013 to analyzing facial dynamics from video. This extension was accomplished by extracting similar signatures of depression that were based on characterizations of multivariate timing and coordination," says James Williamson, a technical staff member also on the AVEC team.
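The article does not give the exact formulation of these "multivariate timing and coordination" signatures, but analyses of this kind are commonly built from channel-delay correlation matrices: stack time-delayed copies of each channel (e.g., formant tracks or facial-landmark trajectories), correlate them, and summarize the eigenvalue spectrum. A hypothetical sketch (function name, delay set, and inputs are my assumptions):

```python
import numpy as np

def coordination_features(signals, delays=(0, 1, 2, 3)):
    """Hypothetical correlation-structure features: eigenvalue spectrum
    of a channel-delay correlation matrix built from multichannel time
    series, shape (n_channels, n_samples)."""
    n_channels, n_samples = signals.shape
    max_d = max(delays)
    # Stack time-delayed copies of every channel, all trimmed to equal length
    rows = []
    for ch in range(n_channels):
        for d in delays:
            rows.append(signals[ch, d:n_samples - max_d + d])
    stacked = np.vstack(rows)
    corr = np.corrcoef(stacked)                # channel-delay correlation matrix
    eigvals = np.linalg.eigvalsh(corr)[::-1]   # eigenvalues, descending
    return eigvals
```

Tighter coupling among channels concentrates variance in a few large eigenvalues, so the shape of this spectrum serves as a compact coordination signature that the same machinery can extract from either vocal or facial channels.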
Quatieri and Williamson, along with colleagues Brian Helfer and Gregory Ciccarelli from the Bioengineering Systems and Technologies Group and consultant Daryush Mehta of Massachusetts General Hospital, helped develop the technology that led to the 2014 AVEC win. In addition, Lincoln Laboratory staff members Bea Yu and Rachelle Horwitz-Martin contributed to the earlier AVEC 2013 win.
The team's recent work may lead to research on depression-assessment features based on other cross-modal relationships involving muscle coordination and timing, such as coordination between vocal articulators and facial-muscle activation. Lincoln Laboratory's biomarker technology, which has shown good results in predicting an individual's cognitive state, is also being explored for use in evaluating the severity of other neurological disorders, such as traumatic brain injury and dementia.
In addition, the team is collaborating with Satra Ghosh and John Gabrieli from the MIT Department of Brain and Cognitive Sciences to develop computational models of speech production in the disordered brain by merging knowledge of neurological disorders, computational modeling, and speech-signal processing. "We are successfully using the same principles of articulatory timing and coordination in other neurological disorders and thus feel we may have discovered a common vocal feature basis for neurocognitive decline," says Quatieri. "Our collaborative work with the MIT Department of Brain and Cognitive Sciences may provide a neural foundation for this hypothesis and lead to even more effective biomarkers."
Both the medical and the military communities are keenly interested in developing tools that quantitatively measure levels of depression and other neurological disorders. According to the National Institutes of Health, major depressive disorder strikes about 6.7 percent of U.S. adults each year. The U.S. Department of Veterans Affairs' National Center for Post-traumatic Stress Disorder (PTSD) estimates that PTSD will afflict about 7–8 percent of the U.S. population at some point in their lives; affects, in any given year, 11–20 percent of veterans of the various conflicts in the Middle East; and has been diagnosed in almost 30 percent of Vietnam War veterans. In a March 2014 issue, the Journal of the American Medical Association (JAMA) Psychiatry reported on a study finding that almost 25 percent of 5,500 active-duty, nondeployed military personnel surveyed were assessed as having a mental disorder of some type.
Objective predictors of depression severity, such as the tools used in the AVEC challenge, could supplement currently used diagnostic techniques that rely on patients' self-reporting and clinicians' subjective assessments. In addition, because these tools exploit subtleties in speech and facial movement that may escape clinicians' observations, they may enable earlier diagnoses of depression.
Helfer summed up the significance of the AVEC recognition: "This win really helps to validate our work and brings us one step closer to transitioning our findings to the general public."