Facial expression classification for deep pain analysis

Image Credit: Freepik

Pain is a subjective experience. It is widely understood that people from different backgrounds, professions or nationalities do not suffer or express pain in the same way. How to measure pain remains an open question among doctors and other healthcare professionals.

Methods can be invasive, such as brain imaging, or non-invasive: asking patients how much pain they are feeling (via a questionnaire) or, in the approach closest to our field, using computer vision to analyse facial expressions and infer pain from them. Researchers from the CVC at the Universitat Autònoma de Barcelona and from Aalborg University in Denmark have focused on the latter. After working together for more than a year, they report remarkable accuracy in their joint paper ‘Deep Pain: Exploiting Long Short-Term Memory Networks for facial expression classification’, published in the journal IEEE Transactions on Cybernetics.

“We propose an automatic model to detect pain from facial recognition”, states Pau Rodríguez, first author of the paper and a CVC PhD student in the Image Sequence Evaluation (ISE) Lab. “We can therefore predict pain in real time. We’ve used two deep learning models for this: the first one extracts facial features and the second one, which is a recurrent deep learning model, learns the evolution in time of the frame-wise characteristics in order to predict that person’s expressions and thus be much more precise at the facial recognition of pain”.
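The two-stage design described in the quote can be illustrated with a minimal, untrained sketch. It is not the authors' implementation: frame_features is a hypothetical stand-in for the convolutional feature extractor (it only computes the mean and spread of pixel intensities), TinyLSTM is a toy recurrent cell with random weights, and the readout weights are arbitrary. The point is only the flow of data: one feature vector per frame, fed in order to a recurrent model that carries its state from frame to frame.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def frame_features(frame: list[list[float]]) -> list[float]:
    """Hypothetical stand-in for the CNN: mean and spread of pixel intensities."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return [mean, math.sqrt(var)]


class TinyLSTM:
    """Toy LSTM cell with random (untrained) weights, for illustration only."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.n_hidden = n_hidden

        def matrix() -> list[list[float]]:
            return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + n_hidden)]
                    for _ in range(n_hidden)]

        # One weight matrix and bias per gate:
        # i = input, f = forget, g = cell candidate, o = output.
        self.w = {gate: matrix() for gate in "ifgo"}
        self.b = {gate: [0.0] * n_hidden for gate in "ifgo"}

    def step(self, x: list[float], h: list[float], c: list[float]):
        z = x + h  # concatenate current features with the previous hidden state

        def gate(name: str, act) -> list[float]:
            return [act(sum(w * v for w, v in zip(row, z)) + bias)
                    for row, bias in zip(self.w[name], self.b[name])]

        i, f = gate("i", sigmoid), gate("f", sigmoid)
        g, o = gate("g", math.tanh), gate("o", sigmoid)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def pain_probabilities(frames, lstm: TinyLSTM, readout: list[float]) -> list[float]:
    """Return one value in (0, 1) per frame, using the state built up so far."""
    h = [0.0] * lstm.n_hidden
    c = [0.0] * lstm.n_hidden
    probs = []
    for frame in frames:
        h, c = lstm.step(frame_features(frame), h, c)
        probs.append(sigmoid(sum(w * v for w, v in zip(readout, h))))
    return probs


# A made-up five-frame "video" of 2x2 images.
video = [[[0.1 * t, 0.2], [0.3, 0.4]] for t in range(5)]
print(pain_probabilities(video, TinyLSTM(n_in=2, n_hidden=3), [0.5, -0.5, 0.25]))
```

Because the weights are random, the printed values carry no meaning; in a real system both stages would be trained on annotated video.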

Pain measurement in context

Measuring pain has been an unresolved problem for doctors for years. In an article published in The Independent, journalist John Walsh gives a thorough account of how pain is measured. According to Walsh, one of the most widely used tools is the McGill Pain Questionnaire, developed by Dr. Ronald Melzack and Dr. Warren Torgerson (McGill University, Montreal) in the 1970s. The questionnaire compiled the words patients had used to describe pain and classified them into three categories: sensory (which included heat, pressure, “throbbing” or “pounding” sensations), affective (which related to emotional effects, such as “tiring”, “sickening”, “gruelling” or “frightful”) and evaluative (evocative of an experience, from “annoying” and “troublesome” to “horrible”, “unbearable” and “excruciating”).

Although the classification makes sense, Walsh correctly points out that the words overlap and are easily interchangeable. It is undoubtedly a good attempt to escape subjectivity, but it does not actually achieve it: the intensity each word conveys still varies from person to person.

Along the same lines, the US National Initiative on Pain Control created another questionnaire, the Pain Quality Assessment Scale (PQAS). Here, patients “were asked to indicate, on a scale of 1 to 10, how ‘intense’ – or ‘sharp’, ‘hot’, ‘dull’, ‘cold’, ‘sensitive’, ‘tender’, ‘itchy’, etc – their pain has been over the past week”. But, of course, the scale still rests on each individual’s experience: someone who has been through a great deal of pain (a soldier, for example) will not rate it the same way as someone who has never been seriously injured.

In the quest for an objective way to measure pain, neuroscience has also had its say. Professor Irene Tracey, head of the University of Oxford’s Nuffield Department of Clinical Neurosciences, has studied the brain’s response to pain extensively. She uses brain imaging and knowledge of the brain’s interconnections to identify pain within the cortex. “A most objective method indeed, but highly invasive”, says Guillem Cucurull, a PhD student in the ISE Lab at the Computer Vision Center. What his group is investigating opens a window of opportunity for monitoring patients in intensive care units in a way that is cheaper and much less invasive. All you need is a camera.

Pain recognition by facial analysis

The results achieved by Pau Rodríguez and Guillem Cucurull’s team are undoubtedly good, with high accuracy on a standard dataset commonly used by computer vision scientists. Nevertheless, Pau Rodríguez is cautious: “They are good under controlled circumstances and with this particular dataset. We don’t know how it would work with children, or people with dementia”. The network was trained on one set of data and then tested on one it had not seen before, which yielded the 97.2% accuracy reported in the paper.
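As a rough illustration of what testing on unseen data and reporting an accuracy figure involve, here is a minimal sketch. The split by subject, the field names and the toy records are all assumptions made for the example; the article does not describe the exact evaluation protocol, and the numbers below have nothing to do with the 97.2% reported in the paper.

```python
def split_by_subject(samples: list[dict], test_subjects: set[str]):
    """Keep every sample of the listed subjects out of the training set (assumed protocol)."""
    train = [s for s in samples if s["subject"] not in test_subjects]
    test = [s for s in samples if s["subject"] in test_subjects]
    return train, test


def accuracy(predictions: list[int], labels: list[int]) -> float:
    """Fraction of predictions that match the annotated labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy records: 'subject' and 'pain' are hypothetical field names.
samples = [
    {"subject": "s1", "pain": 0},
    {"subject": "s1", "pain": 1},
    {"subject": "s2", "pain": 1},
    {"subject": "s3", "pain": 0},
]
train, test = split_by_subject(samples, test_subjects={"s3"})

# Made-up predictions from some model, compared with made-up labels.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 3 of 4 correct
```

Holding out whole subjects is one common way to make sure the test faces are genuinely new to the model.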

The automatic model they propose predicts pain in real time and can keep learning as it goes, increasing its reliability and accuracy. They have also found that, at least in this research, CNNs (convolutional neural networks) perform better with less-processed images. The authors also chose not to use facial action units (groups of facial muscles), which have typically been used to encode facial motion, leaving the neural network free to infer the level of pain from what it learns by itself.

“We measure pain on a scale from 0 to 15, where any number above 0 is of interest to us, as it is already pain”. The dataset images were annotated beforehand by experienced annotators, specialists who can tell people in pain apart from people in no pain, giving each picture a number between 0 and 10. “This is how our model learns”, explains Pau Rodríguez. “We show the neural network a picture, with the number associated to it. The neural network then infers the features which correspond to that level of pain, acknowledging the common features and patterns and becoming a true expert in pain detection”.
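The idea that any score above 0 already counts as pain can be written down as a small sketch. The names AnnotatedFrame, has_pain and training_pairs are hypothetical, and the upper bound of 15 is taken from the quoted scale; it would need adjusting if the annotation range differs.

```python
from dataclasses import dataclass

MAX_SCORE = 15  # upper end of the scale mentioned in the quote (assumption for this sketch)


@dataclass
class AnnotatedFrame:
    frame_id: str  # hypothetical identifier for a picture
    score: int     # pain level assigned by a human annotator


def has_pain(score: int, max_score: int = MAX_SCORE) -> bool:
    """Any score above 0 counts as pain; out-of-range scores are rejected."""
    if not 0 <= score <= max_score:
        raise ValueError(f"score {score} outside 0..{max_score}")
    return score > 0


def training_pairs(frames: list[AnnotatedFrame]) -> list[tuple[str, int]]:
    """Pair each picture with a binary target: 1 for pain, 0 for no pain."""
    return [(f.frame_id, int(has_pain(f.score))) for f in frames]


frames = [AnnotatedFrame("frame_001", 0), AnnotatedFrame("frame_002", 3)]
print(training_pairs(frames))
```

A network trained on such pairs sees each picture together with its target, which is the supervised setup the quote describes.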

The article “Deep Pain: Exploiting Long Short-Term Memory Networks for facial expression classification” can be accessed here.

Image Credit: Business photography created by Jcomp

This work has been made possible by the support of the Spanish project TIN2015-65464-R (MINECO/FEDER), the 2016FI_B 01163 grant by the CERCA Programme/Generalitat de Catalunya, and the COST Actions IC1106 (Integrating Biometrics and Forensics for the Digital Age) and IC1307 iV&L Net (European Network on Integrating Vision and Language), both supported by COST (European Cooperation in Science and Technology).

Carlos Sierra
