AI Modeling for Emotion Detection
There are multiple AI algorithms for facial recognition. Generally, they are divided into two main approaches: the geometric approach focuses on distinguishing features, while photometric statistical methods extract values from an image and compare them to templates to eliminate variances.
These algorithms can be extended to recognize the emotions in facial expressions, enabling emotion detection. All of these algorithmic models depend on training images and then testing images. The preprocessing of the images has a significant impact on emotion detection, and we are going to look at how different aspects of preprocessing improve emotion detection.
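As an illustrative sketch of the kind of preprocessing involved (assuming OpenCV; the specific steps and parameters here are placeholders, not the pipeline studied later):

```python
import cv2
import numpy as np

def preprocess(path, size=48):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)  # drop color information
    img = cv2.resize(img, (size, size))           # standardize the input size
    img = cv2.equalizeHist(img)                   # spread contrast across the full range
    return img.astype(np.float32) / 255.0         # scale pixel values to [0, 1]
```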
The main types of models used for facial recognition and emotion detection are:
Euclidean Distance Modeling
Euclidean distance modeling is a geometric approach: distinguishing features such as facial landmarks are extracted from each image, and the distances between them form a feature vector that is compared against labeled templates, with the closest template determining the predicted expression.
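A minimal sketch of the idea, assuming each face has already been reduced to a numeric feature vector and that one template vector per emotion is available (the nearest-template rule here is one illustrative choice):

```python
import numpy as np

def euclidean_distance(a, b):
    # straight-line distance between two feature vectors
    return np.sqrt(np.sum((np.asarray(a) - np.asarray(b)) ** 2))

def classify(features, templates):
    # templates: dict mapping emotion label -> template feature vector;
    # the label whose template lies closest to the input wins
    return min(templates, key=lambda label: euclidean_distance(features, templates[label]))
```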
Multilayer Perceptron (MLP) Modeling
The MLP consists of three or more layers (an input and an output layer with one or more hidden layers) of nonlinearly activating nodes. Since MLPs are fully connected, each node in one layer connects with a certain weight to every node in the following layer. In this algorithm, overfitting is a big problem; it is mitigated by using dropout, which prevents any one neuron from being heavily relied upon.
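A minimal sketch of such an MLP with dropout, assuming Keras and the 48×48, 7-class setup of FER2013 described below (layer sizes and dropout rates are illustrative, not tuned values):

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Flatten(input_shape=(48, 48, 1)),   # flatten the image into one vector
    layers.Dense(512, activation="relu"),      # fully connected hidden layer
    layers.Dropout(0.5),                       # dropout prevents reliance on any one neuron
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(7, activation="softmax"),     # one output per emotion label
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```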
Convolutional Neural Network (CNN) Modeling
A CNN is a standard neural network with additional layers: convolutional and pooling. A CNN can have dozens of these layers, each of which learns to detect different image features. CNNs take a different approach to regularization: they take advantage of the hierarchical pattern in data and assemble increasingly complex patterns from the smaller, simpler patterns encoded in their filters. Therefore, on the scale of connectivity and complexity, CNNs are on the lower extreme, making them well suited to using many more layers than an MLP. CNNs are inspired by biological processes, in that the connectivity pattern between neurons resembles the organization of the visual cortex.
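A minimal sketch of a small CNN in the same Keras setup, showing the alternating convolutional and pooling layers (the filter counts are illustrative):

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu",
                  input_shape=(48, 48, 1)),       # early filters learn simple, edge-like patterns
    layers.MaxPooling2D((2, 2)),                  # pooling shrinks the spatial size
    layers.Conv2D(64, (3, 3), activation="relu"), # deeper layers combine simpler patterns
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(7, activation="softmax"),        # one output per emotion label
])
```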
Transfer Learning (TL) Modeling
Transfer learning focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. The pretrained general model and the target task must be similar for it to work effectively. It is very well suited to facial recognition, but it needs a model very well trained on facial expressions to detect emotions, as people express their emotions differently.
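A minimal transfer-learning sketch, assuming a VGG16 base pretrained on ImageNet with its weights frozen and a new classification head on top; since ImageNet models expect RGB input, the 48×48 grayscale images would have to be replicated to three channels first:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Frozen pretrained base: keep the knowledge learned on the source task.
base = VGG16(weights="imagenet", include_top=False, input_shape=(48, 48, 3))
base.trainable = False

# New head trained on the target task (the 7 FER2013 emotions).
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(7, activation="softmax"),
])
```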
Materials and Methods Used
I decided to use the FER2013 dataset, which contains approximately 30,000 grayscale facial images of different expressions with a size restricted to 48×48 pixels. Its labels can be divided into 7 types: 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral. The Disgust expression has a minimal number of images (about 600), while the other labels have nearly 5,000 samples each.
The reason for choosing this dataset is to have standardized emotion detection on a large set of minimally sized photographs representative of real-world applications. Higher-quality photos would be expected to give better recognition; the purpose here was a comparative study.
Dataset Information
I used Google Colab (code at kaggle.com) for development, and standard public domain code has been used wherever possible.
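As a sketch of how the dataset can be loaded, assuming the Kaggle fer2013.csv layout (an integer "emotion" label of 0-6, a "pixels" column of 2304 space-separated grayscale values, and a "Usage" column marking the train/test split):

```python
import numpy as np
import pandas as pd

df = pd.read_csv("fer2013.csv")
# Turn each space-separated pixel string into a 48x48 image array.
X = np.stack([np.array(p.split(), dtype=np.uint8).reshape(48, 48)
              for p in df["pixels"]])
y = df["emotion"].to_numpy()
train = (df["Usage"] == "Training").to_numpy()
X_train, y_train = X[train], y[train]
```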
I used the Dlib library to recognize facial landmarks and used the distances between the landmarks to recognize emotions.
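A sketch of this landmark-based feature extraction with Dlib; the 68-point shape_predictor_68_face_landmarks.dat model must be downloaded separately from dlib.net, and using all pairwise landmark distances as features is one illustrative choice:

```python
import itertools
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmark_features(gray_image):
    # gray_image: 8-bit grayscale image as a numpy array
    faces = detector(gray_image)
    if not faces:
        return None                      # no face found
    shape = predictor(gray_image, faces[0])
    points = np.array([(shape.part(i).x, shape.part(i).y) for i in range(68)])
    # Euclidean distance between every pair of landmarks as the feature vector.
    return np.array([np.linalg.norm(points[i] - points[j])
                     for i, j in itertools.combinations(range(68), 2)])
```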
Data with different preprocessing: to be continued …