Ong Kah Liang, Dr. Lee Chin Poo, Prof. Lim Heng Siong, Dr. Lim Kian Ming, Takeki Mukaida
Description of Invention
In this research, data augmentation techniques are used to improve the learning ability of the machine learning model. The augmented audio samples are then encoded and transformed into several discriminative frequency-domain and temporal-domain features for model training. For classification, a light gradient boosting machine (LightGBM) is employed to minimize computational cost: it provides performance similar to deep learning approaches while training faster and requiring less computational power. The proposed LightGBM is pre-trained on large image datasets and fine-tuned to optimal settings. Based on its performance, the proposed LightGBM can be deployed for basic-level emotion recognition.
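The augmentation and feature-extraction stages described above can be illustrated with a minimal sketch. The description does not name the specific augmentations or features used, so everything here is an assumption chosen for illustration: noise injection and circular time shifting as the augmentations, zero-crossing rate and RMS energy as temporal features, and spectral centroid as a frequency feature. All function names are hypothetical. The sketch uses only the Python standard library, and the classifier stage is not shown; the resulting feature vectors are what a model such as LightGBM would be trained on.

```python
import cmath
import math
import random


def add_noise(signal, noise_level=0.005, seed=0):
    """Augmentation (assumed): add low-amplitude Gaussian noise to each sample."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_level) for s in signal]


def time_shift(signal, shift):
    """Augmentation (assumed): circularly shift the signal right by `shift` samples."""
    if not signal:
        return []
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift] if shift else list(signal)


def zero_crossing_rate(signal):
    """Temporal feature: fraction of adjacent sample pairs that change sign."""
    if len(signal) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(signal) - 1)


def rms_energy(signal):
    """Temporal feature: root-mean-square amplitude."""
    if not signal:
        return 0.0
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def spectral_centroid(signal, sample_rate):
    """Frequency feature: magnitude-weighted mean frequency, via a naive DFT."""
    n = len(signal)
    if n == 0:
        return 0.0
    mags = []
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(coeff))
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def extract_features(signal, sample_rate):
    """Return one feature vector: [zero-crossing rate, RMS energy, spectral centroid]."""
    return [
        zero_crossing_rate(signal),
        rms_energy(signal),
        spectral_centroid(signal, sample_rate),
    ]


# Build a small augmented training set from one synthetic 100 Hz tone.
SAMPLE_RATE = 800
tone = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(400)]
variants = [tone, add_noise(tone), time_shift(tone, 37)]
feature_matrix = [extract_features(v, SAMPLE_RATE) for v in variants]
```

Each row of `feature_matrix` is one training example. In practice a signal-processing library would replace the naive DFT, which is quadratic in the signal length and is used here only to keep the sketch dependency-free.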