CLASSIFICATION OF HUMAN EMOTION FROM VOICE USING A CONVOLUTIONAL NEURAL NETWORK AND A MULTILAYER PERCEPTRON
- MUJI ERNAWATI
- 14210225
ABSTRACT
Name : Muji Ernawati
NIM : 14210225
Study Program : Computer Science
Faculty : Information Technology
Level : Master's (S2)
Concentration : Data Mining
Thesis Title : "Klasifikasi Emosi Manusia Berdasarkan Suara Menggunakan Convolutional Neural Network dan Multilayer Perceptron" (Classification of Human Emotion from Voice Using a Convolutional Neural Network and a Multilayer Perceptron)
Emotion in speech is considered a fundamental element of human interaction and plays an important role in decision-making, learning, and everyday communication. Research on speech emotion recognition is still ongoing, as many researchers work to develop recognition models with better performance. This study combines data augmentation techniques (Add Noise, Time Stretch, and Pitch Shift) to enlarge the Javanese Speech Emotion Database (Java-SED). Mel Frequency Cepstral Coefficients (MFCC) are used for feature extraction; a Convolutional Neural Network (CNN) model is then built and a Multilayer Perceptron (MLP) is applied to classify human emotion from voice. The study produced eight experimental models using different combinations of augmentation techniques. Based on the evaluation, the CNN algorithm achieved the highest performance, with 96.43% accuracy, 96.43% recall, 96.57% precision, a 96.48% F1-score, and a kappa of 95.71%, when applying the Add Noise, Time Stretch, and Pitch Shift techniques together.
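The three augmentation techniques named above are simple waveform transforms. A minimal numpy sketch follows, assuming a 1-D waveform array `y`; the function names and parameters are illustrative, not taken from the thesis, and a real pipeline would more likely use library routines (e.g. a phase-vocoder time stretch, which preserves pitch, unlike the crude resampling below):

```python
import numpy as np

def add_noise(y, noise_factor=0.005):
    """Additive white Gaussian noise augmentation."""
    return y + noise_factor * np.random.randn(len(y))

def time_stretch(y, rate=1.1):
    """Crude time stretch by linear-interpolation resampling.

    rate > 1 shortens the signal; rate < 1 lengthens it.
    (Unlike a phase vocoder, this also shifts the pitch.)"""
    n_out = int(len(y) / rate)
    old_idx = np.linspace(0, len(y) - 1, num=n_out)
    return np.interp(old_idx, np.arange(len(y)), y)

def pitch_shift(y, sr, n_steps=2):
    """Crude pitch shift by n_steps semitones: resample to change
    the pitch, then resample back to the original duration."""
    rate = 2.0 ** (n_steps / 12.0)
    shifted = time_stretch(y, rate)
    back_idx = np.linspace(0, len(shifted) - 1, num=len(y))
    return np.interp(back_idx, np.arange(len(shifted)), shifted)
```

Applying each transform to every original clip (alone and in combination) is one way such a database can be enlarged severalfold before feature extraction.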
KEYWORDS
Speech Emotion Recognition, Convolutional Neural Network
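The evaluation reports Cohen's kappa alongside accuracy; kappa measures classification agreement corrected for chance and can be computed directly from a confusion matrix. A minimal sketch (the function name is illustrative, not from the thesis):

```python
import numpy as np

def cohen_kappa(cm):
    """Cohen's kappa from a confusion matrix (rows = true, cols = predicted)."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    p_o = np.trace(cm) / n                                    # observed agreement (accuracy)
    p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2    # agreement expected by chance
    return (p_o - p_e) / (1.0 - p_e)
```

A kappa near 1 (such as the 95.71% reported above) indicates agreement far beyond what class proportions alone would produce, which is why it is a useful complement to raw accuracy on multi-class data.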
Detail Information
This thesis was written by:
- Name : MUJI ERNAWATI
- NIM : 14210225
- Program : Computer Science
- Campus : Margonda
- Year : 2023
- Period : I
- Supervisor : Prof. Ir. Dr. Dwiza Riana, S.Si, MM, M.Kom
- Assistant :
- Code : 0030.S2.IK.TESIS.I.2023
- Entered by : NZH
- Last updated : 11 June 2024