EXPLAINABLE MACHINE LEARNING FOR SOFTWARE EFFORT ESTIMATION
- LAMRIA SIMATUPANG
- 14207039
ABSTRACT
- Name : Lamria Simatupang
- NIM : 14207039
- Study Program : Computer Science
- Level : Master's (S2)
- Concentration : Software Engineering
- Thesis Title : "Explainable Machine Learning for Software Effort Estimation"

Software development effort estimation is one of the most important activities in project management, covering cost, manpower, and time. Estimates made at the early stages of a project must be highly precise. Many machine learning models have been used to predict software development effort, but no single method has proven stable across all cases, and no software effort estimation model has been explainable. This study aims to improve the prediction accuracy of software effort estimation by applying random search to boosting regressor algorithms, and to perform feature analysis using SHapley Additive exPlanations (SHAP). The best performance was achieved by the Gradient Boosting Regressor on the China, Albrecht, and Desharnais datasets, while on the ISBSG dataset it was achieved by the Light Gradient Boosting Regressor. The proposed method improves the accuracy of non-ensemble methods in software effort estimation, and SHAP provides a clear, easy-to-understand visualization of each feature's influence on the effort estimate.

Keywords: software effort estimation, boosting regressor, random search, explainable, SHAP
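The tuning step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis code: the dataset is a synthetic stand-in for collections such as China or Desharnais, and the hyperparameter ranges are assumed for demonstration only.

```python
# Sketch: random-search hyperparameter tuning of a Gradient Boosting
# Regressor, in the spirit of the approach described in the abstract.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic placeholder for an effort-estimation dataset (NOT a real one).
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Illustrative hyperparameter space; the thesis's actual ranges may differ.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "learning_rate": [0.01, 0.05, 0.1],
    "max_depth": [2, 3, 4],
}

# Random search samples candidate configurations and cross-validates each.
search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions,
    n_iter=10,
    cv=3,
    random_state=0,
)
search.fit(X_train, y_train)
best_model = search.best_estimator_

print(search.best_params_)
print("test R^2:", round(best_model.score(X_test, y_test), 3))
```

The explainability step would then follow with the `shap` package, e.g. `shap.TreeExplainer(best_model).shap_values(X_test)`, whose summary plot visualizes each feature's contribution to the predictions, as the abstract describes.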
KEYWORDS
Explainable Machine Learning, Software Effort Estimation
Record Details
This thesis was written by:
- Name : LAMRIA SIMATUPANG
- NIM : 14207039
- Study Program : Computer Science
- Campus : Margonda
- Year : 2022
- Period : II
- Supervisor : Dr. Agus Subekti, M.T.
- Assistant :
- Code : 0054.S2.IK.TESIS.II.2022
- Entered by : RKY
- Last updated : 4 August 2023
- Views : 120
ABOUT THE LIBRARY

The Universitas Nusa Mandiri E-Library is a digital platform that provides access to information within the Universitas Nusa Mandiri campus environment, including collections of books, journals, e-books, and more.
INFORMATION
Address : Jln. Jatiwaringin Raya No.02 RT 08 RW 013, Kelurahan Cipinang Melayu, Kecamatan Makassar, Jakarta Timur
Email : perpustakaan@nusamandiri.ac.id
Operating Hours
Monday - Friday : 08.00 - 20.00 WIB
Lunch Break : 12.00 - 13.00 WIB
Evening Break : 18.00 - 19.00 WIB
Perpustakaan Universitas Nusa Mandiri © 2020