Software Defect Prediction for Quality Evaluation Using the Stacking Ensemble Learning Technique

  • MUHAMMAD ROMADHONA KUSUMA
  • 14210128

ABSTRACT

Name : Muhammad Romadhona Kusuma

Student ID (NIM) : 14210128

Study Program : Computer Science

Faculty : Information Technology

Degree Level : Master's (S2)

Concentration : Software Engineering

Title : “Software Defect Prediction for Quality Evaluation Using the Stacking Ensemble Learning Technique”

This research aims to improve software quality and to support more effective zakat management by the National Amil Zakat Agency (Badan Amil Zakat Nasional, BAZNAS) by building a software defect prediction model (SDPM). The study applies machine learning with an ensemble stacking approach to the "Menara Masjid" dataset, which consists of 228 records and 34 attributes. Preprocessing comprises label encoding to convert categorical data to numerical values, followed by feature selection with Pearson correlation and SelectKBest to choose the attributes used for modeling. Standardization with StandardScaler is then applied, along with SMOTE to address the imbalanced data distribution. Hyperparameter tuning with grid search cross-validation (GridSearchCV) is performed on machine learning algorithms such as AdaBoost and Gradient Boosting. The results show that an ensemble stacking approach combining Gradient Boosting, AdaBoost, Decision Tree, and Bayesian Ridge as base learners, with LightGBM as the meta-learner, improves accuracy, achieving an R² score of 0.97, an MAE of 0.037, and an MSE of 0.006. This indicates that ensemble stacking can address the software defect problem with more accurate predictions and can provide useful guidance and a framework for zakat management and other software applications. These results are expected to improve software quality and help BAZNAS manage zakat more effectively.

KEYWORDS

Software defects, Prediction, Feature Selection, SMOTE, Hyperparameter Tuning


Detailed Information

This thesis was written by:

  • Name : MUHAMMAD ROMADHONA KUSUMA
  • Student ID (NIM) : 14210128
  • Study Program : Computer Science
  • Campus : Margonda
  • Year : 2023
  • Period : I
  • Supervisor : Dr. Windu Gata, M.Kom
  • Assistant :
  • Code : 0031.S2.IK.TESIS.I.2023
  • Entered by : NZH
  • Last updated : 24 June 2024
