Analysis of the Effectiveness of Imputation Techniques in LSTM for Improving Data Quality in Rainfall Forecasting
- ARIYANTO ADI NUGROHO
- 14210241
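ABSTRACT
Name : Ariyanto Adi Nugroho
NIM : 14210241
Study Program : Computer Science
Faculty : Information Technology
Degree Level : Master's (Strata Dua, S2)
Concentration : Data Mining
Title : “Analysis of the Effectiveness of Imputation Techniques in LSTM for Improving Data Quality in Rainfall Forecasting” (original title: “Analisis Efektivitas Teknik Imputasi pada LSTM untuk Meningkatkan Kualitas Data pada Peramalan Curah Hujan”)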
Climate monitoring data obtained from meteorological stations can contain missing values for various reasons. Data may be incomplete because of failed transmission, unresponsive sensors, equipment repairs, and other causes. The problems most commonly encountered are inconsistent data and measurement noise in the climate data. Missing values in weather data therefore need to be handled before any further analysis is carried out. This study proposes applying data imputation in the data preparation phase, adapted to the characteristics of the data. The forecasting methods applied are LSTM and Bidirectional LSTM, both of which are derived from the RNN. These methods produce better models of time series data than the RNN. The study concludes that the best-performing imputation method is KNN combined with Bidirectional LSTM. The evaluation metrics obtained are a Mean Absolute Error (MAE) of 3.3599, a Mean Square Error (MSE) of 78.4336, Ro
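As a rough illustration of two of the simpler imputation techniques named in the keywords (LOCF and NOCB) and of the MAE and MSE metrics reported above, the following is a minimal sketch in plain Python. It is not the thesis's implementation: the function names, the sample rainfall values, and the use of `None` to mark a missing reading are all assumptions made for this example. KNN imputation and the LSTM models are left out because they depend on external libraries.

```python
def locf(series):
    """Last Observation Carried Forward: fill each gap with the most recent observed value.

    A gap at the very start has no earlier observation and stays None.
    """
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def nocb(series):
    """Next Observation Carried Backward: fill each gap with the next observed value.

    Implemented as LOCF applied to the reversed series.
    """
    return locf(series[::-1])[::-1]


def mae(actual, predicted):
    """Mean Absolute Error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Square Error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily rainfall in mm; None marks a missing reading.
rain = [0.0, 5.0, None, None, 12.0, None, 3.0]
# Hypothetical complete series, used only to score the imputed values.
truth = [0.0, 5.0, 7.0, 10.0, 12.0, 4.0, 3.0]

rain_locf = locf(rain)
rain_nocb = nocb(rain)

print("LOCF:", rain_locf, "MAE:", mae(truth, rain_locf), "MSE:", mse(truth, rain_locf))
print("NOCB:", rain_nocb, "MAE:", mae(truth, rain_nocb), "MSE:", mse(truth, rain_nocb))
```

On this made-up series NOCB happens to give the lower error, which says nothing about the thesis data; the point is only that the same metrics can be used to compare imputation choices before the forecasting model is trained.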
KEYWORDS
LSTM, Bidirectional LSTM, LOCF, KNN, NOCB
Detailed Information
This thesis was written by:
- Name : ARIYANTO ADI NUGROHO
- NIM : 14210241
- Study Program : Computer Science
- Campus : Margonda
- Year : 2023
- Period : II
- Supervisor : Dr. Muhammad Haris, S.Kom., M.Eng
- Assistant :
- Code : 0051.S2.IK.TESIS.II.2023
- Entered by : NZH
- Last updated : 9 July 2024