Comparison of the CatBoost and Extra Trees Classifier Algorithms for Predicting Rice Harvest Success Levels Based on Weather Factors and Soil Conditions
Keywords:
Machine Learning, Rice Yield Prediction, CatBoost, ExtraTrees, Explainable AI, SHAP
Abstract
Rice production is a key factor in maintaining food security, particularly in Indonesia, where rice is the staple food. However, rice yields are often affected by environmental factors such as rainfall, temperature, humidity, soil pH, and soil nutrient content, namely nitrogen, phosphorus, and potassium. Uncertainty in these conditions makes it difficult for farmers to predict potential yields accurately. A system is therefore needed that can predict rice yields quickly and accurately from environmental and soil conditions.
This study aims to build a machine learning based rice yield prediction system using the CatBoost and ExtraTrees algorithms, and to apply Explainable Artificial Intelligence (XAI) through SHAP (SHapley Additive exPlanations) to interpret the model's predictions. The dataset consists of 1,500 records containing the variables rainfall, temperature, humidity, soil pH, nitrogen, phosphorus, and potassium, with three yield classes: low, medium, and high. The data were divided with a train-test split into 80% training data and 20% test data.
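The 80/20 split described above can be sketched in plain Python. This is only an illustration under stated assumptions: the abstract gives the proportions but not the splitting procedure, so the per-class (stratified) shuffling, the seed, the helper name `stratified_split`, and the evenly balanced placeholder labels are all hypothetical choices made here.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) with class proportions preserved in both parts.

    Stratification is an assumption of this sketch; the source only states
    the 80% / 20% proportions.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Placeholder labels: 1,500 records spread evenly over the three yield classes.
# The real class distribution is not given in the abstract.
labels = ["low", "medium", "high"] * 500
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With 1,500 records this yields 1,200 training rows and 300 test rows. Splitting per class keeps the low, medium, and high proportions the same in both parts, which matters when results are later reported per class.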
The results show that CatBoost outperforms ExtraTrees, achieving higher accuracy. In the ROC curve evaluation, the CatBoost model obtained AUC values of 0.99 for the low class, 0.98 for the medium class, and 0.99 for the high class, which falls in the excellent classification category. The ExtraTrees model obtained AUC values of 0.96 for the low class, 0.81 for the medium class, and 0.93 for the high class. SHAP analysis shows that the factors with the greatest influence on the yield predictions are nitrogen, soil pH, and humidity. The system was also implemented as a Laravel-based web application that can make predictions and provide automated analysis of land conditions.
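Per-class AUC values such as those reported above are commonly obtained with a one-vs-rest scheme: each class in turn is treated as the positive class and the model's predicted probability for that class is used as the ranking score. The sketch below assumes that scheme (the abstract does not state how the per-class curves were computed) and uses made-up toy probabilities, not the study's data. The function names `binary_auc` and `one_vs_rest_auc` are hypothetical.

```python
def binary_auc(scores, is_positive):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(y_true, proba, classes):
    """Compute one AUC per class, treating that class as positive and the rest as negative."""
    result = {}
    for k, cls in enumerate(classes):
        scores = [row[k] for row in proba]
        is_positive = [y == cls for y in y_true]
        result[cls] = binary_auc(scores, is_positive)
    return result


classes = ["low", "medium", "high"]

# Toy example only: six samples with invented class probabilities.
y_true = ["low", "low", "medium", "medium", "high", "high"]
proba = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
    [0.5, 0.2, 0.3],  # a medium sample that the toy model ranks poorly
    [0.1, 0.2, 0.7],
    [0.2, 0.3, 0.5],
]

auc = one_vs_rest_auc(y_true, proba, classes)
for cls in classes:
    print(cls, auc[cls])
```

In this toy case the low and high classes are perfectly ranked, while the poorly ranked medium sample pulls that class's AUC down. This is the same kind of per-class gap the abstract reports for ExtraTrees on the medium class.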
The system is expected to help farmers and other users estimate potential rice yields and understand the environmental factors that influence them, so that it can serve as a decision support system for agricultural land management.





