Spammer Detection On Computer Networks Using Gaussian Naïve Bayes Classifier And K-Medoids As Acquisition Training Data
Keywords:
Gaussian Naïve Bayes, Spammer Detection, K-Medoids Clustering, Network Traffic Classification
Abstract
This research implements the Gaussian Naïve Bayes algorithm for spammer detection in computer networks, using K-Medoids clustering to acquire the training data. As the number of internet users grows, manually detecting spam activity on a network becomes impractical, which motivates automated detection with machine learning. Gaussian Naïve Bayes was chosen for its simplicity and its ability to handle continuous data, making it suitable for classifying network traffic as either normal or spammer. To obtain labeled training data, K-Medoids clustering was employed because it is more robust to outliers than traditional clustering algorithms such as K-Means. Traffic data were collected from a MikroTik RouterBOARD at various intervals and then preprocessed to remove irrelevant or null features. After preprocessing, the data were clustered with K-Medoids into two groups: spammer and normal. The Gaussian Naïve Bayes classifier was then trained on the clustered data, and the resulting model achieved 99.71% accuracy, 100% precision, 99.71% recall, and a 99.85% F1-score. These results indicate that Gaussian Naïve Bayes, combined with K-Medoids clustering, is effective for detecting spammers in computer networks. Future research could examine higher-layer network traffic and broader datasets collected from different routers for a more comprehensive evaluation. The approach offers network administrators a practical way to improve network security by detecting and mitigating spam activity.
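The pipeline described in the abstract (cluster unlabeled traffic with K-Medoids, then train Gaussian Naïve Bayes on the cluster labels) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the two traffic features, the synthetic data, the PAM-style BUILD initialisation, the rule that names the higher-traffic cluster "spammer", and the every-third-row test split are all assumptions made for the example.

```python
import math
import random

def pam_build(dist, k):
    """Greedy BUILD initialisation: repeatedly add the point that lowers total cost most."""
    n = len(dist)
    medoids = [min(range(n), key=lambda j: sum(dist[i][j] for i in range(n)))]
    while len(medoids) < k:
        nearest = [min(dist[i][m] for m in medoids) for i in range(n)]
        candidates = [j for j in range(n) if j not in medoids]
        medoids.append(min(
            candidates,
            key=lambda j: sum(min(nearest[i], dist[i][j]) for i in range(n))))
    return medoids

def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids; returns (medoid indices, cluster label per point)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    medoids = pam_build(dist, k)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        # Update step: the new medoid is the member with the smallest total distance.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            new_medoids.append(
                min(members, key=lambda j: sum(dist[i][j] for i in members))
                if members else medoids[c])
        if new_medoids == medoids:
            break
        medoids = new_medoids
    labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
    return medoids, labels

def fit_gnb(X, y):
    """Per class: log prior, feature means, feature variances (with a small floor)."""
    model = {}
    for c in set(y):
        rows = [x for x, lab in zip(X, y) if lab == c]
        cols = list(zip(*rows))
        means = [sum(col) / len(rows) for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                     for col, m in zip(cols, means)]
        model[c] = (math.log(len(rows) / len(X)), means, variances)
    return model

def predict_gnb(model, x):
    """Return the class with the highest log posterior under independent Gaussians."""
    def log_posterior(c):
        log_prior, means, variances = model[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances))
    return max(model, key=log_posterior)

def f1_score(precision, recall):
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def evaluate(y_true, y_pred, positive):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"accuracy": sum(t == p for t, p in pairs) / len(pairs),
            "precision": precision, "recall": recall,
            "f1": f1_score(precision, recall)}

# Synthetic stand-in for router traffic (hypothetical features:
# packets per second and connections per minute).
rng = random.Random(42)
X = ([(rng.gauss(20, 5), rng.gauss(30, 5)) for _ in range(200)]
     + [(rng.gauss(200, 20), rng.gauss(250, 20)) for _ in range(40)])
truth = ["normal"] * 200 + ["spammer"] * 40

# Step 1: acquire labels by clustering. Assumption: the cluster whose
# medoid carries more traffic is the one named "spammer".
medoids, clusters = k_medoids(X, k=2)
spam_cluster = max(range(2), key=lambda c: sum(X[medoids[c]]))
y = ["spammer" if c == spam_cluster else "normal" for c in clusters]

# Step 2: train and evaluate Gaussian Naive Bayes on the cluster labels.
train = [i for i in range(len(X)) if i % 3 != 0]
test = [i for i in range(len(X)) if i % 3 == 0]
model = fit_gnb([X[i] for i in train], [y[i] for i in train])
pred = [predict_gnb(model, X[i]) for i in test]
metrics = evaluate([y[i] for i in test], pred, positive="spammer")
print(metrics)
```

Treating "spammer" as the positive class is also an assumption here. As a consistency check on the reported figures, the harmonic mean of 100% precision and 99.71% recall is 2(1.0)(0.9971)/(1.9971) ≈ 0.9985, which agrees with the stated 99.85% F1-score.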
Copyright (c) 2024 OK Muhammad Majid Maulana Majid, Rizal Tjut Adek, Zara Yunizar
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.