Spammer Detection On Computer Networks Using Gaussian Naïve Bayes Classifier And K-Medoids As Acquisition Training Data

Authors

  • OK Muhammad Majid Maulana Majid, Department of Informatics
  • Rizal Tjut Adek
  • Zara Yunizar

Keywords:

Gaussian Naïve Bayes, Spammer Detection, K-Medoids Clustering, Network Traffic Classification

Abstract

This research implements the Gaussian Naïve Bayes algorithm for spammer detection in computer networks, using K-Medoids clustering to acquire labeled training data. As the number of internet users grows, manually detecting spam activity on a network has become ineffective, which motivates automated detection with machine learning. Gaussian Naïve Bayes was chosen for its simplicity and its ability to handle continuous data, making it suitable for classifying network traffic as either normal or spammer. K-Medoids was used to label the training data because it is more robust to outliers than clustering algorithms such as K-Means. Traffic data were collected from a Mikrotik Routerboard at various intervals and preprocessed to remove irrelevant or null features. The preprocessed data were then clustered with K-Medoids into two groups, spammer and normal, and the Gaussian Naïve Bayes classifier was applied to the clustered data. The resulting model achieved 99.71% accuracy, 100% precision, 99.71% recall, and a 99.85% F1-score. These results indicate that Gaussian Naïve Bayes combined with K-Medoids clustering is effective for detecting spammers in computer networks. Future research could examine higher-layer network traffic and broader datasets collected from different routers for a more comprehensive evaluation. The approach offers network administrators a reliable way to improve network security by detecting and mitigating spam activity.
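
The pipeline described in the abstract (cluster unlabeled traffic with K-Medoids, treat the two clusters as labels, then fit Gaussian Naïve Bayes) can be sketched roughly as follows. This is a minimal, standard-library illustration and not the authors' implementation: the two traffic features, the synthetic data, the farthest-first initialisation, the rule for deciding which cluster is the "spammer" cluster, and the helper names `k_medoids`, `fit_gnb`, `predict_gnb`, and `f1_score` are all assumptions made for the example.

```python
import math
import random


def k_medoids(points, k=2, max_iter=50):
    """Alternating k-medoids: assign points to the nearest medoid, then re-pick medoids."""
    # Farthest-first initialisation (a simple deterministic choice, assumed here).
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(len(points)),
                           key=lambda i: min(math.dist(points[i], points[m]) for m in medoids)))

    def assign(meds):
        return [min(range(k), key=lambda c: math.dist(p, points[meds[c]])) for p in points]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster ends up empty
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance to its cluster.
            new_medoids.append(min(
                members, key=lambda i: sum(math.dist(points[i], points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(medoids), medoids


def fit_gnb(X, y):
    """Estimate a log-prior plus per-feature mean and variance for each class."""
    model = {}
    for c in set(y):
        rows = [x for x, lab in zip(X, y) if lab == c]
        cols = list(zip(*rows))
        means = [sum(col) / len(rows) for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                     for col, m in zip(cols, means)]
        model[c] = (math.log(len(rows) / len(X)), means, variances)
    return model


def predict_gnb(model, x):
    """Return the class with the highest Gaussian log-posterior for sample x."""
    def log_posterior(c):
        log_prior, means, variances = model[c]
        return log_prior + sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                               for xi, m, v in zip(x, means, variances))
    return max(model, key=log_posterior)


def f1_score(precision, recall):
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Synthetic stand-in for router traffic; both features are hypothetical.
rng = random.Random(42)
traffic = ([(rng.gauss(20, 5), rng.gauss(30, 5)) for _ in range(100)]
           + [(rng.gauss(80, 5), rng.gauss(90, 5)) for _ in range(100)])
rng.shuffle(traffic)

cluster_ids, medoids = k_medoids(traffic, k=2)
# Assumption: the cluster whose medoid shows heavier traffic is the "spammer" cluster.
spam_cluster = max(range(2), key=lambda c: sum(traffic[medoids[c]]))
labels = ["spammer" if c == spam_cluster else "normal" for c in cluster_ids]

# Train on 80% of the cluster-labeled data and evaluate on the remaining 20%.
split = int(0.8 * len(traffic))
model = fit_gnb(traffic[:split], labels[:split])
predicted = [predict_gnb(model, x) for x in traffic[split:]]
actual = labels[split:]

tp = sum(p == a == "spammer" for p, a in zip(predicted, actual))
fp = sum(p == "spammer" and a == "normal" for p, a in zip(predicted, actual))
fn = sum(p == "normal" and a == "spammer" for p, a in zip(predicted, actual))
accuracy = sum(p == a for p, a in zip(predicted, actual)) / len(actual)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
print(accuracy, precision, recall, f1_score(precision, recall))
```

Because the evaluation labels come from the clustering step itself, the scores in a sketch like this measure agreement between the classifier and the clusters rather than agreement with independently verified ground truth. As a consistency check on the reported figures, `f1_score(1.0, 0.9971)` evaluates to roughly 0.9985, which matches the 99.85% F1-score stated above.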

Published

2024-12-27